Site Reliability Engineer

$ 8,000 - $ 9,000 / month

Job Description


We are seeking talented and driven professionals to join our Site Reliability Engineering (SRE) team. This role involves helping organizations improve the availability, performance, and resilience of their applications and services by deploying and administering observability platforms.


Responsibilities:

  • Deploy and manage Observability platforms and agents for ingesting metrics, logs, and traces from various sources.
  • Parse and organize logs to extract relevant fields and data for processing and filtering.
  • Assist developers in instrumenting application code to collect custom Application Performance Monitoring (APM) data.
  • Record, script, and manage synthetic monitors for testing purposes.
  • Capture user sessions and data for real user monitoring (RUM).
  • Set up alerts and notifications for proactive monitoring.
  • Generate dashboards, visualizations, and reports to provide actionable insights.
  • Participate in and support root cause analysis (RCA) and application/service profiling sessions.
  • Educate and assist teams in leveraging observability tools effectively.

Requirements:

  • Diploma or Degree in Computer Science, Information Technology, or related disciplines.
  • 2 to 5 years of experience working with modern observability platforms.
  • Familiarity with observability concepts and standards such as OpenTelemetry.
  • Experience with observability tools like the Elastic Stack for monitoring cloud infrastructure and application performance.
  • Knowledge of developing, instrumenting, and profiling applications to enhance performance and reliability.

Preferred Qualifications:

Observability Certifications:

  • Elastic Certified Observability Engineer.
  • Dynatrace Associate/Professional.
  • Splunk O11y Cloud Certified Metrics User.

Cloud/Developer Certifications:

  • AWS Developer Associate.
  • Azure Developer Associate.
