
Site Reliability Engineer
$7,000 - $13,000 / month
Responsibilities:
- Serve as the technical SME for implementing and operating microservices on Kubernetes-based cloud platforms.
- Collaborate with the Cloud Technical Development and DevOps teams to deploy services to the Multi-Cloud Platform.
- Perform load tests and chaos tests to ensure the scalability and reliability of microservices.
- Build observability for microservices and cloud platforms such as AWS, OCI, Azure, and GCP.
- Write and execute disaster recovery plans in collaboration with the Development and DevOps teams.
- Analyze and resolve production risks caused by resource constraints, such as node group capacity, CPU, memory, HPA scheduling, and JVM pre-warming.
- Write and maintain scripts for automation using languages like Python, Go, or Bash.
- Define and maintain the KPIs (SLA/SLO/SLI) for all cloud microservices with development teams to better understand the business.
- Create and maintain technical documentation, including architecture diagrams, design documents, and standard operating procedures.
- Guarantee adherence to security and compliance standards, including ISO 27001, SOC 2, and GDPR.
- Lead incident response efforts to troubleshoot and resolve production issues quickly.
- Perform post-incident analysis to identify root causes and potential workarounds/solutions.
- Assist with product and technology selection, including implementing POCs.
- Be flexible and open to change as processes and tools evolve.
- Help mentor and train less senior members of the team.
- Participate in the on-call rotation and provide support after work hours and on weekends.
- Other duties as assigned.
Requirements:
- Bachelor's degree in Computer Science, Information Technology, or a related field.
- 1+ year of experience as a Site Reliability Engineer.
- Proficiency in programming and scripting languages like Java, Python, Bash, or PowerShell.
- Hands-on experience in SRE, DevOps, cloud operations, and cloud security best practices.
- Strong knowledge of security technologies, including identity and access management, network security, application security, and data protection.
- Strong problem-solving and analytical skills, with the ability to work independently and as part of a team.
- Experience in developing and maintaining technical documentation and implementing compliance requirements.
Additional Skills (Preferred):
- Expert-level cloud certifications, such as AWS Solutions Architect Professional, Azure Solutions Architect Expert, and GCP Professional Cloud Architect.
- Experience with container orchestration technologies (e.g., Kubernetes).