DevOps Architect

$ 5,000 - $ 9,000 / month

Job Description

Job responsibilities:

1. DevOps System Construction:

o Build the DevOps system for platform products, including CI/CD pipelines, automated deployment, and monitoring and alerting.

o Design and implement best practices for continuous integration, continuous delivery, and continuous deployment.

2. Operation and maintenance architecture design:

o Design highly available and scalable operation and maintenance architecture to support the stable operation of platform products.

o Formulate disaster recovery, backup, and recovery policies to ensure high system availability and data security.

3. Automated operation and maintenance:

o Develop and maintain automated O&M tools and scripts to improve O&M efficiency.

o Implement Infrastructure as Code (IaC), using tools such as Terraform and Ansible to manage the infrastructure.

4. Monitoring and alerting:

o Design and implement a comprehensive monitoring system covering servers, networks, applications and databases.

o Configure alerting rules so that problems are detected and handled promptly.

5. Performance optimization:

o Analyze and optimize system performance to resolve bottlenecks under high concurrency and large data volumes.

o Design caching policies and load balancing schemes to improve system response speed and throughput.

6. Security management:

o Design and implement system security policies to prevent network attacks, data leaks and other security risks.

o Conduct regular security assessments and vulnerability fixes to ensure systems meet security compliance requirements.

7. Teamwork and guidance:

o Work closely with the development and test teams to ensure the smooth running of the DevOps process.

o Coach team members on DevOps tools and best practices.

o Write technical documentation to share operational experience and best practices.

8. Cost control:

o Optimize server resource allocation to reduce hardware and cloud service costs.

o Design a resource utilization monitoring mechanism to avoid resource waste.

9. Introduction of new technologies:

o Track industry technology trends and introduce new technologies to improve operation and maintenance capabilities.

o Evaluate and select suitable tools and technology stacks.

________________________________________

Job requirements:

1. Educational background:

o Bachelor's degree or above in computer science, software engineering, or a related field.

2. Work experience:

o More than 3 years of operations or DevOps-related experience; experience operating and maintaining platform products is preferred.

3. Technical ability:

o Proficient in CI/CD toolchains (e.g. Jenkins, GitLab CI, ArgoCD).

o Familiar with containerization technologies (e.g. Docker, Kubernetes) and microservice architectures.

o Familiar with monitoring tools (e.g. Prometheus, Grafana, Zabbix).

o Familiar with automated O&M tools (e.g. Ansible, Terraform, Chef).

o Familiar with cloud services (e.g. AWS, Alibaba Cloud, Tencent Cloud) and DevOps toolchains.

4. Soft skills:

o Good communication skills and a strong sense of teamwork.

o Strong analytical and problem-solving skills.

o Able to write technical documentation and deliver training.

o Able to accommodate a work model that involves long-distance travel.

________________________________________

Bonus points:

1. Experience in large-scale distributed system operation and maintenance is preferred.

2. Experience optimizing high-concurrency, high-availability systems is preferred.

3. Experience in cloud computing, big data, and artificial intelligence is preferred.

4. Experience contributing to open-source projects or writing technical blogs is a plus.
