Data Engineer

Salary undisclosed

Job Summary:

We are seeking a skilled and proactive Data Engineer with strong experience in PySpark, Scala, and Kubernetes. The ideal candidate will be responsible for building and maintaining scalable data pipelines.

Key Responsibilities:

  • Design, develop, and maintain data pipelines using PySpark and Scala (see the sketch after this list).
  • Optimize Spark jobs for performance and scalability.
  • Deploy and manage data workflows in containerized environments using Kubernetes.
  • Work closely with data scientists, analysts, and business stakeholders to understand data needs.
  • Ensure data quality, integrity, and security across all pipelines.
  • Participate in code reviews, design discussions, and performance tuning.
  • Monitor, troubleshoot, and resolve production issues in a timely manner.
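
For illustration only, a minimal PySpark pipeline of the kind these responsibilities describe might look like the sketch below. The paths, the assumed schema (event_date, order_id, amount), and the tuning value are hypothetical placeholders, not details taken from this posting.

    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F

    # Hypothetical locations; a real pipeline would read these from config.
    INPUT_PATH = "s3a://example-bucket/raw/events/"
    OUTPUT_PATH = "s3a://example-bucket/curated/daily_revenue/"


    def main() -> None:
        spark = (
            SparkSession.builder
            .appName("daily-revenue-pipeline")
            # Example tuning knob; the right value depends on data volume.
            .config("spark.sql.shuffle.partitions", "200")
            .getOrCreate()
        )

        # Read raw events (assumed schema: event_date, order_id, amount).
        events = spark.read.parquet(INPUT_PATH)

        # Basic data-quality gate: drop rows missing keys or with negative amounts.
        clean = events.filter(F.col("order_id").isNotNull() & (F.col("amount") >= 0))

        # Aggregate to one row per day.
        daily = clean.groupBy("event_date").agg(
            F.sum("amount").alias("revenue"),
            F.countDistinct("order_id").alias("orders"),
        )

        # Partition output by date so downstream readers can prune files.
        daily.write.mode("overwrite").partitionBy("event_date").parquet(OUTPUT_PATH)

        spark.stop()


    if __name__ == "__main__":
        main()
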

Technical Skills Required:

Languages: Python (PySpark) and Scala (strong proficiency in both)

Big Data: Apache Spark, Hadoop ecosystem

Orchestration: Airflow / Argo / Prefect (nice to have; see the sketch after this list)

DevOps: Git, Jenkins, Helm, CI/CD pipelines

Databases: SQL and NoSQL
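
For illustration, the sketch below shows how such a Spark job might be scheduled with Airflow (one of the orchestrators listed above) and submitted to a Kubernetes cluster with spark-submit. It assumes an Airflow 2.x-style DAG; the cluster endpoint, namespace, container image, and application path are hypothetical placeholders.

    from datetime import datetime

    from airflow import DAG
    from airflow.operators.bash import BashOperator

    # Hypothetical values: replace the API endpoint, namespace, image,
    # and application path with real ones for your cluster.
    SPARK_SUBMIT_CMD = """\
    spark-submit \
      --master k8s://https://example-k8s-api:6443 \
      --deploy-mode cluster \
      --name daily-revenue-pipeline \
      --conf spark.kubernetes.namespace=data-eng \
      --conf spark.kubernetes.container.image=example-registry/spark-pipeline:latest \
      local:///opt/app/daily_revenue_pipeline.py
    """

    with DAG(
        dag_id="daily_revenue",
        start_date=datetime(2024, 1, 1),
        schedule="@daily",  # Airflow 2.4+; older releases use schedule_interval
        catchup=False,
    ) as dag:
        # Single task: submit the PySpark application to the Kubernetes cluster.
        BashOperator(
            task_id="spark_submit_daily_revenue",
            bash_command=SPARK_SUBMIT_CMD,
        )

The Apache Spark provider's SparkSubmitOperator could replace the BashOperator here; the raw spark-submit call just keeps the sketch self-contained.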
