Lead Data Engineer

Salary undisclosed

Description

  • Design, develop, and implement Spark Scala applications and data processing pipelines to process large volumes of structured and unstructured data.
  • Integrate Elasticsearch with Spark to enable efficient indexing, querying, and retrieval of data.
  • Optimize and tune Spark jobs for performance and scalability, ensuring efficient data processing and indexing in Elasticsearch.
  • Collaborate with data engineers, data scientists, and other stakeholders to understand requirements and translate them into technical specifications and solutions.
  • Design and deploy data engineering solutions on OpenShift Container Platform (OCP) using containerization and orchestration techniques.
  • Optimize data engineering workflows for containerized deployment and efficient resource utilization.
  • Collaborate with DevOps teams to streamline deployment processes, implement CI/CD pipelines, and ensure platform stability.
  • Monitor and optimize data pipeline performance, troubleshoot issues, and implement necessary enhancements.

Requirements

  • Strong expertise in designing and developing data infrastructure using Hadoop, Spark, and related tools (HDFS, Hive, Ranger, etc.).
  • Experience with containerization platforms such as OpenShift Container Platform (OCP) and container orchestration using Kubernetes.
  • Proficiency in programming languages and frameworks commonly used in data engineering, such as Spark, Python, Scala, or Java.
  • Knowledge of DevOps practices, CI/CD pipelines, and infrastructure automation tools (e.g., Docker, Jenkins, Ansible, Bitbucket).
  • Experience with Grafana, Prometheus, or Splunk is an added benefit.