Description
- Design, develop, and implement Spark Scala applications and data processing pipelines to process large volumes of structured and unstructured data.
- Integrate Elasticsearch with Spark to enable efficient indexing, querying, and retrieval of data.
- Optimize and tune Spark jobs for performance and scalability, ensuring efficient data processing and indexing in Elasticsearch.
- Collaborate with data engineers, data scientists, and other stakeholders to understand requirements and translate them into technical specifications and solutions.
- Design and deploy data engineering solutions on OpenShift Container Platform (OCP) using containerization and orchestration techniques.
- Optimize data engineering workflows for containerized deployment and efficient resource utilization.
- Collaborate with DevOps teams to streamline deployment processes, implement CI/CD pipelines, and ensure platform stability.
- Monitor and optimize data pipeline performance, troubleshoot issues, and implement necessary enhancements.
Requirements
- Strong expertise in designing and developing data infrastructure using Hadoop, Spark, and related tools (HDFS, Hive, Ranger, etc.).
- Experience with containerization platforms such as OpenShift Container Platform (OCP) and container orchestration using Kubernetes.
- Proficiency in programming languages commonly used in data engineering, such as Python, Scala, or Java, and in frameworks such as Spark.
- Knowledge of DevOps practices, CI/CD pipelines, and infrastructure automation tools (e.g., Docker, Jenkins, Ansible, Bitbucket).
- Experience with Grafana, Prometheus, or Splunk will be an added benefit.