Job Summary:
We are seeking a highly skilled Senior Data Engineer with extensive experience in Hadoop, Spark, PySpark, Scala, and data warehousing methodologies. The ideal candidate will be responsible for understanding business requirements, building optimized data pipelines, and ensuring high-quality code development within a collaborative team environment. Experience in the Core Banking and Finance domain is essential; exposure to the AML domain is preferred but not mandatory.
Key Responsibilities:
Requirement Analysis & Development:
- Understand business, functional, and technical requirements to build effective data transformation jobs using Python, PySpark/Scala, and Python frameworks.
- Translate complex transformation logic to build data pipelines using PySpark/Spark-SQL/Hive for data ingestion from source systems to Data Lake (Hive/HBase/Parquet) and Enterprise Data Domain tables.
- Develop applications using Hadoop tech stack to deliver solutions efficiently, on-time, and within specifications.
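The ingestion work described above typically follows a read-transform-write shape. A minimal PySpark sketch, assuming a hypothetical CSV transaction extract and target table (column names, paths, and the table name are illustrative, not part of the posting):

```python
# Sketch of a PySpark ingestion job: read a raw source extract, apply
# transformation logic, and write Parquet into a Data Lake table.
# All names below (account_id, txn_ts, edw.transactions) are hypothetical.

def build_ingest_job(spark, source_path, target_table):
    """Run the hypothetical ingestion; `spark` is an active SparkSession."""
    # Imported lazily so the sketch can be loaded without a Spark install.
    from pyspark.sql import functions as F

    raw = spark.read.option("header", True).csv(source_path)
    cleaned = (
        raw.filter(F.col("account_id").isNotNull())          # drop keyless rows
           .withColumn("txn_date", F.to_date("txn_ts"))      # derive partition-friendly date
           .withColumn("amount", F.col("amount").cast("decimal(18,2)"))
    )
    cleaned.write.mode("overwrite").format("parquet").saveAsTable(target_table)
```

Against a live cluster this would be invoked as, e.g., `build_ingest_job(spark, "/data/raw/transactions/", "edw.transactions")`.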
Data Pipeline Optimization & Quality Assurance:
- Create optimized data pipelines and produce unit tests for Spark transformations and helper methods.
- Perform peer code quality reviews and enforce quality checks as a gatekeeper.
- Ensure smooth production deployments and verify post-production deployments.
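Unit-testing Spark helper methods, as required above, is usually done by factoring pure logic out of the job so it runs without a cluster. A small illustrative helper with plain assertions (the normalization rule itself is a hypothetical example, not from the posting):

```python
# A helper method of the kind the posting asks unit tests for: pure logic
# factored out of the Spark job so it is testable without a SparkSession.
# The normalization rule is a hypothetical example.

def normalize_amount(raw):
    """Strip currency formatting noise like '1,234.50 ' and return a float; None-safe."""
    if raw is None:
        return 0.0
    cleaned = raw.replace(",", "").strip()
    return float(cleaned) if cleaned else 0.0

# Unit tests (in practice these would live in a pytest module).
assert normalize_amount("1,234.50 ") == 1234.5
assert normalize_amount(None) == 0.0
assert normalize_amount("") == 0.0
```

In the job itself, such a helper can be wrapped in a UDF or applied in driver-side code, while the tests run in plain CPython under CI.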
Technical Expertise & Problem Solving:
- Exhibit strong knowledge of data structures, data manipulation, distributed processing, application development, and automation.
- Apply hands-on experience in enterprise data architectures, data models, and the core banking/finance domain.
- Bring exposure to TWS job scheduling, Spark streaming, Kafka, and Machine Learning.
Technical Requirements:
Mandatory Experience:
- 5+ years of hands-on experience in Hadoop & Spark/PySpark, including 4+ years as a Data Engineer.
- Strong hands-on experience in Hadoop, Spark, PySpark, Scala, Hive, Spark-SQL, Python, Impala.
- Proficiency in CI/CD pipelines, Git, Jenkins, Agile methodologies, and DevOps.
- Experience with Cloudera Distribution.
- In-depth knowledge of data warehousing methodology and Change Data Capture.
- Familiarity with Oracle, Spark streaming, Kafka, and Machine Learning.
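Change Data Capture, listed above, reduces to applying ordered change events (insert/update/delete) against a keyed target. A minimal pure-Python sketch of that core rule, assuming a hypothetical event shape (in the real pipeline this would be a Spark merge into Hive/Parquet tables):

```python
# Minimal CDC apply: upsert on insert/update, drop on delete, in event order.
# The {"op", "key", "row"} event shape is a hypothetical example.

def apply_cdc(target, events):
    """Apply CDC events, each {"op": "I"|"U"|"D", "key": ..., "row": ...}, in order."""
    for e in events:
        if e["op"] in ("I", "U"):       # insert or update: upsert the row
            target[e["key"]] = e["row"]
        elif e["op"] == "D":            # delete: drop the key if present
            target.pop(e["key"], None)
    return target

events = [
    {"op": "I", "key": 1, "row": {"balance": 100}},
    {"op": "U", "key": 1, "row": {"balance": 80}},
    {"op": "I", "key": 2, "row": {"balance": 50}},
    {"op": "D", "key": 2, "row": None},
]
state = apply_cdc({}, events)  # → {1: {"balance": 80}}
```

The ordering guarantee is the essential part: replaying the same events out of order would leave the target in the wrong state, which is why CDC feeds (e.g. via Kafka) are consumed per-key in sequence.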
Preferred Experience:
- Hands-on experience with RDBMS databases (MariaDB, SQL Server, MySQL, or Oracle) and stored procedures.
- Exposure to the AML domain is a plus but not mandatory.
- Good understanding of Core Banking and Finance domain.