Data Engineering Analyst

Salary undisclosed

Key Responsibilities:

ETL Development: Use Talend to develop, optimize, and maintain ETL workflows for ingesting, processing, and transforming data from a variety of sources.
Data Pipeline Design: Develop scalable, high-performance data pipelines using Apache Spark and Python for batch and real-time data processing.
Data Integration: Extract and integrate data from multiple sources such as SQL Server, PostgreSQL, AWS Redshift, and Cloudera.
Data Quality: Ensure data quality and consistency by building and implementing validation, cleansing, and transformation logic.
Collaboration: Work closely with data scientists, business analysts, and other stakeholders to understand data requirements and support data-driven decision-making.
Data Modeling: Assist in designing and developing database tables and schemas, and in optimizing the performance of relational databases (e.g., MySQL, PostgreSQL) and NoSQL databases (e.g., MongoDB, Cassandra).
Automation: Automate routine data management tasks using Python scripting, Talend, or other workflow automation tools.
Troubleshooting & Optimization: Identify and resolve performance bottlenecks, data discrepancies, and pipeline failures.

Required Qualifications:

Bachelor’s degree in Computer Science, Data Science, Information Systems, or a related field.
3+ years of hands-on experience with Talend for ETL development.
Strong proficiency in Spark, SQL, and Python.
Extensive experience with Apache Spark for big data processing (both batch and streaming).
Proficiency with relational databases (SQL Server, PostgreSQL, MySQL) and non-relational databases (MongoDB, Cassandra).
Solid understanding of data structures, database design, and data modeling.
Strong problem-solving and troubleshooting skills.