
Data Scientist – Big Data
Job Description
Choosing Capgemini means choosing a company where you will be empowered to shape your career in the way you'd like, where you'll be supported and inspired by a collaborative community of colleagues around the world, and where you'll be able to reimagine what's possible. Join us and help the world's leading organizations unlock the value of technology and build a more sustainable, more inclusive world.
YOUR ROLE
Are you passionate about leveraging the power of Big Data technologies to solve complex challenges? Join our team as a Data Scientist – Big Data, where you'll work with cutting-edge tools to design and implement scalable, efficient, and reliable data solutions.
Key Responsibilities
Big Data Solutions: Design and implement robust application solutions using the Hadoop ecosystem, including tools like Hive, HDFS, YARN, and Spark.
Data Pipeline Development: Develop efficient data processing pipelines, leveraging HDFS file formats (e.g., Parquet, ORC, Sequence) for various use cases.
Automation and Monitoring: Create automation scripts using Jenkins, Shell, or Python for builds, testing frameworks, and app configurations. Monitor system resource utilization using tools like Grafana to ensure optimal performance.
Data Warehousing: Contribute to the development and maintenance of data warehousing systems, including data modeling and optimization.
DevOps Integration: Implement CI/CD processes using tools such as Bitbucket, Jenkins, and Azure DevOps (ADO) for automated deployment and version control.
Integration Services: Work with integration tools such as FileIT, MQ, or similar systems to enable seamless data flows.
Control-M: Develop and manage scheduling workflows using Control-M and ensure efficient resource allocation.
Collaboration and Problem Solving: Collaborate with cross-functional teams to identify and solve complex data challenges.
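To give a flavor of the "Automation and Monitoring" work described above, here is a minimal sketch of the kind of Python utility script such a role might involve. It checks filesystem utilization using only the standard library; the path and the 85% alerting threshold are illustrative assumptions, not values taken from this posting.

```python
# Hypothetical resource-utilization check; path and threshold are assumptions.
import shutil


def disk_usage_percent(path="/"):
    """Return used-space percentage for the filesystem containing `path`."""
    usage = shutil.disk_usage(path)
    return 100.0 * usage.used / usage.total


def check_threshold(percent, limit=85.0):
    """Return "ALERT" when utilization exceeds the limit, else "OK"."""
    return "ALERT" if percent > limit else "OK"


if __name__ == "__main__":
    pct = disk_usage_percent("/")
    print(f"root filesystem: {pct:.1f}% used -> {check_threshold(pct)}")
```

In practice a script like this would feed a scheduler such as Jenkins or a dashboard such as Grafana rather than printing to stdout.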
Qualifications and Skills
Required Experience
3–10 years of experience in:
Big Data Ecosystem: Hive, HDFS, YARN, Spark (Spark SQL, PySpark), Scala, and Python.
Scripting: Proficiency in Shell scripting and Python.
CI/CD Tools: Jenkins, Azure DevOps (ADO), Bitbucket.
Monitoring Tools: Grafana or equivalent.
File Formats: In-depth knowledge of HDFS formats (e.g., Parquet, ORC, Sequence).
Solid understanding of concurrent software systems, ensuring scalability, maintainability, and robustness.
Experience in automation and building end-to-end scalable applications.
Preferred Skills
Knowledge of ETL tools like Dataiku.
Experience with Schedulers like Control-M.
Familiarity with integration services like FileIT or MQ.
Understanding of DevOps practices and tools suite.
Java/REST Services/Maven experience is a plus.
WHAT YOU'LL LOVE ABOUT WORKING HERE
We promote Diversity & Inclusion as we believe diversity of thought fuels excellence and innovation.
At Capgemini, you are the architect of your career growth. We equip people to maximize their full potential by providing a wide array of career growth programs that empower them to get the future they want.
Capgemini fosters impactful experiences that bring out the best in its people: for themselves, for the company, and for their clients.
Disclaimer:
Capgemini is an Equal Opportunity Employer encouraging diversity in the workplace. All qualified applicants will receive consideration for employment without regard to race, national origin, gender identity/expression, age, religion, disability, sexual orientation, genetics, veteran status, marital status, or any other characteristic protected by law.
This is a general description of the Duties, Responsibilities, and Qualifications required for this position. Physical, mental, sensory, or environmental demands may be referenced in an attempt to communicate the manner in which this position traditionally is performed. Whenever necessary to provide individuals with disabilities an equal employment opportunity, Capgemini will consider reasonable accommodations that might involve varying job requirements and/or changing the way this job is performed, provided that such accommodations do not pose an undue hardship. Capgemini is committed to providing reasonable accommodations during our recruitment process. If you need assistance or accommodation, please reach out to your recruiting contact.
Click the following link for more information on your rights as an Applicant http://www.capgemini.com/resources/equal-employment-opportunity-is-the-law