Job Information

Cummins Inc. Data Engineer in Pune, India

DESCRIPTION

Key Responsibilities:

  • Implement and automate deployment of distributed systems for ingesting and transforming data from various sources (relational, event-based, unstructured).

  • Continuously monitor and troubleshoot data quality and data integrity issues.

  • Implement data governance processes and methods for managing metadata, access, and retention for internal and external users.

  • Develop reliable, efficient, scalable, and high-quality data pipelines with monitoring and alert mechanisms, using ETL/ELT tools or scripting languages (a minimal sketch follows this list).

  • Develop physical data models and implement data storage architectures as per design guidelines.

  • Analyze complex data elements and systems, data flow, dependencies, and relationships to contribute to conceptual, physical, and logical data models.

  • Participate in testing and troubleshooting of data pipelines.

  • Develop and operate large-scale data storage and processing solutions using distributed and cloud-based platforms (e.g., Data Lakes, Hadoop, HBase, Cassandra, MongoDB, Accumulo, DynamoDB).

  • Use agile development practices, such as DevOps, Scrum, Kanban, and continuous improvement cycles, for data-driven applications.
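
For illustration, the following is a minimal PySpark sketch of the kind of batch ETL pipeline with a data quality check and alerting described above. The landing-zone path, table names, and alert hook are hypothetical stand-ins, not part of this role's actual stack.

```python
# Illustrative only: minimal ingest -> transform -> quality check -> load pipeline.
# Paths, table names, and the alert hook are hypothetical.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("orders_etl").getOrCreate()

# Ingest: read raw event data from a hypothetical landing zone.
raw = spark.read.json("/mnt/landing/orders/*.json")

# Transform: normalize types and drop malformed records.
clean = (
    raw.withColumn("order_ts", F.to_timestamp("order_ts"))
       .withColumn("amount", F.col("amount").cast("double"))
       .filter(F.col("order_id").isNotNull())
)

# Data quality check with a simple alert mechanism (stand-in for a real alerting tool).
null_amounts = clean.filter(F.col("amount").isNull()).count()
if null_amounts > 0:
    print(f"ALERT: {null_amounts} records with null amount")  # e.g., emit a metric or page on-call

# Load: append to a curated table (assumes a Databricks/Delta Lake environment).
clean.write.format("delta").mode("append").saveAsTable("curated.orders")
```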

RESPONSIBILITIES

Competencies:

  • System Requirements Engineering: Translate stakeholder needs into verifiable requirements; establish acceptance criteria; track status throughout the system lifecycle; assess impact of changes.

  • Collaborates: Build partnerships and work collaboratively with others to meet shared objectives.

  • Communicates Effectively: Develop and deliver multi-mode communications that convey a clear understanding of the unique needs of different audiences.

  • Customer Focus: Build strong customer relationships and deliver customer-centric solutions.

  • Decision Quality: Make good and timely decisions that keep the organization moving forward.

  • Data Extraction: Perform extract, transform, and load (ETL) activities on data from various sources and prepare it for consumption by downstream applications and users.

  • Programming: Create, write, and test computer code, test scripts, and build scripts using industry standards and tools.

  • Quality Assurance Metrics: Apply measurement science to assess solution outcomes using ITOM, SDLC standards, tools, metrics, and KPIs.

  • Solution Documentation: Document information and solutions based on knowledge gained during product development activities.

  • Solution Validation Testing: Validate configuration item changes or solutions using SDLC standards and metrics.

  • Data Quality: Identify, understand, and correct data flaws to support effective information governance.

  • Problem Solving: Solve problems using systematic analysis processes and industry-standard methodologies.

  • Values Differences: Recognize the value that different perspectives and cultures bring to an organization.

Education, Licenses, Certifications:

  • College, university, or equivalent degree in a relevant technical discipline, or equivalent relevant experience, is required.

  • This position may require licensing for compliance with export controls or sanctions regulations.

Nice to Have Experience:

  • Understanding of the ML lifecycle.

  • Exposure to open-source Big Data technologies.

  • Familiarity with clustered, cloud-based compute implementations.

  • Experience developing applications that require large file movement in a cloud-based environment.

  • Exposure to building analytical solutions and IoT technology.

Work Environment:

  • Most work will be with stakeholders in the US, with 2-3 hours of overlap with US Eastern Time as needed.

  • This role will be hybrid.

QUALIFICATIONS

Experience:

  • 3-5 years of experience in data engineering with a strong background in Azure Databricks and Scala/Python.

  • Hands-on experience with Spark (Scala/PySpark) and SQL.

  • Experience with Spark Streaming, Spark internals, and query optimization (see the streaming sketch after this list).

  • Proficiency in Azure Cloud Services.

  • Experience in Agile Development and Unit Testing of ETL.

  • Experience creating ETL pipelines with ML model integration.

  • Knowledge of Big Data storage strategies (optimization and performance).

  • Critical problem-solving skills.

  • Basic understanding of data models (SQL/NoSQL), including Delta Lake or Lakehouse architectures.

  • Quick learner.
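
As a further illustration of the Spark Streaming and Delta Lake experience listed above, here is a minimal Spark Structured Streaming sketch that appends a Kafka topic to a Delta table. The broker address, topic, schema, and paths are hypothetical, and the Kafka connector is assumed to be on the classpath.

```python
# Illustrative only: Kafka -> Structured Streaming -> Delta Lake sketch.
# Broker, topic, schema, and paths are hypothetical.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F
from pyspark.sql.types import StructType, StructField, StringType, DoubleType

spark = SparkSession.builder.appName("telemetry_stream").getOrCreate()

schema = StructType([
    StructField("device_id", StringType()),
    StructField("reading", DoubleType()),
    StructField("event_time", StringType()),
])

# Ingest a Kafka topic as a streaming DataFrame and parse the JSON payload.
events = (
    spark.readStream.format("kafka")
         .option("kafka.bootstrap.servers", "broker:9092")
         .option("subscribe", "telemetry")
         .load()
         .select(F.from_json(F.col("value").cast("string"), schema).alias("e"))
         .select("e.*")
         .withColumn("event_time", F.to_timestamp("event_time"))
)

# Continuously append to a Delta table, with checkpointing for recovery.
query = (
    events.writeStream.format("delta")
          .option("checkpointLocation", "/mnt/checkpoints/telemetry")
          .outputMode("append")
          .toTable("curated.telemetry")
)
query.awaitTermination()
```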

Job: Systems/Information Technology

Organization: Cummins Inc.

Role Category: Remote

Job Type: Exempt - Experienced

ReqID: 2409183

Relocation Package: Yes
