getujobs
Posted on 25/06/2026

Python, PySpark, ETL Developer

Infosys Limited Hyderabad

Applications close on July 25, 2026

  • PySpark
  • Python - Big Data

Data and Analytics | Full Time | 2 - 5 Years

Job Description

Responsibilities

Data Pipeline Development

  • Develop and maintain scalable batch ETL pipelines using Python and PySpark for data ingestion, transformation, and loading.
  • Implement reusable transformation logic, ensuring pipelines are modular, testable, and easy to maintain.
  • Optimize Spark jobs for performance (partitioning, caching, joins, shuffles) and cost efficiency.

Data Quality & Reliability

  • Apply data validation checks, handle schema evolution, and ensure accuracy and completeness of processed datasets.
  • Troubleshoot pipeline failures, analyze logs, and implement robust error handling and retry mechanisms.
  • Monitor job runs and support operational stability through alerts, runbooks, and timely incident resolution.

Collaboration & Delivery

  • Work with cross-functional teams to gather requirements, define data mappings, and deliver datasets aligned to business needs.
  • Participate in code reviews, follow engineering best practices, and contribute to continuous improvement of standards and tooling.
  • Document pipeline logic, dependencies, and operational procedures for smooth handovers and long-term maintainability.

Qualifications

  • Bachelor’s degree in Computer Science, Engineering, Information Systems, or a related field (or equivalent practical experience).
  • 2–5 years of hands-on experience building data pipelines using Python and PySpark.
  • Strong understanding of ETL concepts, data transformations, and handling large-scale datasets.
  • Proficiency in writing clean, maintainable code and debugging production issues.
  • Working knowledge of data structures, algorithms, and software development best practices.

Technical and Professional Requirements

  • Technology->Analytics – Packages->Python – Big Data
  • Technology->Big Data – Data Processing->PySpark, ETL

Preferred Skills

  • Python – Big Data
  • PySpark

Educational Requirements

Bachelor of Engineering