Python-PySpark Developer

Infosys Limited
Hyderabad

5 - 9 Years
Full Time

PySpark
Python

Applications close on July 29, 2026

Job Description

Responsibilities

We are looking for an experienced Python PySpark Developer to design, develop, and optimize large-scale data processing systems. The ideal candidate will work on big data platforms, build scalable ETL pipelines, and process high-volume datasets using Spark and Python.

Key Responsibilities

Data Engineering & Development

Develop and maintain data pipelines using Python and PySpark

Process and transform large datasets in distributed environments

Build scalable ETL/ELT workflows

Big Data Processing

Work with Apache Spark (PySpark) for batch and real-time processing

Optimize Spark jobs for performance and efficiency

Handle structured and unstructured data

Data Integration

Ingest data from multiple sources: databases (SQL/NoSQL), APIs, and files (CSV, JSON, Parquet)

Integrate with data platforms such as Hadoop (HDFS) and cloud services (AWS, Azure, GCP)

Performance Optimization

Tune Spark jobs (partitioning, caching, parallelism)

Optimize SQL queries and transformations

Improve data processing efficiency and reduce cost

Collaboration & Support

Work with data engineers, data scientists, and analysts

Translate business requirements into technical solutions

Participate in code reviews and agile development practices

Monitoring & Troubleshooting

Debug and resolve issues in data pipelines

Monitor job execution and data quality

Ensure reliability and availability of data workflows

Technical and Professional Requirements

Primary skills: Python, PySpark

Preferred Skills

Python
PySpark

Educational Requirements

MCA, MSc, MTech, Bachelor of Engineering, BCA, BSc, BTech