Python-PySpark Developer
- Infosys Limited
- Hyderabad
- 5 - 9 Years
- Full Time
- PySpark
- Python
Applications close on July 29, 2026
Job Description
Responsibilities
We are looking for an experienced Python PySpark Developer to design, develop, and optimize large-scale data processing systems. The ideal candidate will work on big data platforms, build scalable ETL pipelines, and process high-volume datasets using Spark and Python.
Key Responsibilities
Data Engineering & Development
Develop and maintain data pipelines using Python and PySpark
Process and transform large datasets in distributed environments
Build scalable ETL/ELT workflows
Big Data Processing
Work with Apache Spark (PySpark) for batch and real-time processing
Optimize Spark jobs for performance and efficiency
Handle structured and unstructured data
Data Integration
Ingest data from multiple sources: databases (SQL/NoSQL), APIs, and files (CSV, JSON, Parquet)
Integrate with data platforms such as Hadoop (HDFS) and cloud services (AWS, Azure, GCP)
Performance Optimization
Tune Spark jobs (partitioning, caching, parallelism)
Optimize SQL queries and transformations
Improve data processing efficiency and cost
Collaboration & Support
Work with data engineers, data scientists, and analysts
Translate business requirements into technical solutions
Participate in code reviews and agile development practices
Monitoring & Troubleshooting
Debug and resolve issues in data pipelines
Monitor job execution and data quality
Ensure reliability and availability of data workflows
Technical and Professional Requirements
- Primary skills: Python, PySpark
Preferred Skills
- Python
- PySpark
Educational Requirements
MCA, MSc, MTech, Bachelor of Engineering, BCA, BSc, BTech