Python+Spark Scala
- Infosys Limited
- Hyderabad
- 2–5 Years
- Full Time
- Python
- Scala
- SparkSQL
Applications close on July 29, 2026
Job Description
Responsibilities
Big Data & Spark Development
Develop and maintain data processing pipelines using Apache Spark (PySpark & Scala)
Work with Spark DataFrames, RDDs, and Spark SQL
Implement transformations, joins, aggregations, and optimizations
Tune Spark jobs for performance, scalability, and reliability
Python & Scala Programming
Write clean, efficient, and scalable code in Python and Scala
Develop modular and reusable components
Integrate data pipelines with various applications and APIs
ETL & Data Engineering
Design and build ETL workflows for structured and unstructured data
Extract data from multiple sources (databases, APIs, flat files)
Perform data cleansing, transformation, and validation
Ensure data accuracy, consistency, and completeness
Data Platforms & Integration
Work with the Hadoop ecosystem (HDFS, Hive, Spark)
Handle large datasets in data lakes and warehouses
Process data in formats such as Parquet, ORC, JSON, and CSV
Collaboration & Support
Work with data engineers, analysts, and business stakeholders
Troubleshoot pipeline issues and provide production support
Participate in Agile/Scrum processes
Maintain technical documentation
Additional Responsibilities
Core Skills
2–5 years of experience in Python development
Hands-on experience with Apache Spark (PySpark and/or Scala)
Strong understanding of data processing and ETL concepts
Good knowledge of SQL and relational databases
Technical and Professional Requirements
- Primary skills: Technology -> Big Data – Data Processing -> Spark, Technology -> Java -> Apache, Technology -> Machine Learning -> Python
Preferred Skills
- Python
- SparkSQL
- Scala
Educational Requirements
MCA, MSc, MTech, Bachelor of Engineering, BCA, BSc, BTech