Job Description

  • Design, implement, and optimize ETL pipelines and data processing workflows using PySpark (a minimal sketch follows this list)
  • Work on distributed computing frameworks for large-scale data processing
  • Use Databricks and other cloud platforms for data storage and transformation
  • Perform data analysis, validation, and integration from multiple sources
  • Troubleshoot and resolve data pipeline and processing issues
  • Maintain proper documentation of data workflows, pipelines, and processes
  • Ensure best practices for performance, scalability, and data governance
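For context, here is a minimal sketch of the kind of PySpark ETL pipeline the first responsibility describes: extract raw CSV, validate and transform it, and load it as partitioned Parquet. All paths, column names, and the app name are hypothetical examples, not details from this posting:

  from pyspark.sql import SparkSession
  from pyspark.sql import functions as F

  spark = SparkSession.builder.appName("etl-sketch").getOrCreate()

  # Extract: load raw data (path is an assumed example)
  raw = spark.read.csv("s3://example-bucket/raw/orders/",
                       header=True, inferSchema=True)

  # Transform: drop rows missing the key, parse the timestamp,
  # and derive a date column to partition by
  clean = (
      raw.dropna(subset=["order_id"])
         .withColumn("order_ts", F.to_timestamp("order_ts"))
         .withColumn("order_date", F.to_date("order_ts"))
  )

  # Load: write Parquet partitioned by date for downstream consumers
  clean.write.mode("overwrite").partitionBy("order_date").parquet(
      "s3://example-bucket/curated/orders/"
  )

  spark.stop()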

Key Performance Indicators

  • Timely delivery of data pipelines and ETL workflows
  • Accuracy, consistency, and integrity of processed data
  • Performance and scalability of data processing solutions
  • Effective collaboration with cross-functional teams


Skills Required
PySpark, Spark, Databricks, Python, ETL, Data Integration, Distri...

Ready to Apply?

Take the next step in your data engineering career. Submit your application to Sigma Allied Services today.
