Job Description
- Design, develop, and optimize large-scale data processing pipelines using PySpark.
- Utilize Apache tools and frameworks (e.g., Hadoop, Hive, HDFS) for data ingestion, transformation, and management.
- Ensure high performance and reliability of ETL jobs in production environments.
- Collaborate with Data Scientists, Analysts, and stakeholders to deliver robust data solutions.
- Implement data quality checks and maintain data lineage for transparency and auditability.
- Handle ingestion, transformation, and integration of structured and unstructured data sources.
- Where applicable, leverage Apache NiFi for automated, repeatable data flow management.
- Write clean, efficient, and maintainable code in Python and Java.
- Contribute to architecture, performance tuning, and scalability strategies.
Required Skills:
- 5–7 years of experience in data engineering.
- Strong hands-on experience with ...
Ready to Apply?
Take the next step in your data engineering career. Submit your application to Stack Digital today.