Job Description

We are seeking a highly skilled Data Engineer with strong expertise in PySpark and the Cloudera Data Platform (CDP). The ideal candidate will design, develop, and maintain scalable data pipelines while ensuring high data quality, performance, and availability across the organisation.

This role requires hands-on experience in big data ecosystems, cloud-native technologies, and advanced data processing frameworks. You will collaborate with cross-functional teams to build reliable and high-performance data solutions that drive business insights.

Key Responsibilities

1. Data Pipeline Development
  • Design, develop, and maintain scalable ETL/ELT pipelines using PySpark on CDP
  • Ensure data integrity and reliability, and optimise pipeline performance
2. Data Ingestion
  • Develop ingestion frameworks to collect data from relational databases, APIs, streaming sources, and file systems
  • ...

Ready to Apply?

Take the next step in your AI career. Submit your application to GSS Tech Group today.
