Job Description

Roles and Responsibilities:

  • Design, build, and optimize scalable ETL pipelines using Apache Airflow or similar frameworks to process and transform large datasets efficiently.
  • Utilize Spark (PySpark), Kafka, Flink, or similar tools to enable distributed data processing and real-time streaming solutions.
  • Deploy, manage, and optimize data infrastructure on cloud platforms such as AWS, GCP, or Azure, ensuring security, scalability, and cost-effectiveness.
  • Design and implement robust data models, ensuring data consistency, integrity, and performance across warehouses and lakes.
  • Enhance query performance through indexing, partitioning, and tuning techniques for large-scale datasets.
  • Manage cloud-based storage solutions (Amazon S3, Google Cloud Storage, Azure Blob Storage) and ensure data governance, security, and compliance.
  • Work closely with data scientists, analysts, and software engineers to support data-driven decision-making.

Ready to Apply?

Take the next step in your AI career. Submit your application to Saasguru today.
