Job Description
Roles and Responsibilities:
- Design, build, and optimize scalable ETL pipelines using Apache Airflow or similar frameworks to process and transform large datasets efficiently.
- Utilize Spark (PySpark), Kafka, Flink, or similar tools to enable distributed data processing and real-time streaming solutions.
- Deploy, manage, and optimize data infrastructure on cloud platforms such as AWS, GCP, or Azure, ensuring security, scalability, and cost-effectiveness.
- Design and implement robust data models, ensuring data consistency, integrity, and performance across warehouses and lakes.
- Enhance query performance through indexing, partitioning, and tuning techniques for large-scale datasets.
- Manage cloud-based storage solutions (Amazon S3, Google Cloud Storage, Azure Blob Storage) and ensure data governance, security, and compliance.
- Work closely with data scientists, analysts, and software engineers to support data-driven decision-making.
Ready to Apply?
Take the next step in your AI career. Submit your application to Saasguru today.