Job Description

Role Overview:

As a Big Data Engineer, you'll design and build robust data pipelines on Cloudera using Spark (Scala/PySpark) for ingestion, transformation, and processing of high-volume data from banking systems.

Key Responsibilities:

  • Build scalable batch and real-time ETL pipelines using Spark and Hive
  • Integrate structured and unstructured data sources
  • Tune job performance and optimize code
  • Support orchestration and job scheduling (NiFi, Airflow)

Required Education:

Bachelor's Degree

Preferred Education:

Master's Degree

Required Technical and Professional Expertise:

  • Experience: 3–15 years
  • Proficiency in PySpark/Scala with Hive/Impala
  • Experience with data partitioning, bucketing, and optimization
  • Familiarity with Kafka, Iceberg, and NiFi is a must
  • Knowledge of banking or financial datasets...

Ready to Apply?

Take the next step in your data engineering career. Submit your application to IBM today.
