Job Description

Development & Tuning
• Apache Spark (Scala, Python)
• Spark performance tuning
• Hive query and table optimization
• HBase data modeling and tuning
Cloudera Platform Components
• Hive
• HBase
• Phoenix
• YARN
• HDFS
Data Engineering
• ETL job design and development
• Data pipelines and batch processing
Scope of Work
• Perform application-aware performance analysis for AML data pipelines.
• Analyze and optimize pipelines running on Cloudera, Spark, Hive, HBase, JBoss, and MariaDB.
• Tune Spark, Hive, and HBase jobs, queries, and tables for performance and scalability.
• Review ETL job design and data pipeline architecture for efficiency, resilience, and scalability.
• Identify misconfigurations or misuse causing performance degradation or recurring issues.
• Assess and recommend tuning for Cloudera, Phoenix, YARN, and related platform configurations.
• Support large-scale data ingestion, transformation, and downstrea...

Ready to Apply?

Take the next step in your AI career. Submit your application to Exasoft today.
