Job Description

Overall Responsibilities:

  • Design and develop large-scale, fault-tolerant data pipelines using Hadoop and related technologies
  • Optimize data workflows for performance and scalability
  • Collaborate with data scientists, analysts, and stakeholders to understand data needs
  • Maintain data quality, security, and governance standards
  • Analyze existing data architecture and recommend improvements
  • Automate data extraction, transformation, and loading processes
  • Troubleshoot and resolve data pipeline issues promptly
  • Document data systems, architecture, and processes
Software Requirements:

  • Strong proficiency in Hadoop ecosystem components such as HDFS, MapReduce, and YARN
  • Experience with Apache Spark, Hive, Pig, and optionally Kafka
  • Strong programming skills in Java, Scala, and Python
  • Familiarity with ETL tools and pipelines
  • Knowledge of SQL and NoSQL databases
Ready to Apply?

Take the next step in your AI career. Submit your application to Synechron today.
