Job Description
Overall Responsibilities:
- Design and develop large-scale, fault-tolerant data pipelines using Hadoop and related technologies
- Optimize data workflows for performance and scalability
- Collaborate with data scientists, analysts, and stakeholders to understand data needs
- Maintain data quality, security, and governance standards
- Analyze existing data architecture and recommend improvements
- Automate data extraction, transformation, and loading (ETL) processes
- Troubleshoot and resolve data pipeline issues promptly
- Document data systems, architecture, and processes

Software Requirements:
- Strong proficiency in Hadoop ecosystem components such as HDFS, MapReduce, and YARN
- Experience with Apache Spark, Hive, Pig, and optionally Kafka
- Skilled in programming languages: Java, Scala, Python
- Familiarity with ETL tools and pipelines
- Knowledge of SQL and NoSQL data...
Ready to Apply?
Take the next step in your career. Submit your application to Synechron today.