Job Description
You will work closely with ML engineers, data scientists, and DevOps teams to keep AI pipelines scalable, high-performing, and reliable.
Responsibilities
- Design, deploy, and maintain GPU/TPU clusters, high-performance computing systems, and cloud infrastructure for AI workloads.
- Build and optimise data pipelines to support training and inference for large AI models.
- Collaborate with ML engineers to deploy models efficiently at scale.
- Monitor and troubleshoot infrastructure performance, availability, and security.
- Automate workflows and infrastructure using CI/CD and Infrastructure-as-Code tools.
- Evaluate and integrate emerging AI infrastructure technologies and frameworks.
- Ensure cost-efficient, reliable, and scalable AI operations across production and research environments.
- Maintain documentation for systems, workflows, and best practices.
Ready to Apply?
Take the next step in your AI career. Submit your application to Odiin.AI today.