Job Description
We are seeking an experienced DevOps Engineer who is passionate about deploying, automating, maintaining, and managing production systems. The ideal candidate will keep our systems robust, secure, and performing at their best.
Key Responsibilities
- Manage and optimize deep learning (DL) infrastructure (Azure, Blob Storage, VNETs, hybrid/on-prem clusters).
- Scale and maintain Kubernetes clusters for distributed training and inference.
- Benchmark next-gen hardware (H100 vs. H200, cloud vs. deployment nodes) for cost/performance efficiency.
- Build high-performance multimodal data pipelines (Ray, PyArrow, SQL).
- Design robust CI/CD pipelines (GitHub Actions) and monitoring dashboards (Grafana, Prometheus).
- Support embedded evaluation (QNN boards, edge devices) and model compression validation.
- Drive strong software architecture, code quality, and dependency management.
Requirements...
Ready to Apply?
Take the next step in your AI career. Submit your application to Fast Code AI today.