Job Description

**About The Job:**

We are seeking a visionary and hands-on Senior AI Technical Lead to spearhead our Generative AI initiatives. While many can build a prototype, you are the expert who can take it to production. This role focuses on the end-to-end lifecycle of GenAI: from high-performance inference hosting and automated MLOps pipelines to rigorous model benchmarking and safety guardrails.
You will lead a high-performing team to design systems that are not only intelligent but also scalable, cost-optimized, and ethically governed.

**What Will You Do:**

MLOps & High-Performance Inference

+ Inference Server Management: Architect and optimize model serving using high-throughput engines such as vLLM, NVIDIA Triton Inference Server, or TGI (Text Generation Inference).
+ Scalable Hosting: Deploy and manage LLMs on Kubernetes (K8s), implementing auto-scaling based on concurrency and token throughput.
+ MLOps Pipelines: Build robust CI/CD/CT (Continuous ...

Ready to Apply?

Take the next step in your AI career. Submit your application to Red Hat today.
