Job Description
Role Purpose
Enable enterprise customers to operationalize AI workloads by deploying and optimizing model-serving platforms (NVIDIA Triton, vLLM, KServe) within Rackspace’s Private Cloud and Hybrid environments. This role bridges AI engineering and platform operations, ensuring secure, scalable, and cost-efficient inference services.
Key Responsibilities
Model Deployment & Optimization
- Package and deploy ML/LLM models on Triton, vLLM, or KServe within Kubernetes clusters.
- Tune performance (batching, KV-cache, TensorRT optimizations) to meet latency and throughput SLAs.
Platform Integration
- Work with VMware VCF9, NSX-T, and vSAN ESA to ensure GPU resource allocation and multi-tenancy.
- Implement RBAC, encryption, and compliance controls for sovereign/private cloud customers.
API & Service Enablement
- Integrate models with Rackspace’s Unified Infere...
Ready to Apply?
Take the next step in your AI career. Submit your application to Rackspace today.