Job Description
Role Purpose
Enable enterprise customers to operationalize AI workloads by deploying and optimizing model-serving platforms (NVIDIA Triton, vLLM, KServe) within Rackspace’s Private Cloud and Hybrid environments. This role bridges AI engineering and platform operations, ensuring secure, scalable, and cost-efficient inference services.
Key Responsibilities
Model Deployment & Optimization
- Package and deploy ML/LLM models on Triton, vLLM, or KServe within Kubernetes clusters.
- Tune performance (batching, KV-cache, TensorRT optimizations) to meet latency and throughput SLAs.
Platform Integration
- Work with VMware VCF9, NSX-T, and vSAN ESA to ensure GPU resource allocation and multi-tenancy.
- Implement RBAC, encryption, and compliance controls for sovereign/private cloud customers.
API & Service Enablement
- Integrate models with Rackspace’s Unified Infere...
Ready to Apply?
Take the next step in your AI career. Submit your application to Rackspace today.