Job Description

Role Purpose

  • Enable enterprise customers to operationalize AI workloads by deploying and optimizing model-serving platforms (e.g., NVIDIA Triton, vLLM, KServe) within Rackspace’s Private Cloud and Hybrid environments.
  • This role bridges AI engineering and platform operations, ensuring secure, scalable, and cost-efficient inference services.
Key Responsibilities

  • Model Deployment & Optimization 
  • Package and deploy ML/LLM models on Triton, vLLM, or KServe within Kubernetes clusters.
  • Tune performance (batching, KV-cache, TensorRT optimizations) for latency and throughput SLAs.
  • Platform Integration 
  • Work with VMware VCF9, NSX-T, and vSAN ESA to ensure GPU resource allocation and multi-tenancy.
  • Implement RBAC, encryption, and compliance controls for sovereign/private cloud customers.
  • API & Service Enablement 
  • Integrate models with Rackspace’s Unified Infere...
Ready to Apply?

    Take the next step in your AI career. Submit your application to Rackspace today.