Job Description
AI Infrastructure Engineer - L3
The Role
The AI Infrastructure Engineer (L3) provides advanced engineering and architectural expertise for high‑performance AI and ML infrastructure. This role focuses on building, optimizing, and scaling GPU/accelerator environments and distributed systems for large‑scale training and inference workloads.
Competency Focus: High‑performance computing (HPC), distributed systems, Kubernetes, GPU orchestration, cloud optimization
Keywords: NVIDIA GPU Infrastructure, Kubernetes, GPU Cluster Administrator, Infrastructure SME, RCA
Responsibilities:
Deploy, configure, and manage GPU and AI accelerator platforms (NVIDIA A100/H100/L40, AMD Instinct, TPU).
Troubleshoot GPU hardware and software issues, including failures, thermal throttling, PCIe/NVLink topology, and driver conflicts.
Install, upgrade, and maintain GPU software stacks, including drivers, CUDA, cuDNN, TensorRT, and firmware.
Perform capacity planning and resource optimiza...
Ready to Apply?
Take the next step in your AI career. Submit your application to HCLTech today.