Job Description

Position Overview (Job Summary):
The role is for an

HPC Engineer

responsible for

designing, deploying, managing, and optimizing

an

on-premises High Performance Computing (HPC)

environment.
The environment includes

SLURM-managed CPU and GPU clusters .
Strong emphasis on

HPC architecture, Linux administration, job scheduling, and cluster operations .
Experience with

parallel/distributed storage (WekaFS, Scality)

is

preferred but optional .
Primary Skills:
HPC Operations & Cluster Management (CPU & GPU)
SLURM Workload Manager (Mandatory) Install/configure/manage SLURM across multiple clusters
Partitions/queues, fairshare, job priority, scheduling policies
Upgrades, migrations, automation via API/integrations
Linux System Administration (RHEL focus) OS patching, hardening, tuning, package management
Troubleshooting & Performance Optimization Cluster health, no...

Ready to Apply?

Take the next step in your AI career. Submit your application to HCLTech today.

Submit Application