Job Description
Position Overview (Job Summary):
The role is for an HPC Engineer responsible for designing, deploying, managing, and optimizing an on-premises High Performance Computing (HPC) environment. The environment includes SLURM-managed CPU and GPU clusters. Strong emphasis on HPC architecture, Linux administration, job scheduling, and cluster operations. Experience with parallel/distributed storage (WekaFS, Scality) is preferred but optional.
Primary Skills:
HPC Operations & Cluster Management (CPU & GPU)
SLURM Workload Manager (Mandatory)
  Install/configure/manage SLURM across multiple clusters
  Partitions/queues, fairshare, job priority, scheduling policies
  Upgrades, migrations, automation via API/integrations
Linux System Administration (RHEL focus)
  OS patching, hardening, tuning, package management
Troubleshooting & Performance Optimization
  Cluster health, no...
Ready to Apply?
Take the next step in your AI career. Submit your application to HCLTech today.