Job Description

Position Overview (Job Summary):

  • The role is for an HPC Engineer responsible for designing, deploying, managing, and optimizing an on-premises High Performance Computing (HPC) environment.
  • The environment includes SLURM-managed CPU and GPU clusters.
  • Strong emphasis on HPC architecture, Linux administration, job scheduling, and cluster operations.
  • Experience with parallel/distributed storage (WekaFS, Scality) is preferred but not required.

Primary Skills:

  1. HPC Operations & Cluster Management (CPU & GPU)
  • SLURM Workload Manager (Mandatory): install, configure, and manage SLURM across multiple clusters
  • Partitions/queues, fairshare, job priority, scheduling policies
  • Upgrades, migrations, automation via API/integrations
  • ...

Ready to Apply?

Take the next step in your AI career. Submit your application to HCLTech today.
