Job Description
Responsibilities:
- Install, configure, and maintain HPC systems , including compute, storage, and management nodes.
- Manage HPC lab environments , ensuring availability and reliability for software development, testing, and validation activities.
- Perform software installations, upgrades, and patch management for HPC environments.
- Troubleshoot and resolve hardware and system-level issues in collaboration with vendors and internal teams.
- Develop and maintain automation frameworks for system installation, configuration, and monitoring using Python , Ansible , or similar tools.
- Monitor system performance, analyze logs, and proactively identify and resolve performance bottlenecks.
- Implement best practices for system security, backup, and recovery .
- Maintain system ...
Ready to Apply?
Take the next step in your AI career. Submit your application to Hewlett Packard Enterprise today.
Submit Application