Job Description
Own end-to-end support for Domino Data Lab, GCP Dataproc, Galileo, and adjacent ML platforms. Perform installation, upgrades, configuration, patching, and environment maintenance. Monitor cluster health, resource utilization, job execution, performance, and alerts. Troubleshoot ML workloads involving Spark, Python, R, GPUs, containers, and orchestrators as they are raised through JIRA tickets, resolving each within its applicable SLA. Manage access, security policies, service accounts, and platform governance. Ensure high availability, optimal performance, and adherence to operational SLAs.
Ready to Apply?
Take the next step in your AI career. Submit your application to Cynet Systems today.