Job Description
Position Overview
This position serves as the primary support for cloud operations related to IDMC, Tableau, STATA, SageMaker, and Databricks. The candidate will be responsible for maintaining operational excellence, implementing monitoring and automation, managing incident response and performance optimization, and ensuring governance and adherence to cloud best practices.
Role & Responsibilities
Operational Architecture and Reliability
- Design scalable, fault-tolerant, and highly available AWS infrastructure
- Define and implement operational best practices for cloud workloads (compute, storage, database)
Monitoring and Logging
- Build and maintain operational playbooks
- Set up alerts, dashboards, and logs to track the health and performance of AWS workloads
Incident Management and Troubleshooting
- Conduct root cause analysis and drive permanent fixes for recurring issues
- D...
Ready to Apply?
Take the next step in your AI career. Submit your application to Synapxe today.