Job Description

Your Role
Serve as Subject Matter Expert (SME) for distributed applications on hybrid cloud platforms, documenting best practices and providing guidance to peers.
Champion continuous operational improvements informed by metrics analysis and customer feedback.
Lead incident management, troubleshooting, and response coordination, and conduct comprehensive post-incident reviews.
Clearly communicate complex technical issues to development teams, document root causes, and collaborate internally to create robust solutions.
Manage, deploy, and maintain enterprise applications and cloud-based systems using secure, scalable, and reliable frameworks.
Proactively monitor, troubleshoot, and optimize the health, performance, and reliability of applications and platforms.
Perform detailed log analysis and use stack traces to debug and resolve issues reported by partners and end users.
Develop comprehensive documentation covering operational procedures, system configura...

Ready to Apply?

Take the next step in your AI career. Submit your application to OpsWerks today.
