Job Description
Job Role:
Help build a Site Reliability Engineering culture by sharing best practices, approaches, documentation, and code with other engineering teams.
Apply automation and software to any tasks or parts of the system that are performed manually.
Troubleshoot complicated, cross-platform issues spanning OS, networking, and database layers in a cloud-based SaaS environment, and handle live production incidents.
Monitor application performance, take steps to improve overall application performance and stability, and follow through with implementation.
Conduct system analysis and configuration management, and develop improvements for system software performance, availability, and reliability.

Skills:
Experience in monitoring and analyzing infrastructure performance using standard performance monitoring tools.
Demonstrable experience in containerization (Docker) and orchestration (Kubernetes).
Experience with Infrastructure as Code (Terraform, CloudFormation, Ansible).
Knowledge and proven hands-on expe...