Job Description
Site Reliability Engineering (SRE) at TikTok combines software and systems engineering to build and run large-scale, massively distributed, and fault-tolerant systems. In our team, you’ll have the opportunity to manage the complex challenges of scale, while using expertise in coding, algorithms, complexity analysis, and large-scale system design. We embrace a culture of diversity, intellectual curiosity, openness, and problem-solving. We encourage close collaboration while promoting self-direction.
Responsibilities
- Engage in and improve the whole lifecycle of services from inception and design, throughout development, capacity planning, and launch reviews, to deployment, operation, and automate
- Design and implement various dashboards and monitoring frameworks for efficient, automated, and intelligent service-oriented architecture (SOA) governance
- Scale systems elastically through mechanisms such as automation; evolve systems reliability, efficiency, and velocity by pu...
Responsibilities
- Engage in and improve the whole lifecycle of services from inception and design, throughout development, capacity planning, and launch reviews, to deployment, operation, and automate
- Design and implement various dashboards and monitoring frameworks for efficient, automated, and intelligent service-oriented architecture (SOA) governance
- Scale systems elastically through mechanisms such as automation; evolve systems reliability, efficiency, and velocity by pu...
Ready to Apply?
Take the next step in your AI career. Submit your application to TikTok today.
Submit Application