Job Description

About the project
The team develops and maintains distributed services for analytics, APIs, and transaction monitoring. The systems process very large volumes of data: terabytes of storage, trillions of records, and a continuously growing load.

Infrastructure:

~100 servers (bare metal + VPS)
active use of IaC
Kubernetes clusters in production
focus on stability, observability, and automation

The project is long-term — not a hype startup, but a mature product with real users.

What the work looks like
This is a hands-on role with a clear time allocation:

60% — operations and incidents (including helping teams)
20% — infrastructure automation
20% — prototyping, improvements, technical initiatives

The role includes on-call duty, but after-hours incidents normally happen only 2–3 times a year, not every week.

Responsibilities
Operation of production ser...
