Job Description

We are hiring a Site Reliability Engineer (SRE) to manage, support, and enhance enterprise data platforms. This role focuses on platform reliability, automation, and integration, ensuring scalability, stability, and compliance in a dynamic and fast-paced environment.

The Position:

  • Design and implement automation frameworks to streamline operational tasks for data platforms (e.g., provisioning, configuration, monitoring, and incident remediation).
  • Collaborate with Data Platform Engineers, data product teams, and business stakeholders to ensure reliability and performance of data platforms.
  • Develop and maintain Infrastructure-as-Code (IaC) solutions for deploying and managing data platform components across environments.
  • Establish robust monitoring, alerting, and observability systems to proactively detect and resolve issues.
  • Drive incident management processes, including root cause analysis and post-mortem reviews, to...

Ready to Apply?

Take the next step in your AI career. Submit your application to Tek Systems today.
