Job Description

Key Responsibilities

- End-to-end service ownership: design for telemetry, security, resiliency, scalability, and performance; lead sizing/architecture; drive service health reviews and process simplification.

- Incident management and prevention: lead postmortems/RCAs, coordinate fixes, define repair items, and implement data-driven prevention and continuous improvement.

- AI/ML and GenAI delivery: design and integrate solutions with LLMs, RAG, agentic workflows, and conversational AI; build low-latency model serving and retraining pipelines.

- Application engineering: develop performant microservices for distributed, containerized, cloud-native systems.

- Automation: eliminate toil by automating operational workflows, recovery procedures, code delivery, and configuration management; build internal tools and reusable scripts/services to accelerate delivery and reduce errors.

- Observability: define and implement monitor...

Ready to Apply?

Take the next step in your AI career. Submit your application to Oracle today.
