Job Description
Job Summary: Ensure 24/7 availability and performance of network infrastructure through monitoring, incident response, escalation management, and technical support while maintaining SLAs and collaborating with cross-functional teams.
Key Responsibilities
- Network Monitoring & Performance:
  - Monitor the network with SolarWinds, PRTG, Nagios, Zabbix, and proprietary systems such as Cosmos
  - Analyze performance trends, bandwidth, latency, packet loss, and KPIs
  - Maintain monitoring thresholds, alert policies, and SLA metrics
- Incident & Problem Management:
  - Respond to alerts and outages within SLA timeframes
  - Triage and diagnose incidents and manage the incident lifecycle
  - Conduct root cause analysis and document lessons learned
  - Track trends and recommend preventive measures
- Escalation Management:
  - Serve as primary escalation point for Tier 1 and Tier 2 support teams
  - Escalate complex issues to Tier...
Take the next step in your AI career. Submit your application to Solvo Global today.