Vacancy Description
Responsibilities
- Own operational health dashboards, alert thresholds, and incident response playbooks for the cloud platform
- Lead on‑call rotations, coordinate major incident resolution, and drive post‑incident reviews
- Implement and maintain Disaster Recovery (DR) solutions for core applications, including DNS routing strategies and low‑RTO repositories
- Manage patching pipelines, golden images, container registries, backups, and automated resilience testing
- Partner with platform engineers to feed operational learnings into architecture improvements and the roadmap
- Use automation and AI‑assisted tools to correlate anomalies, reduce noise, and accelerate root‑cause discovery
- Educate product teams on DR patterns, operational best practices, and shared responsibilities
Requirements
- Bachelor's or Master's degree in Computer Science, Computer Engineering, or equivalent professional experienc...
Ready to Apply?
Submit your application for Lead Cloud Engineer (La Plata) at EPAM Systems