Brookwood Recruitment
About the Role:
We are looking for a highly skilled Senior Site Reliability Engineer I to join our team. You will focus on ensuring the reliability, performance, scalability, and efficiency of critical systems and services, while reducing operational toil through automation. This role involves designing and implementing complex technical solutions, leading incident response, and coaching junior engineers.
Key Responsibilities:
- Own and monitor services end-to-end, ensuring reliability and performance.
- Build, refactor, and maintain software applications using modern programming languages and best practices.
- Lead resolution of production incidents and drive long-term reliability improvements.
- Implement automation to reduce operational labor, technical debt, and costs.
- Improve observability through metrics, monitoring, and alerting enhancements.
- Provide architectural guidance and mentor junior team members.
- Continuously identify opportunities to improve processes, systems, and performance.
Requirements:
- Master’s degree in Computer Science, Engineering, or related field.
- Proven experience in Site Reliability Engineering, DevOps, or related software engineering roles.
- Strong programming skills and experience with cloud infrastructure, CI/CD, and observability tools.
- Experience in incident management, automation, and reducing operational toil.
- Excellent problem-solving, communication, and mentoring skills.
To apply for this job, email your details to apply.a4lmrwo78xi@aptrack.co