Parser

Senior Site Reliability Engineer (SRE) (Spain / UK)

Spain; Greater London, England | Full-Time | Senior

Engineering | DevOps Engineer

Senior Site Reliability Engineer (SRE)

We are seeking a highly skilled and passionate Senior Site Reliability Engineer to join our
Engineering Enablement team. This is a critical role within a large, complex, and high-impact
initiative focused on deconstructing our monolithic architecture, revitalising our technology
stack, and embedding quality and resilience into every stage of our development lifecycle.
You will play a pivotal role in shaping our future-state platform, driving operational
excellence, and fostering a culture of continuous improvement.

What You'll Do:

As a Senior SRE in our Engineering Enablement team, you will:

• Architect and Implement Reliability: Design, build, and maintain highly scalable,
resilient, and performant systems on Azure, focusing on our Java, Kafka, and
Couchbase stack.

• Drive Modernisation: Work hands-on as part of the team spearheading the adoption
of Micronaut, standardising application templates, and transitioning to managed cloud
services.

• Enhance Operational Excellence: Develop and implement strategies for improving
system observability (standardised logging, metrics, tracing), alerting, and on-call
practices.

• Automate Everything: Champion automation across the software development
lifecycle (SDLC), from CI/CD pipelines to infrastructure provisioning, focusing on
accelerating delivery and de-risking deployments.

• Incident Management & Learning: Contribute to our mature, blameless
post-incident review process, identifying root causes and implementing preventative
measures to reduce incident hours.

• Tooling & Standards: Develop, maintain, and drive the adoption of shared,
standardised SRE tooling and best practices across engineering teams, including
containerisation (e.g., Docker, Kubernetes on Azure), infrastructure as code (e.g.,
Terraform), and configuration management.

• Mentorship & Collaboration: Provide technical leadership and mentorship to junior
engineers, fostering a culture of SRE principles and operational excellence across the
wider engineering organisation.

• Strategic Input: Contribute to the overall technical strategy and roadmap for our
SRE and platform initiatives, ensuring alignment with business objectives.

What You'll Bring:

• Deep SRE Expertise: Proven experience as a Senior Site Reliability Engineer or a
similar role, with a strong understanding of SRE principles (error budgets,
SLOs/SLIs, toil reduction).

• Azure Cloud Proficiency: Extensive hands-on experience designing, deploying, and
operating highly available and scalable applications on Microsoft Azure.

• Azure Kubernetes Service (AKS) Expertise: Extensive hands-on experience with
AKS for container orchestration (mandatory), including deployment, scaling,
monitoring, and troubleshooting.

• Java Ecosystem Mastery: Expert-level proficiency with Java, including experience
with modern frameworks (ideally Micronaut, Spring Boot, or similar) and JVM
performance tuning.

• Distributed Systems Knowledge: Solid understanding and practical experience with
distributed systems, microservices architecture, and associated challenges (e.g.,
consistency, fault tolerance).

• Messaging & Database Expertise: Hands-on experience with an event streaming
platform (ideally Kafka) and NoSQL data storage (ideally Couchbase), including
operational best practices.

• Automation First Mindset: Strong scripting skills (e.g., Python, Bash) and
experience with Infrastructure as Code tools (e.g., Terraform, ARM templates) and
CI/CD pipelines (e.g., Azure DevOps, Jenkins).

• Observability Tools: Experience with monitoring, logging, and alerting tools (e.g.,
Azure Monitor, Prometheus, Grafana, ELK Stack, Splunk).

• Problem-Solving Acumen: Exceptional analytical and troubleshooting skills, with a
methodical approach to diagnosing and resolving complex production issues.

• Communication & Collaboration: Excellent communication skills, with the ability
to articulate complex technical concepts to diverse audiences and collaborate
effectively with cross-functional teams.

• Continuous Improvement: A proactive and innovative mindset, always seeking
ways to improve systems, processes, and team efficiency.

Some of the benefits you’ll enjoy working with us:

• The chance to join an organisation with triple-digit growth that is changing the paradigm of how software products are built.

• The opportunity to form part of an amazing, multicultural community of tech experts.

• A highly competitive compensation package.

• Medical insurance.

• English lessons.

Come and join our #ParserCommunity.