Senior Site Reliability Engineer
Quick Summary
Current Vacancy Magnet is proud to offer bene
Design, implement, and maintain infrastructure-as-code using Terraform; contribute to shared module libraries and enforce IaC standards across the team;
Manage and evolve Helm chart definitions and ArgoCD GitOps workflows for multi-region SaaS deployments;
Operate and maintain observability infrastructure, including Grafana dashboards, alerts, and log pipelines; eliminate noise and surface signal;
Contribute to pipeline reliability: identify flaky stages, reduce build times, improve developer experience across CI/CD pipelines;
Remediate security vulnerabilities (CVEs) in container images and infrastructure components; participate in compliance work including FedRAMP support activities;
Develop and maintain runbooks, change management procedures, and operational documentation;
Ensure alignment with internal policies and frameworks such as ISO 27001, SOC 2, and NIST;
Contribute to AI-assisted tooling and automation (e.g., Claude-based Terraform agents, automated triage tools) as part of the team's operational efficiency roadmap;
Participate in on-call incident response rotation; lead or support incident command during active production incidents including root cause analysis and post-incident review.
5+ years of industry experience with a trajectory that demonstrates growing depth in cloud infrastructure and SRE practices;
Managed production Kubernetes environments at scale: not just deployed workloads, but owned cluster health, upgrades, and failure modes;
Responded to production incidents in high-stakes environments where downtime has real consequences;
Written and maintained Terraform at the module level, not just as a consumer, with an understanding of state, dependencies, and the operational burden of drift;
Operated in a GitOps environment, with a good understanding of Helm chart organization, ArgoCD app-of-apps patterns, or equivalent;
Balanced reactive operational work with proactive roadmap delivery; knows how to protect time for improvements while keeping production stable;
Worked with observability as a first-class discipline: built meaningful dashboards, eliminated alert fatigue, and used metrics to make operational decisions;
Contributed to security hardening in a regulated or compliance-adjacent environment: FedRAMP, SOC 2, or similar frameworks are a strong asset.
What We Offer
Location & Eligibility
Listing Details
- Posted: June 3, 2026
- First seen: June 8, 2026
- Last seen: June 9, 2026