Site Reliability Engineer
Quick Summary
Founded in 2017, Obsidian Security was created to close a critical gap: securing the SaaS applications where modern business happens—platforms like Microsoft 365, Salesforce, and hundreds more.
At Obsidian, our Site Reliability Engineers ensure the reliability, scalability, and operational excellence of a complex multi-tenant SaaS platform serving enterprise and financial customers. As an SRE, you will work closely with DevOps, Platform Engineering, and product teams to improve system observability, incident response, and service resilience across the platform.
This is a hands-on engineering role focused on building operational excellence through monitoring, automation, debugging, and continuous improvement. You will help ensure that issues are detected and addressed quickly while contributing to systems that improve platform reliability at scale.
Responsibilities
- Reliability Engineering: Improve the reliability, availability, and resiliency of Obsidian’s production systems and distributed services
- Detection & Observability: Build and maintain monitoring, alerting, dashboards, and observability tooling to enhance system visibility and reduce operational noise
- Incident Response & Operations: Support incident response, on-call operations, troubleshooting, and postmortem processes to drive operational excellence
- Collaboration: Partner with engineering teams to implement SLI/SLO practices, operational standards, and reliability-focused workflows
- Execution: Automate infrastructure operations, deployment workflows, and platform tooling across Kubernetes, cloud infrastructure, and data pipelines
Requirements
- 2–5 years of experience in Site Reliability Engineering, DevOps, Production Engineering, or related roles
- Experience operating and supporting production systems in AWS and/or GCP
- Familiarity with Kubernetes and Helm in cloud-native environments
- Experience with observability and monitoring tools such as Prometheus, Grafana, Datadog, or similar platforms
- Exposure to CI/CD systems such as GitLab CI/CD, GitHub Actions, ArgoCD, or equivalent
- Strong troubleshooting and debugging skills across distributed systems and microservices
- Experience writing automation or infrastructure tooling using scripting or programming languages
- Strong systems thinking and a collaborative engineering mindset
Preferred Qualifications
- AI agent development experience
- Experience supporting SaaS platforms in production environments
- Familiarity with incident management and postmortem practices
- Exposure to infrastructure-as-code and GitOps workflows
- Understanding of SLI/SLO concepts and operational metrics
- Experience with enterprise-scale monitoring or customer-facing production systems
In This Role, You Will
- Work on reliability challenges across a large-scale distributed SaaS platform
- Build and improve observability and operational tooling used across engineering
- Gain hands-on experience with cloud infrastructure, Kubernetes, and production systems
- Help safeguard critical services for enterprise and financial customers
What Success Looks Like
- Production issues are detected and resolved quickly
- Monitoring and alerting provide clear, actionable operational insights
- Reliability metrics and operational practices improve over time
- Engineering teams can effectively troubleshoot and self-serve observability
- Automation reduces operational toil and improves platform stability
What We Offer
Please note that the base pay range is a guideline. For candidates who receive an offer, base pay will vary based on factors such as work location and the candidate's knowledge, skills, and experience. In addition to a competitive base salary, this position is eligible for equity awards and may be eligible for sales commission or incentive compensation based on the role or function within the company.
At Obsidian, we are proud to be an equal-opportunity employer. We value diversity and hire for talent, passion, and compassion. In compliance with federal law, all persons hired will be required to submit satisfactory proof of identity and legal authorization. If you have a need that requires accommodation, please contact accommodations@obsidiansecurity.com
Information collected and processed as part of any job applications you choose to submit is subject to Obsidian’s Applicant Privacy Policy.
Listing Details
- Posted: May 13, 2026
- First seen: May 13, 2026
- Last seen: May 14, 2026