Saviynt
Associate Principal Site Reliability Engineer
Quick Summary
Overview
Saviynt's AI-powered identity platform manages and governs human and non-human access to all of an organization's applications, data, and business processes. Customers trust Saviynt to safeguard their digital assets, drive operational efficiency, and reduce compliance costs.
Technical Tools
aws, elasticsearch, grafana, kubernetes, langchain, openai, prometheus, python, terraform, ci-cd, distributed-systems
Saviynt's AI-powered identity platform manages and governs human and non-human access to all of an organization's applications, data, and business processes. Customers trust Saviynt to safeguard their digital assets, drive operational efficiency, and reduce compliance costs. Built for the AI age, Saviynt is today helping organizations safely accelerate their deployment and usage of AI. Saviynt is recognized as the leader in identity security, with solutions that protect and empower the world’s leading brands, Fortune 500 companies and government institutions. For more information, please visit www.saviynt.com.
We’re a fast-moving AI Security Company building AI-native infrastructure and applications powered by LLMs and autonomous agents. Our stack is deeply integrated with AWS, Kubernetes, and OpenAI-based systems, and we’re rethinking reliability in a world where software can reason, adapt, and self-heal.
We’re hiring a Staff SRE to own reliability across our cloud-native and AI-driven platform. You’ll work at the intersection of distributed systems, Kubernetes operations, and LLM-powered automation, building systems that don’t just scale but also think and fix themselves.
Responsibilities
- Own uptime, reliability, and performance of services running on AWS + Kubernetes (EKS).
- Design and implement self-healing infrastructure using automation and AI agents.
- Build LLM-powered operational tooling using APIs such as the OpenAI API for:
  - Intelligent alert triage
  - Incident summarization
  - Root cause analysis
  - Runbook automation
- Manage and scale Kubernetes workloads:
  - Deployments, autoscaling, resource optimization
  - Cluster reliability and cost efficiency
- Build and evolve observability systems:
  - Metrics (Prometheus), dashboards (Grafana)
  - Logs (ELK / OpenSearch)
  - Tracing (OpenTelemetry)
- Define and enforce SLOs, SLAs, and error budgets tied to business metrics.
- Automate infrastructure using Terraform and CI/CD pipelines.
- Lead incident response, postmortems, and continuous reliability improvements.
- Introduce chaos engineering practices to proactively test system resilience.
Requirements
- 8+ years in SRE / DevOps / Platform Engineering.
- Strong hands-on experience with:
  - AWS infrastructure at scale
  - Kubernetes (production-grade clusters)
- Proven ability to debug complex distributed systems under pressure.
- Strong coding skills in Python or Go; you will build internal platforms and tools.
- Experience implementing monitoring, alerting, and incident management systems.
Nice to Have
- Experience working with LLM APIs such as the OpenAI API.
- Familiarity with agent frameworks like:
  - LangChain
  - AutoGen
- Built or experimented with:
  - AI agents for DevOps / SRE workflows
  - Retrieval-Augmented Generation (RAG) systems
  - Vector databases (Pinecone, Weaviate, etc.)
- Exposure to AIOps or intelligent automation systems.
If required for this role, you will:
- Complete security & privacy literacy and awareness training during onboarding and annually thereafter
- Review (initially and annually thereafter), understand, and adhere to Information Security/Privacy Policies and Procedures such as (but not limited to):
> Data Classification, Retention & Handling Policy
> Incident Response Policy/Procedures
> Business Continuity/Disaster Recovery Policy/Procedures
> Mobile Device Policy
> Account Management Policy
> Access Control Policy
> Personnel Security Policy
> Privacy Policy
Saviynt is an amazing place to work. We are a high-growth Platform-as-a-Service company focused on Identity Authority to power and protect the world at work. You will experience tremendous growth and learning opportunities through challenging yet rewarding work that directly impacts our customers, all within a welcoming and positive work environment. If you're resilient and enjoy working in a dynamic environment, you belong with us!
Saviynt is an equal opportunity employer and we welcome everyone to our team. All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, disability, or veteran status.
Location & Eligibility
Where is the job
Bangalore, India
Hybrid — some on-site time required
Who can apply
IN
Listed under
Worldwide
Listing Details
- Posted: November 11, 2025
- First seen: March 26, 2026
- Last seen: June 25, 2026

Saviynt is a leading provider of cloud-native identity and governance platform solutions, empowering enterprises to secure their digital transformation, safeguard critical assets, and meet regulatory compliance.