We are seeking a highly skilled and experienced Manager of Cloud Operations with a strong focus on Site Reliability Engineering (SRE) to lead our team in ensuring the reliability, performance, and scalability of our cloud-based infrastructure. This role will be pivotal in driving our SRE practices across AWS and/or Azure environments.
Lead, mentor, and develop a team of DevOps and SRE engineers.
Implement and promote SRE principles and practices across the organization.
Define and monitor service level objectives (SLOs), service level indicators (SLIs), and service level agreements (SLAs).
Develop and implement incident response and post-mortem processes.
Drive automation of operational tasks and infrastructure management.
Design, implement, and maintain scalable and resilient infrastructure on Azure and/or AWS.
Implement infrastructure-as-code (IaC) using tools like Terraform.
Ensure security and compliance of cloud environments.
Manage CI/CD pipelines for automated deployments.
Implement and maintain comprehensive monitoring and alerting systems using tools such as Azure Monitor, AWS CloudWatch, Prometheus, and Grafana.
Communicate effectively with stakeholders at all levels.
Hire the right team for the product.
12 to 18 years of experience, including leading SRE teams that build highly scalable, secure, efficient, and resilient production systems in AWS and/or Azure.
Proven experience in implementing and managing SRE practices.
Strong understanding of CI/CD pipelines and automation tools.
Proficiency in infrastructure-as-code (IaC) tools (Terraform).
Experience with containerization and orchestration technologies (Docker, Kubernetes).
Strong understanding of networking concepts and protocols.
Experience with monitoring and logging tools (Azure Monitor, CloudWatch, Prometheus, Grafana, ELK stack).
Scripting and programming skills (Python, Bash, etc.).
Experience with various databases (Oracle, SQL Server, etc.).
Professional growth and development opportunities.
Working within a team of friendly, skilled people where help is always within reach
Flexible working hours
4 recharge days: the entire company pauses across all geographies for 1 day each quarter. Spend the day in whatever way helps you recharge, regain energy, and dive back into the next workday.
High-end laptop (Dell or Mac)
Competitive pay and bonus
18 vacation days per year, in addition to 15 days of sick/casual leave per calendar year.
16 hours of paid volunteer time off per year
Wedding gift and newborn gift allowance for employees.
26 weeks of paid maternity leave and one week of paid paternity leave.
12 wellness leaves for women employees
Health insurance of up to 7 lacs for self, spouse, 4 dependent children, and parents. Vendavo pays 100% of the premium.
Group Term Insurance coverage of up to three times Annual CTC. Dependents are not covered.
Group Personal Accident coverage of up to three times Annual CTC. Dependents are not covered.
Provident fund contributions