Director, Site Reliability Engineering & Cloud Operations (SRE)

United States · Austin · Hybrid · Executive

At Resideo, we imagine a world where homes and buildings are good for the planet, and where technology works to simplify everyday life. In that world, people are healthy, happy, and secure. To help create this future, we work every day to simplify the connected world so people have peace of mind and can focus on what matters most. Resideo is making a large investment in its engineering group. With global reach and impact, we are committed to building our team as we develop new products and introduce them to consumers around the world through new product introduction (NPI). As an established leader in the connected products space, we will give you a platform to work on new and innovative projects as a member of a team of intelligent innovators who are developing products that truly align with our mission of protecting what matters most.

This is an exciting opportunity to lead cloud operations for one of the largest IoT ecosystems in the world, shaping the future of cloud infrastructure, SRE, and AI-driven operations. You'll work alongside world-class engineering talent and cutting-edge technologies to advance Resideo’s mission of simplifying everyday life through innovative connected products. As a leader, you will drive the platform engineering transformation across a global organization of multiple teams, delivering on business priorities while collaborating with development leaders and executives to define and advance best practices.

Resideo is seeking a strategic and experienced leader to oversee global cloud infrastructure, Site Reliability Engineering (SRE), and CloudOps for our large-scale connected products ecosystem. This role will drive the performance, reliability, security, and operational excellence of our multi-cloud environments (Azure), supporting millions of IoT devices and trillions of data points and events. The ideal candidate will have deep expertise in cloud infrastructure, IoT, and large-scale SaaS platforms, and be passionate about fostering a culture of innovation, reliability, and automation.

 

  • Cloud Infrastructure & SRE Strategy
    • Define and execute global cloud operations and SRE strategies, ensuring 99.999%+ uptime for mission-critical IoT services.
    • Architect, implement, and optimize multi-cloud infrastructure to support IoT devices with low-latency data processing, scalability, and high availability.
    • Drive cost optimization strategies while balancing performance, redundancy, and financial efficiency across cloud platforms (Azure). 
    • Develop automated deployment, monitoring, and recovery systems using technologies like Kubernetes, Terraform, Ansible, and CI/CD pipelines.
  • Reliability, Performance & Incident Management
    • Establish and refine SLOs, SLIs, and KPIs for service reliability, performance, and capacity planning.
    • Build and optimize incident management, disaster recovery, and resilience engineering frameworks.
    • Leverage AI/ML-driven automation for proactive failure detection and remediation.
  • Security & Compliance
    • Implement robust cloud security practices, ensure compliance with standards such as SOC2, GDPR, and NIST, and oversee the zero-trust security model for IoT data protection.
    • Collaborate with security and compliance teams to manage risk and ensure regulatory adherence across cloud platforms.
  • Team Leadership & Cross-Functional Collaboration
    • Lead and mentor a global team of Cloud Engineers, SREs, and software professionals, fostering a culture of continuous learning and innovation.
    • Partner with product management, software engineering, and customer support to optimize IoT device onboarding, firmware updates, and cloud-to-edge performance.
    • Collaborate with finance and executive leadership to develop long-term cloud investment strategies.

 

Requirements

  • 15+ years in Computer Science, Electrical Engineering, or a related field
  • 15+ years of experience in Cloud Operations, SRE, or Infrastructure Engineering, with 8+ years in technical leadership roles
  • 5+ years of experience managing large-scale, distributed IoT cloud environments supporting billions of data points per day
  • 5+ years of deep professional experience in Azure cloud platforms including networking, storage, compute, and database services
  • 5+ years of experience with Kubernetes, Terraform, CI/CD pipelines, and observability tools (e.g., Prometheus, Grafana, ELK)
  • 5+ years of experience in large-scale systems design and architecture, with a focus on reliability, performance, and scalability of cloud-native platforms
  • 5+ years of hands-on experience with Infrastructure-as-Code (IaC) tools such as Terraform, Ansible, CDK, and Pulumi, and with managing cloud-native architectures

 

  • Strong background in AI/ML-driven automation for cloud infrastructure monitoring, self-healing, and optimization
  • Solid understanding of security-first cloud architectures, DevSecOps, and compliance standards (SOC2, GDPR, NIST)
  • Proven ability to manage teams across multiple global time zones, ensuring operational excellence and driving performance in large, distributed environments
  • Expertise in incident management, disaster recovery, and building resilience engineering frameworks
  • Ability and desire to review code, system designs, and engage in system engineering discussions and decisions
  • Experience managing consumer IoT ecosystems with large-scale sensor data processing and real-time analytics
  • Expertise in serverless architecture, edge computing, and IoT protocol optimization
  • Strong financial acumen in cloud cost management and forecasting
  • Familiarity with regulatory compliance frameworks such as SOC2, GDPR, and ISO 27001
  • Relevant certifications, such as Azure Expert

 

  • Innovation: Bring your creative ideas to the table and be part of a company that values out-of-the-box thinking

 


Location & Eligibility

Where is the job
Austin, United States
On-site at the office
Who can apply
US

Listing Details

Posted
May 20, 2026