SRE / Devops Engineer - Infrastructure Team

India | Remote | Employee | Mid-level
Engineering | Other | Devops Engineer

About HighLevel:
HighLevel is an AI-powered business operating system that gives agencies, entrepreneurs and SMBs the infrastructure to build, automate and scale. Today, HighLevel supports SMBs across 150+ countries, fueling community-driven growth rooted in real customer outcomes.
To date, businesses operating on HighLevel have generated over $7 billion in ecosystem value, demonstrating the impact of shared infrastructure at scale. By centralizing conversations, automation and intelligence into one system, we help businesses move faster, reduce complexity and execute efficiently.
Behind the platform, HighLevel powers more than 4 billion API hits and 2.5 billion message events daily. With 250 terabytes of distributed data, 250+ microservices and over 1 million domain names supported, our architecture is built for performance, resilience and long-term scalability.

Our People
With over 2,000 team members across 10+ countries, HighLevel operates as a global, remote-first organization built for speed and ownership. We value initiative, clarity and execution, creating space for ambitious people to build systems that support millions of businesses worldwide. Here, innovation thrives, ideas are celebrated and people come first, no matter where they call home.

Our Impact
Every month, HighLevel enables more than 1.5 billion messages, 200 million leads and 20 million conversations for the more than 1 million businesses we support. Behind those numbers are real people building independence, expanding opportunity and creating measurable impact. We’re proud to be a part of that.
Learn more about us on our YouTube channel or blog posts.

About the Role

Production Operations & Reliability:
-> Participate in 24/7 on-call rotations for core infrastructure systems
-> Execute incident response during production events, including triage, mitigation, and recovery
-> Maintain and improve runbooks, operational procedures, and escalation paths
-> Help reduce MTTR and prevent repeat incidents through engineering solutions

Infrastructure Reliability Engineering:
-> Improve reliability of core infrastructure components, including Kubernetes (GKE) clusters, cloud networking and load balancing, and edge services (Cloudflare)
-> Identify systemic reliability issues and drive corrective actions
-> Support capacity planning, scaling, and resilience testing

Security Operations & Remediation:
-> Execute security remediations across cloud and Kubernetes environments
-> Support enforcement of IAM least-privilege access, network security controls, and runtime security policies
-> Partner with Platform Security on vulnerability management and remediation
-> Support security incident response and post-incident reviews

Automation & Tooling:
-> Automate repetitive operational and security tasks
-> Build tooling to improve incident response speed, operational visibility, and security posture enforcement
-> Reduce manual toil through scripts, tooling, and process improvements

Change Management & Governance:
-> Support safe execution of infrastructure and configuration changes
-> Ensure changes follow defined change management and audit requirements
-> Contribute to incident reviews, postmortems, and continuous improvement initiatives

Collaboration & Growth:
-> Work closely with Cloud Infrastructure, SRE, Platform, Data, and Security teams
-> Contribute to shared documentation and operational standards
-> Mentor junior engineers and lead small reliability or security initiatives

Requirements:
  • 4+ years of experience operating large-scale systems
  • Experience with GCP or other public cloud platforms
  • Experience with Kubernetes (GKE) in production
  • Ability to identify systemic issues and propose long-term fixes
  • Experience leading incident response or reliability initiatives
  • Strong understanding of reliability, security, and operational best practices
  • Comfortable working in on-call and incident response environments
  • Strong troubleshooting and communication skills
  • Experience supporting or operating production systems
  • Comfortable mentoring junior engineers and influencing peers
  • Familiarity with Cloudflare, networking, or edge security
  • Exposure to security tooling or vulnerability management
  • Scripting or automation experience (Python, Go, Bash, etc.)
  • Experience in compliance- or audit-driven environments (SOC2, ISO)

Location & Eligibility:
Where is the job: India (remote within one country)
Who can apply: Open to applicants worldwide
Listed under: India

Listing Details:
Posted: January 13, 2026
First seen: April 23, 2026
Last seen: May 4, 2026

Posting Health:
Days active: 11
Repost count: 0
Trust level: 39%
Scored at: May 4, 2026

Employees: 7k+
Founded: 2018