Lead Site Reliability Engineer – IT Support Automation

HQ · Full-time · Lead
Engineering · DevOps Engineer

Technical Tools
AWS, Azure, GCP, Go, Kubernetes, Pulumi, Python, Terraform, CI/CD, microservices, networking, REST APIs, SaaS

As technology organizations scale, so does operational friction. IT support teams become overloaded with repetitive tickets — account lockouts, access requests, provisioning tasks, and standard “ask IT” issues that drain time and attention from higher-value work.

EverOps partners directly with enterprise engineering and IT organizations to solve complex operational challenges from within their environments. We don’t patch symptoms — we eliminate root causes.

We are seeking a Lead Site Reliability Engineer to own and execute a comprehensive IT support automation strategy designed to significantly reduce ticket volume and human intervention.

This is not a reactive support role.

This is a systems-level engineering role focused on:

  • Automating resolution paths when tickets do occur

  • Building durable automation frameworks across SaaS and internal platforms

  • Removing systemic friction across the IT lifecycle

You will operate heavily within the IT support domain, addressing areas such as:

    • Account lockouts and access management

    • Provisioning and deprovisioning workflows

    • Device and asset lifecycle management

    • Standard internal IT requests

    • SaaS integrations and workflow orchestration

    The expectation is leadership-level ownership. You will define the automation roadmap, architect solutions, and drive initiatives from intake through deployment with measurable outcomes.

    As a Lead SRE, your mission is to:

    • Reduce human intervention across IT support workflows

    • Build automation systems that scale without increasing headcount

    • Architect reliable, observable, production-grade automation services

    • Establish engineering standards for automation development

    • Mentor junior engineers while maintaining direct ownership of delivery

    Success is measured in outcomes:

    • Reduced ticket creation rates

    • Increased fully automated resolution percentages

    • Improved user satisfaction while lowering operational burden

    This role requires deep technical capability combined with strong execution discipline and cross-functional influence.

    Responsibilities

    • Analyze ticket trends and identify systemic failure patterns

    • Redesign workflows to remove recurring pain points

    • Replace reactive fixes with preventative engineering solutions

    • Partner with IT and engineering stakeholders to prioritize high-leverage automation opportunities

    • Design and implement automation workflows across multiple SaaS platforms

    • Integrate with third-party and internal APIs (e.g., identity providers, collaboration tools, asset systems, ticketing platforms)

    • Architect resilient API integrations including:

      • Authentication & authorization flows (OAuth2, SAML, token management)

      • Rate limiting and retry strategies

      • Error handling and observability

    • Build self-service systems that allow users to resolve common requests without human escalation

    When no off-the-shelf solution exists, you will:

    • Build lightweight microservices or serverless functions (Python or Go preferred)

    • Develop internal middleware, proxies, or orchestration services

    • Create background automation jobs (cron-style processes)

    • Containerize and deploy services using modern DevOps practices

    You will make thoughtful build-vs-buy decisions, balancing speed, maintainability, and long-term scalability.

    Automation must be as reliable as any production system.

    You will:

    • Implement Infrastructure as Code (Terraform, Pulumi, or similar)

    • Maintain CI/CD pipelines for automation services

    • Design monitoring, logging, and alerting frameworks

    • Define SLIs/SLOs to measure automation reliability

    • Ensure automation services are secure, observable, and resilient

    This is not scripting — this is platform-grade engineering.

    This role requires operating as a single-threaded owner for major initiatives.

    You will:

    • Define solution architecture from concept to deployment

    • Set timelines and milestones autonomously

    • Conduct feasibility validation in development environments

    • Communicate proactively with stakeholders

    • Re-scope tactically to maintain forward momentum when blocked

    • Deliver measurable impact — not just activity

    You are expected to think systemically, move with urgency, and drive initiatives to completion without requiring micro-management.

    Requirements

    • 8+ years in SRE, Platform Engineering, DevOps, or Automation Engineering

    • Proven experience designing enterprise-scale automation systems

    • Strong exposure to IT support domains (access, provisioning, identity, device lifecycle, SaaS operations)

    • Deep experience designing and consuming REST APIs

    • Strong understanding of authentication and authorization patterns

    • Experience orchestrating workflows across multiple SaaS platforms

    • Strong proficiency in Python or Go

    • Experience building production-ready services

    • Advanced scripting for orchestration and automation logic

    • Strong familiarity with at least one major cloud provider (AWS, GCP, or Azure)

    • Containerization and Kubernetes exposure

    • Infrastructure as Code experience

    • Networking fundamentals

    • Identity and access concepts

    • Understanding of asset lifecycle management

    • Experience leading technical initiatives from idea through deployment

    • Ability to mentor junior engineers

    • Strong written and verbal communication skills

    • Comfortable influencing cross-functional stakeholders

    • Data-driven decision-making approach

    You think in terms of leverage, scale, and long-term impact.

    Within 6–12 months, you will have:

    • Eliminated entire categories of recurring IT tickets

    • Implemented durable automation frameworks across core IT workflows

    • Increased automated resolution rates quarter over quarter

    • Reduced manual provisioning and access overhead

    • Established scalable, observable automation systems that continue to compound value

    Your impact will be visible in metrics — not anecdotes.

    Nice to Have

    • Experience integrating AI/LLM capabilities into workflow automation

    • Familiarity with ITSM frameworks

    • Background building internal self-service platforms

    • Experience presenting technical strategy to senior leadership

    • Experience operating in high-scale, compliance-sensitive environments

    What We Offer

    • 100% Remote Workplace

    • Unlimited Paid Time Off

    • Equity – Become a true owner of the company

    • 401(k) with company contribution and sponsored healthcare

    • Professional Growth – Access to training and certification programs

    Location & Eligibility

    Where is the job: HQ (on-site at the office)
    Who can apply: Same as job location

    Listing Details

    Posted: March 3, 2026
    First seen: May 6, 2026
    Last seen: May 8, 2026
