GoDaddy

Senior Manager System Engineering


Quick Summary

Overview

Location Details: Colombia, remote. At GoDaddy, the future of work looks different for each team.

Join GoDaddy's Forge Ops team at the intersection of Data, Infrastructure, and AI-driven operations. As Senior Manager, Systems Engineering, you will lead the reliability, cost efficiency, and agentic operation of the Data & AI ecosystem that serves GoDaddy. This is a deeply technical leadership role, not a hands-off manager position. You will operate as GoDaddy's L1/L2 authority over critical analytics and data platforms while advancing Forge Operations: a structured operating model designed to transition platform operations from hero-based, expert-dependent support to system-based, agent-assisted, self-improving operations. If you can translate a business problem into a technical architecture and that architecture into team execution, and you want to build the AI Ops pattern for a large-scale data organization, this role is for you.

Responsibilities

  • Own and operate GoDaddy's analytical and data intelligence platforms (Redshift, QuickSight, FeedDB, Protegrity, Alation) as the authoritative L1/L2 platform owner, driving reliability, deployment standards, cost optimization, and user enablement across an ecosystem with a 50PB+ data lake and thousands of consumers.
  • Lead 24/7 incident management and production operations across 10+ Data & AI platforms, owning MTTR/MTTD targets, AAR rigor, and a root-cause-to-control loop that converts every incident into a runbook, monitoring improvement, or automation — not just a resolved ticket.
  • Architect and advance Forge Ops OS, the team's agent-based operating model. This model uses history-informed early warning, auto-recovery agents, runbook intelligence, and bounded agentic orchestration. The team transitions from operating systems directly to leading all aspects of the agents that operate those systems.
  • Drive data platform cost efficiency through unit economics (cost per query, cost per workload, cost per dashboard visit), translating AWS spend into measurable business metrics and continuous optimization across Redshift, QuickSight, DPaaS, and ML infrastructure.
  • Manage operational planning and executive reporting on weekly, monthly, and quarterly cadences. Run a sprint-based improvement program with nearly 70% of capacity allocated to strategic work. Provide clear traceability from team execution to company goals and landmark outcomes.

Experience and qualifications

  • 5+ years of demonstrated 24/7 production operations leadership: leading incident response end to end, owning MTTR performance, running post-mortems (AARs) that produce controls, and driving the systemic fixes that reduce incident recurrence.
  • Hands-on AWS architecture and platform expertise (Redshift, EMR/Airflow, Lambda, EKS, S3, IAM/RBAC, and CDK/CloudFormation), with end-to-end operational and cost ownership of at least two production data or analytics platforms.
  • Systems and software architecture fluency: able to translate business requirements into scalable technical designs, reason about architectural trade-offs, and decompose solutions into actionable engineering tasks without deferring all technical judgment to individual contributors.
  • Data platform operations at scale (ETL/ELT pipelines, data lakes, orchestration frameworks such as Airflow and EMR, and BI tooling), with deep understanding of data quality, SLAs, lineage, and the dependency chains that connect producers to executive-facing consumers.
  • Technical team leadership with operational rigor: proven ability to lead engineers through sprint-based planning, capacity management, and cross-functional delivery, while maintaining the hands-on technical credibility to unblock, review, and elevate the team's output.
  • Experience with AI/agentic operations — building or operating LLM-based tools such as automated runbooks, incident response agents, AAR generation systems, or bounded auto-recovery workflows.
  • Familiarity with graph databases or lineage/observability architectures (e.g., Neptune or equivalent) for dependency mapping, early warning, and blast-radius analysis in large data ecosystems.
  • Hands-on experience with Databricks or analytical compute platforms (Lakehouse, feature stores, ML infrastructure) in a production operations context.
  • Experience with data protection platforms (e.g., Protegrity) and PII/tokenization workflows in large-scale data lake or analytics environments.
  • Familiarity with ServiceNow/CMDB or equivalent incident management systems (Jira, PagerDuty) as operational systems of record, including MTTR/MTTD tracking and CI/lineage integration.

We encourage you to apply even if your experience or abilities don’t align perfectly with every requirement. We value a wide range of backgrounds and transferable skills, and we are excited to support learning and growth.

Requirements

Location & Eligibility

Where is the job
Colombia
On-site within the country
Who can apply
CO

Listing Details

Posted
June 2, 2026
First seen
June 2, 2026
Last seen
June 2, 2026

Posting Health

Days active
0
Repost count
0
Trust Level
67%
Scored at
June 2, 2026


GoDaddy helps the world easily start, confidently grow, and successfully run an online presence.

Employees
5k+
Founded
1997