AI Agent Architect

India · Bangalore · mid

Emergent builds autonomous coding agents that replace traditional software development by generating, testing, and deploying production applications directly from plain-language intent. Our systems run in production at global scale and are used to build millions of real applications.

Since our public launch, we've crossed $100M in ARR and grown to over 10M users across 190+ countries. We're backed by Khosla Ventures, SoftBank, Google, Lightspeed, Prosus, Together, and Y Combinator.

We're solving the hard part of AI-driven software creation: correctness, reliability, security, and scale in real production systems. The team is made up of repeat founders, Olympiad medalists, IIT & IIM alumni, and leaders from Google, Amazon, and Dropbox.

We're hiring builders who want ownership, speed, and impact at global scale.

Responsibilities

→Develop deep, evidence-grounded intuition for how the agent thinks, succeeds, and fails across the full range of real-world use
→Mine production behavior for the failure modes, regressions, and bottlenecks most teams never see, and turn them into clear, quantitative signal
→Define and run high-leverage experiments that improve agent quality, reliability, and code outcomes, spanning prompt tuning, eval dataset creation, experimentation, and harness engineering
→Build and evolve evaluation frameworks that measure agent quality at scale: define the metric, build the dataset, validate it against known signals, and ship dashboards that make regressions impossible to miss
→Ship with rigor through clear metrics, evaluation gates, staged rollouts, and explicit rollback criteria
→Drive the hard problems in context engineering, memory systems, tool use, and long-horizon execution, where agent reliability is actually won or lost
→Make hard calls in subjective, probabilistic systems: when a regression is real, when a win is noise, when a benchmark is overfit, when to ship on mixed signals, and when to kill a promising direction
→Think like the agent and continuously make it smarter, more reliable, and more useful

5+ years building and shipping software, with real end-to-end ownership
At least 1 year of hands-on experience with agentic systems and their evaluation
Strong systems thinking paired with sharp product judgment
Hands-on with experimentation, metrics, and debugging; fluent in Python, SQL, or similar for analysis
Comfortable reasoning about noise, confounds, distribution shift, and selection effects; you know when to trust a number and when to suspect it
Energized by the long tail: sifting through large volumes of agent behavior to find the rare, hidden failure mode is the fun part for you
Deep, current interest in LLMs, agents, and emerging model capabilities; you know what shipped last week and why it matters
An independent operator with leadership presence: you scope your own work and push back on weak ideas, including your manager's
Bias toward velocity without compromising honest measurement; you know which corners are safe to cut and which are load-bearing

Principal or Staff ICs, software architects, and lead builders
Engineers who moved into product and never stopped building
Technical cofounders who have owned systems end-to-end and want to stay close to the work