Judgment Labs — Agent Product Engineer

United States · San Francisco · mid

Engineering · Product Engineer

Type: Full-time | On-site | San Francisco (FiDi), CA
Compensation: $200,000–$350,000 + Competitive Equity
Hiring count: 1
Visa sponsorship: Yes (H-1B)
Reports to: Not specified on the page

Judgment Labs builds infrastructure for Agent Behavior Monitoring (ABM). Where traditional observability logs exceptions and latency, ABM surfaces behavioral anomalies — instruction drift, context retrieval loss — in production at scale. Hundreds of teams building autonomous agents rely on Judgment to understand post-deployment behavior: clustering patterns across conversations and workflows, correlating regressions to specific interaction types, and pinpointing where reliability breaks down.

Founded: Not stated | Team size: ~19 (per role body) | Total funding: $30M+ (two rounds in the past 5 months)
Industry: AI infrastructure / agent observability & evals
Website: judgmentlabs.ai
Office: San Francisco (FiDi)
Investors: Lightspeed, SV Angel, Valor Equity Partners, Nova Global, Chris Manning, Michael Ovitz, Michael Abbott, Cory Levy, Kevin Hartz, and others.

Funded momentum: $30M+ raised across two rounds in five months from top-tier investors.
Build 0-to-1: ship agent capabilities from a blank page on a fast-scaling product surface.
Founding-track: small (~19-person) team; fast track to founding-level experience and direct customer interaction.
Perks: full benefits package, Equinox membership, private chef.

No intake call summary was present on the role page. (flag — request from Contrario if one exists.)

Judgment Labs is hiring an Agent Product Engineer to build high-taste products for self-learning agents. The role is majority agent work on a full-stack baseline. Candidates can come from either a front-end/design engineering background or an AI engineering background, but in both cases they must have prior experience building and designing agents, ideally at a startup with 0-1 product ownership.

Responsibilities

→ Build high-taste agent products that pair powerful behavior with consumer-grade UX polish
→ Design and ship agent capabilities from 0 to 1 inside a fast-scaling product surface
→ Contribute across the full stack as needed, with the majority of work on agent infrastructure and product
→ Translate customer feedback on agent behavior into concrete product iterations
→ Help raise the bar on product taste and craft as the team grows past 19 people

Tech stack: TypeScript (full-stack); no further stack specified.

Requirements

3-7 years engineering
Agent or Applied AI background required
0-1 product ownership comfort
Front-end/design engineer or AI engineer
TypeScript fluency
Evals experience heavily preferred

Shipped agents in production at a startup. Can walk through context design, tool design, reasoning loop tradeoffs.
Strong product engineering range. Has shipped customer-facing product features end to end, not just backend infrastructure or research artifacts.
Demonstrably strong communicator. Customer calls, internal demos, technical writing, public talks, or comparable signal of comfort explaining complex AI concepts.
Prior evals, observability, or behavior-monitoring product experience. Direct adjacency to Judgment's space, heavily preferred.
Prior 0-1 product ownership at a seed or Series A startup. Built agents from a blank page.
High-intensity background signal: olympiad medals, debate competitions, competitive athletics, founder experience, or comparable high-output indicators.

Not a fit

No prior agent or Applied AI experience. The bar is production agent work.
Front-end or design engineering only without agent depth.
Backend or infrastructure only without product-shipping experience or customer-facing range.
Weak customer-facing skills. The seat is customer-facing; candidates who want heads-down work only are not a fit.
Traditional ML or model-training background without agent system experience.
Cannot or will not work 5 days in person at the FiDi office.

Salary: $200,000–$350,000
Equity: Competitive (not quantified)
On-site policy: 5 days/week in office (Monday–Friday), FiDi
Visa sponsorship: H-1B
Employment type: Full-time
Location: San Francisco (FiDi), CA (header shows "Chinatown, CA"; see flag)

(Contrario "Required Candidate Q&A" form fields)

Where are you currently located?
LinkedIn URL
Are you legally authorized to work in the United States?
Will you require work sponsorship now and/or in the future?
Are you located in the San Francisco-area and/or willing to relocate?
Are you willing to work on-site in our San Francisco office Monday-Friday?

Stage 1: Pending Approval. Candidates awaiting initial approval.
Stage 2: Founder vibe check (30 min). Plus optional 15-min deeper dive into technical projects.
Stage 3: Technical Interview (75 min). Problem-solving (30 min) + role-specific interview (45 min).
Stage 4: Work Trial
Stage 5: Offer Extended
Stage 6: Candidate Hired. When the candidate accepts and starts.

Pulled from the role page's Ideal Companies grid.

Clearly recognizable: Linear, Cursor, Ramp, Figma, Vercel, CockroachDB, Modal Labs, Anyscale, Runway, Applied Intuition, Anduril, Notion, Nomic AI, MotherDuck, LangChain

None provided on the role page.