dexmate

Senior Software Engineer, AI Platform

Santa Clara Office | full-time | senior

Technical Tools
anthropic, docker, go, graphql, javascript, kubernetes, langchain, nextjs, openai, postgresql, python, react, rust, typescript, api-design, ci-cd, code-review, rest-apis, system-design

We are looking for a Senior AI Engineer to design, build, and ship AI-powered software across the full stack — from the agentic infrastructure that powers our robot operations, to the backend services that expose it, to the interfaces that operators and engineers use every day.

You will own AI features end-to-end: from system design through implementation, deployment, and production monitoring. You understand the failure modes of LLM-based systems — non-determinism, prompt injection, runaway tool-calling, token cost spirals — and you build guardrails that prevent them. You are equally comfortable writing agent orchestration logic, designing REST APIs, and shipping a React dashboard.

This is not a research role. We want engineers who have closed the loop from prototype to production with AI systems that real users depend on.

Responsibilities

  • Design, implement, and deploy production-grade AI agents: multi-step reasoning pipelines, tool-calling workflows, multi-agent coordination, and human-in-the-loop handoffs

  • Design and build agent harnesses — the runtime infrastructure (context management, tool definitions, memory, feedback loops, observability, and lifecycle control) that makes agents reliable in production; the model is a component, the harness is the product

  • Engineer context pipelines: dynamic retrieval, re-ranking, semantic search, and GraphRAG as tools within an agentic reasoning loop — not static RAG pipelines; understand when to retrieve, when to use long context, and when to use agent memory

  • Implement production-grade reliability: retry logic with backoff, cost controls, structured output validation, sandboxed tool execution, and checkpoint-resume for long-running agent workflows

  • Develop systematic evaluation frameworks (evals, golden datasets, regression suites, observability traces) that measure agent quality and catch regressions before production

  • Architect and implement scalable backend services and APIs (REST/GraphQL) in Go, Rust, or TypeScript/Node.js

  • Build and maintain integrations with external systems — databases, internal APIs, robot data streams — enabling agents to take real actions with appropriate access controls

  • Own deployment, monitoring, and observability: Docker, Kubernetes, CI/CD pipelines, and LLM-specific tracing and cost tracking

  • Build clean, functional web interfaces in React/Next.js — operator dashboards for robot fleet management, engineering tooling for the AI team, and customer-facing applications

  • Own features end-to-end: product requirements, implementation, testing, rollout, and ongoing maintenance

  • Treat prompt engineering as a first-class engineering discipline: write, test, and version prompts with the same rigor as application code

Requirements

  • 5+ years of professional software engineering experience with a full-stack production track record — this is a software engineering role first; strong fundamentals in system design, data structures, algorithms, and code quality are required

  • Strong command of Python and/or TypeScript at a production level: clean abstractions, testable code, performance awareness, and maintainability — not just scripting

  • Backend engineering depth: Go, Rust, or TypeScript/Node.js for production services — RESTful and GraphQL API design, relational database modeling (PostgreSQL), async programming, caching, and system integration via APIs and webhooks; Python for AI/ML integration and scripting

  • Frontend engineering proficiency: React, Next.js, TypeScript — able to architect and ship functional, production-grade UIs, not just wire up component libraries

  • Software delivery practices: automated testing (unit, integration, end-to-end), CI/CD pipelines, code review, and observability (logging, metrics, alerting)

  • Containerization and deployment: Docker, Kubernetes — able to own a service from code to production without a DevOps handoff

  • Proven, hands-on experience building and deploying LLM-powered systems or AI agents in production — beyond prototypes; you understand the real failure modes (non-determinism, prompt injection, tool-calling loops, cost spirals)

  • Experience with at least one LLM API (Anthropic Claude, OpenAI, or equivalent) and agentic frameworks (LangChain, LangGraph, PydanticAI, or similar)

  • Ability to design agent architectures with appropriate guardrails: structured output validation, retry logic, fallback handling, and human-in-the-loop patterns

Requirements

  • Familiarity with harness engineering patterns: AGENTS.md structured repositories, architectural constraint enforcement via linters, observability-driven agent iteration, and agent-first documentation as living systems — not static docs

  • Understanding of context engineering beyond naive RAG: agentic retrieval, GraphRAG, hybrid search, semantic layers, and when long context windows are a better fit than retrieval

  • Experience with durable execution patterns (Temporal, or similar) for long-running or stateful agent workflows with checkpoint-resume

  • Vector database and embedding experience (Pinecone, Weaviate, pgvector, Voyage AI, etc.) — but as one tool in a broader context engineering stack, not the whole solution

  • Background in robotics, industrial automation, or IoT — experience building software that connects to physical hardware or real-time data streams

  • Experience designing multi-tenant platforms or internal developer platforms (SDKs, golden-path tooling, shared infrastructure)

  • Familiarity with prompt injection risks, sandboxed code execution, and AI security considerations for agents that take real-world actions

  • Active use of AI coding agents (Claude Code, Codex, Gemini, or equivalent) as a core part of your development workflow — you know how to get 10x leverage from them without shipping broken code

Location & Eligibility

Location: Santa Clara Office (on-site)
Who can apply: Same as job location

Listing Details

Posted: April 12, 2026
First seen: May 6, 2026
Last seen: May 8, 2026
