bespokelabs

Backend Engineer

United States · Mountain View · Full-time · Mid-level
Engineering · Backend Developer

Bespoke Labs is an applied AI research lab pioneering data and RL environment curation for training and evaluating agents.

Recently, we curated Open Thoughts, one of the best open reasoning datasets used by multiple frontier labs, trained SOTA specialized models such as Bespoke-MiniChart-7B and Bespoke-MiniCheck, and built the environment infrastructure that frontier labs and enterprises use to make their agents reliable.

Bespoke is uniquely positioned to capture a large share of data and RL environment curation.

About the Role

We're looking for an Infrastructure Engineer to own the execution layer beneath our RL environments: the systems that let an agent operate inside a realistic, multi-tool world coherently for hours or days.

This is a hard systems problem disguised as an AI job. As the tasks agents can complete keep lengthening, the environments that train them have to stay coherent across far longer horizons than anything that exists today. That means sandboxing and isolation you can trust, execution that's fast and cheap enough to run at training scale, and the ability to snapshot, restore, inspect, and branch a running environment instead of treating every rollout as one-shot. You'll build the platform that makes all of this possible.

You'll work closely with our research and data teams, and directly with frontier labs and enterprise customers, to turn environment designs into infrastructure that runs reliably in production.

Responsibilities

  1. Environment Execution & Sandboxing

    • Design and own the sandboxing and execution layer that environments run inside. Build systems to snapshot and restore environment state (disk, process, and, where relevant, memory and accelerator state) so runs can be paused, resumed, inspected, and branched rather than executed once.

    • Develop the machinery to detect failure modes early in a rollout (reward hacks, infra faults, fairness issues) and to revert to a known-good state, patch, and continue.

    • Extend execution to long-horizon and multi-node environments, where an agent operates across many tools and services over hours or days.

  2. Performance & Scale

    • Own the performance characteristics of the platform: throughput, latency, and cost-per-rollout at scale.

    • Drive utilization and scheduling so we can run far more environment rollouts per dollar without sacrificing reliability.

    • Profile and remove bottlenecks across the stack, from container startup to environment teardown.

    • Build the observability that lets us understand what's happening inside thousands of concurrent, long-running rollouts.

  3. Environment Platform

    • Build and maintain the framework for specifying, packaging, and deploying RL environments, used internally by both humans and agents authoring environments.

    • Create the tooling that lets researchers and environment authors debug a specific failure across hundreds of long agent traces.

  4. Collaboration & Production Excellence

    • Scale prototypes into production systems with reproducible workflows and high engineering standards.

    • Write the documentation and tools that let internal teams and external users build on the platform.

Qualifications

  1. Systems & Infrastructure

    • Strong track record building production systems or research infrastructure at scale: distributed systems, execution engines, container/sandboxing infrastructure, or similar.

    • Deep comfort with the systems layer: containers and isolation (e.g. namespaces, cgroups, VMs, gVisor/Firecracker-style sandboxing), filesystems, process and state management.

    • Experience making systems fast and cheap — profiling, scheduling, resource utilization, and cost optimization at scale.

    • Proficiency with cloud platforms (GCP, AWS) and distributed computing.

    • Strong engineering fundamentals and a systematic approach to testing, validation, and reliability.

  2. Execution & Ownership

    • Comfort operating in ambiguity.

    • Strong Python skills; comfort in a systems language (Rust, Go, or C++) is a plus.

    • Ability to use modern tools such as Claude Code effectively.

  3. Collaboration & Communication

    • Excellent communication skills for working with research teams and enterprise customers.

    • Ability to translate between research needs and infrastructure requirements.

    • Comfortable presenting technical work to diverse audiences.

Nice to Have

Experience with RL training or evaluation infrastructure, or the execution layer for agent rollouts.

Experience with checkpoint/snapshot-restore systems, CRIU, or distributed state management.

Background in high-throughput, low-latency execution systems.

Contributions to widely-used infrastructure, datasets, benchmarks, or open-source systems.

Previous experience in a research engineering or infrastructure role at an AI or systems-heavy company.

Location: Mountain View, CA

Compensation: Competitive salary and equity

Benefits: Health coverage and the opportunity to work directly with the world's leading AI research labs

Location & Eligibility

Where is the job: Mountain View, United States (hybrid; some on-site time required)
Who can apply: US

Listing Details

Posted: May 27, 2026
First seen: May 27, 2026
Last seen: May 29, 2026
