Applied Control Researcher
Quick Summary
Application deadline: We are actively conducting interviews and aim to fill this role as soon as we find a suitable candidate.

The Opportunity

Join our new AGI safety product team and help transform AI control research into practical tools that directly reduce risks from AI.
TL;DR: You will design and implement control protocols (see, e.g., [Greenblatt et al., 2023]) and test them on real-world production systems at scale.
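A control protocol in the Greenblatt et al. sense routes an untrusted model's output through a trusted monitor and escalates when the monitor is suspicious. The following is a minimal illustrative sketch, not the team's actual implementation; the function name, `Decision` type, and threshold values are all assumptions:

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Decision:
    action: str        # "accept", "defer", or "audit"
    suspicion: float   # monitor's suspicion score in [0, 1]

def trusted_monitoring(
    untrusted_output: str,
    monitor: Callable[[str], float],  # trusted model scoring suspicion
    audit_threshold: float = 0.95,    # hypothetical threshold
    defer_threshold: float = 0.80,    # hypothetical threshold
) -> Decision:
    """Route an untrusted model's output based on a trusted monitor's score."""
    suspicion = monitor(untrusted_output)
    if suspicion >= audit_threshold:
        # Most suspicious outputs are escalated to human review.
        return Decision("audit", suspicion)
    if suspicion >= defer_threshold:
        # Borderline outputs are replaced with a trusted model's output.
        return Decision("defer", suspicion)
    return Decision("accept", suspicion)
```

The key design lever in such protocols is how the thresholds trade off safety (catching attacks) against usefulness (not discarding benign work).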
You will like this opportunity if you're passionate about using empirical research to make AI systems safer in practice. You enjoy the challenge of translating theoretical AI risks into concrete detection mechanisms. You thrive on rapid iteration and learning from data. You want your research to directly impact real-world AI safety.

What You Will Do
- Systematically collect and catalog coding agent failure modes from real-world instances, our internal deployments, public examples, research literature, and theoretical predictions
- Design and conduct experiments to test monitor effectiveness across different failure modes and agent behaviors
- Build and maintain evaluation frameworks to measure progress on monitoring capabilities
- Build and maintain high-quality datasets to train and test monitors on
- Iterate on monitoring approaches based on empirical results, balancing detection accuracy with computational efficiency
- Stay current with research on AI safety, agent failures, and detection methodologies
- Stay current with research on code security and safety vulnerabilities
- Develop and maintain a comprehensive library of monitoring prompts tailored to specific failure modes (e.g., security vulnerabilities, goal misalignment, deceptive behaviors)
- Experiment with different reasoning strategies and output formats to improve monitor reliability
- Design and test hierarchical monitoring architectures and ensemble approaches
- Optimize log pre-processing pipelines to extract relevant signals while minimizing latency and computational costs
- Implement and evaluate different scaffolding approaches for monitors, including chain-of-thought reasoning, structured outputs, and multi-step verification
- Fine-tune open-source models to create efficient monitors for high-volume production environments
- Design and build agentic monitoring systems that autonomously investigate logs to identify both known and novel failure modes
- Build automated red-teaming pipelines that attack monitors at scale
- Design iterative adversarial games in which a red team and a blue team continuously attack and defend, respectively
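The hierarchical-monitoring idea above (balancing detection accuracy with computational cost) can be sketched as a cheap-first cascade: a fast monitor screens every transcript, and an expensive monitor is invoked only on borderline cases. This is an illustrative assumption about one possible architecture, with the function name and escalation band chosen for the example:

```python
from typing import Callable, Tuple

def hierarchical_monitor(
    transcript: str,
    cheap: Callable[[str], float],      # fast, low-cost monitor
    expensive: Callable[[str], float],  # slower, higher-accuracy monitor
    escalate_band: Tuple[float, float] = (0.3, 0.9),  # hypothetical band
) -> float:
    """Run a cheap monitor first; pay for the expensive one only when borderline."""
    score = cheap(transcript)
    lo, hi = escalate_band
    if lo <= score <= hi:
        # Borderline verdicts get a second opinion; keep the more suspicious score.
        score = max(score, expensive(transcript))
    return score
```

Scores clearly below the band are accepted and scores clearly above it are flagged without the extra call, so the expensive model's latency and cost are spent only where they change the decision.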
Posted: December 17, 2025