Apollo Research · 1mo ago

Software Engineer (Infrastructure)

United States · San Francisco · Full-time · Mid-level

Software Engineer · Software Engineering

Overview

Application deadline: We are actively interviewing and aim to fill this role as soon as we find a suitable candidate.

Apollo Research's mission is to reduce the risks from scheming frontier AI systems. We work with and are trusted by every frontier AI lab. We test their models before deployment and collaborate with them on scheming mitigations.

We're looking for a Software Engineer to build the platform that the rest of Apollo runs on. This platform determines how quickly we can take on new research and scale our operations, whilst keeping our most valuable assets secure.

Apollo is fast-growing and fast-moving, and this role is critical both to enabling that growth and to strengthening the trust of our partners, which is essential to our mission.

Help set the vision for the platform. As Apollo takes on larger and more complex research, the platform must keep pace. You'll talk with researchers and engineers, understand where things are heading, and propose a plan for what we build, why, and in what order.

Build and maintain Apollo's cloud infrastructure. This means IaC, networking, environment management, observability, and cost control. Everything should be reproducible, auditable, and as automated as possible.

Protect Apollo’s assets. Frontier labs trust us to evaluate their models pre-release. To keep that trust, our infrastructure must let researchers do this work securely. You’ll be a key part of this effort, designing and implementing the controls that keep these assets safe.

Keep things running. We run a collection of services for our team. When something breaks, you'll be one of the first people they come to for help. You know when a quick fix is fine and when you need to take the time to do it properly.

Create infrastructure for agent tooling. As more of our team builds and deploys agents, you'll develop the platform that makes this safe, reliable, and maintainable. You'll work closely with our security lead to make sure the infrastructure meets the bar our partnerships require.

Here are examples of projects which you might build in your first 6 months:

Strengthening Security with Agents: Apollo faces threats every day, and we want to make sure our systems are battle-hardened. You would build harnesses that use agents to test Apollo’s internal isolation and public-facing infrastructure, finding, reporting, and remediating flaws before an attacker has a chance to exploit them.

Observability platform: Stand up the metrics, logging, and health monitoring layer that any Apollo engineer can plug into and immediately get usage, health, and security signals for their service.

Multi-cloud GPU orchestration: Build the platform that finds and provisions GPUs across providers on demand, so that a researcher asking for 10 H200s gets a working cluster with packages, networking, and IAM in minutes.

Internal service platform: With the help of coding agents, all employees are now capable of building services. This comes with both massive upsides and security concerns. You would build the paved road that mitigates these risks and lets any Apollo engineer, researcher, or other employee ship a service. This platform becomes a catalyst for all employees to deliver more securely.

7+ years of experience in infrastructure, platform, or DevOps engineering

Strong working experience with AWS, Azure, or GCP in a multi-account and multi-project setup

Experience with Infrastructure as Code (Terraform, Pulumi, or similar)

Experience with containerisation technologies like Docker

Experience designing and owning infrastructure for a growing engineering team, not just contributing to an existing setup

Experience building systems in programming languages such as Python, Rust, or Go

You've built and operated services that real users depend on, and you've been responsible for their uptime, reliability, and scaling

It will be a bonus if you have the following:

Exposure to agentic AI workflows or building platforms that support AI/ML workflows

Familiarity with GPU workloads, ML training infrastructure, or research compute

Demonstrated interest in AI safety (e.g. worked at an AI safety org, relevant coursework or research)

We want to emphasize that people who feel they don’t meet all of these criteria but think they would nonetheless be a good fit for the position are strongly encouraged to apply. We believe that excellent candidates can come from a variety of backgrounds and are excited to give you opportunities to shine.

This role offers a market-competitive salary, equity, and competitive benefits.

Salary: 215k - 265k USD

Flexible work hours and schedule

Unlimited vacation

Unlimited sick leave

Up to 6 months of paid parental leave

Comprehensive health, dental and vision insurance

Retirement savings with competitive employer matching (e.g. 401(k) for US employees)

Lunch, dinner, and snacks are provided for all employees on workdays

Paid work trips, including staff retreats, business trips, and relevant conferences

A yearly $1,000 (USD) professional development budget

Start Date: We are conducting interviews actively and aim to fill this role as soon as we find someone suitable.

Time Allocation: Full-time

Location: This is an in-person role based in our San Francisco office. We offer flexible working hours and work-from-home arrangements.

Visa sponsorship: We sponsor visas in the US. Sponsorship isn't guaranteed for every role or candidate, but if we make you an offer, we'll work with you to find the right visa route.

The SWE (Research) team currently consists of Rusheb Shah, Glen Rodgers, Andrei Matveiakin, and Alex Kedryk. Beyond the SWE team, you will closely interact with the research scientists and engineers as the primary user group of your tools. You can find our full team here.

The rapid rise in AI capabilities offers tremendous opportunities, but also presents significant risks. At Apollo Research, we’re primarily concerned with risks from Loss of Control, i.e. risks coming from the model itself rather than from, for example, humans misusing the AI. We’re particularly concerned with deceptive alignment / scheming, a phenomenon where a model appears to be aligned but is, in fact, misaligned and capable of evading human oversight. We work on the detection of scheming (e.g. building evaluations), the science of scheming (e.g. model organisms), and scheming mitigations (e.g. anti-scheming and control). We work closely with multiple frontier AI companies, e.g. to test their models before deployment or to collaborate on scheming mitigations.

At Apollo, we aim for a culture that emphasizes truth-seeking, being goal-oriented, giving and receiving constructive feedback, and being friendly and helpful.

If you’re interested in more details about what it’s like working at Apollo, you can find more information here.

Apollo Research is an Equal Opportunity Employer. We value diversity and are committed to providing equal opportunities to all, regardless of age, disability, gender reassignment, marriage and civil partnership, pregnancy and maternity, race, religion or belief, sex, or sexual orientation.

Please complete the application form and include your CV. A cover letter is optional. Please also feel free to share links to relevant work samples.

Screening interview
Take-home test (approx. 2 hours)
3 technical interviews
Final interview with Marius (CEO)

Our multi-stage process includes the steps above. The technical interviews will be closely related to tasks the candidate would do on the job. There are no leetcode-style general coding interviews. If you want to prepare for the interviews, we suggest working on hands-on LLM evals projects (e.g. as suggested in our starter guide), such as building LM agent evaluations in Inspect.

Your Privacy and Fairness in Our Recruitment Process: We are committed to protecting your data, ensuring fairness, and adhering to workplace fairness principles in our recruitment process. To enhance hiring efficiency, we use AI-powered tools to assist with tasks such as resume screening. These tools are designed and deployed in compliance with internationally recognized AI governance frameworks. Your personal data is handled securely and transparently. We adopt a human-centred approach: all resumes are screened by a human and final hiring decisions are made by our team. If you have questions about how your data is processed or wish to report concerns about fairness, please contact us at info@apolloresearch.ai.