alephalpha

Senior AI R&D Engineer - Model Evaluation (f/m/d)

Germany · Heidelberg · full-time · senior

Quick Summary

Requirements Summary

Understanding of foundation model training - how data, scale, and architecture affect capabilities. Experience with large-scale data processing or ML infrastructure.

Technical Tools
Python, PyTorch, distributed systems, machine learning

Aleph Alpha Research’s mission is to deliver category-defining AI innovation that enables open, accessible, and trustworthy deployment of GenAI in industrial applications. Our organization develops foundational models and next-generation methods that make it easy and affordable for Aleph Alpha’s customers to increase productivity in development, engineering, logistics, and manufacturing processes.

At Aleph Alpha, we foster a culture built on ownership, autonomy, and empowerment. Teams and individual contributors are trusted to take responsibility for their work and drive meaningful impact. We maintain a flat organizational structure with efficient, supportive management that enables quick decision‑making, open communication, and a strong sense of shared purpose.

About the Role

As a Senior AI R&D Engineer - Model Evaluation (f/m/d), you will work in the pre-training evaluations team. Our mission is to provide meaningful signals during pre-training runs and to supply other teams with additional metrics so they can make informed decisions (ablations).

Responsibilities

We are a mix of researchers and engineers, and you will support our engineering efforts. Key priorities include improving the testability of our code through design and architecture changes and reducing the time it takes to integrate a new benchmark end to end. You drive these changes through incremental, hands-on modifications of our code. At the same time, you are expected to handle smaller day-to-day tasks, e.g., maintaining our repositories, investigating a spurious benchmark result, or ironing out an out-of-memory error.

No two days are the same. Things move fast, and your ability to focus and prioritize is what lets you unblock the team day-to-day while designing the tooling and automation that speeds us up long-term. You will have real influence on what gets built and how. Your work directly shapes how quickly we can experiment and improve our models.

Requirements

  • Capable, driven, and open individual who thrives in a dynamic environment: LLMs are rapidly evolving, and we maintain flat hierarchies that offer the possibility to make an impact across a wide range of areas. Hence, above all, we are looking for highly talented individuals who thrive in such an environment. You should add something unique that helps our efforts, but nobody needs to tick a long list of boxes.

  • Willingness to relocate to Germany. Our primary working locations are Heidelberg and Berlin. We foster an on-site culture with direct communication and collaboration. As such, you should be on-site at your main work location at least two days a week. If you choose Berlin, you should be willing to travel to Heidelberg (our headquarters) every one to two months for a few days.

  • Software engineer with the ability to write code that other strong engineers want to build on.

  • Ability to incrementally convert a codebase with accumulated complexity into a more testable and explainable state.

  • Explainer: We make a lot of decisions together. Communicating your ideas and convincing the team of them is a pivotal skill.

  • Taking the initiative to drive and deliver high-impact work.

  • Degree in computer science, engineering, or a related field.

  • Strong Python skills.

  • Deep interest in and willingness to learn about LLM training.

(We encourage you to apply even if you don't check every box!)

  • Experience working with distributed systems.

  • Experience with infrastructure tooling and container orchestration, such as Docker, Kubernetes, and infrastructure as code.

  • Experience with LLM evaluation, benchmark design or evaluation dataset curation.

  • Understanding of foundation model training: how data, scale, and architecture affect capabilities.

  • Familiarity with statistical methods.

What We Offer

Become part of an AI revolution!
30 days of paid vacation
Access to a variety of fitness & wellness offerings via Wellhub
Mental health support through nilo.health
Substantially subsidized company pension plan for your future security
Subsidized Germany-wide transportation ticket
Budget for additional technical equipment
Flexible working hours for better work-life balance and hybrid working model
Virtual Stock Option Plan
JobRad® Bike Lease

Location & Eligibility

Where is the job: Heidelberg, Germany (hybrid, some on-site time required)
Who can apply: DE

Listing Details

Posted: April 17, 2026
First seen: May 6, 2026
Last seen: June 20, 2026

Posting Health

Days active: 45
Repost count: 0
Trust level: 18%
Scored at: June 20, 2026
