Product Manager
About Arena Intelligence
Arena Intelligence is the open platform for evaluating how AI models perform in the real world. Created by researchers from UC Berkeley’s SkyLab, our mission is to measure and advance the frontier of AI for real-world use.
Millions of people use Arena Intelligence each month to explore how frontier systems perform — and we use our community’s feedback to build transparent, rigorous, and human-centered model evaluations. Leading enterprises and AI labs rely on our evaluations to understand real-world reliability, alignment, and impact. Our leaderboards are the gold standard for AI performance — trusted by leaders across the AI community and shaping the global conversation on model reliability and progress.
We’re a team of researchers, engineers, academics, and builders from places like UC Berkeley, Google, Stanford, DeepMind, and Discord. We seek truth, move fast, and value craftsmanship, curiosity, and impact over hierarchy. We’re building a company where thoughtful, curious people from all backgrounds can do their best work. Everyone on our team is a deep expert in their field — our office radiates excellence, energy, and focus.
About the Role
Own the roadmap and product strategy for Arena's evaluations and leaderboard platform.
Partner closely with ML researchers to translate emerging evaluation methodologies — multimodal evals, agentic workflows, reasoning traces, and new benchmark categories — into production-quality product experiences.
Define how evaluation research moves from prototype → implementation → launch → ecosystem adoption.
Drive cross-functional execution across research, engineering, design, and marketing to close the gap between research artifacts and trusted user-facing infrastructure.
Prioritize what gets evaluated next based on frontier model trends, developer demand, ecosystem gaps, and strategic opportunities.
Build systems, workflows, and operational rigor around evaluation quality, release cadence, and leaderboard credibility.
Own product metrics across adoption, engagement, citations, frontier-lab participation, and evaluation throughput.
Engage directly with frontier labs, researchers, developers, and enterprise users to identify where current evaluation systems break down and where the ecosystem is headed next.
Help shape how Arena balances evaluation rigor, usability, neutrality, and speed as the platform scales.
5–8 years of product management experience in highly technical or ambiguous environments.
Strong familiarity with modern AI systems, including LLMs, multimodal models, agents, reasoning systems, and evaluation methodologies.
A track record of shipping technically complex products from concept to production.
Experience translating research-heavy or technically ambiguous work into clear product direction and execution.
Strong systems thinking — you can identify bottlenecks, coordination gaps, and scaling constraints across technical and organizational systems.
Exceptional cross-functional leadership skills. You can align researchers, engineers, and designers without relying on formal authority.
High agency and strong product judgment. You move quickly, make decisions with incomplete information, and create structure where little exists.
Strong written communication. You can write specifications for researchers and product narratives for external technical audiences with equal clarity.
Nice to Have
Technical background in computer science, machine learning, or related fields.
Prior experience in evaluations, benchmarking systems, AI infrastructure, research tooling, or developer platforms.
Experience building products for technical audiences such as researchers, ML engineers, or developers.
Founder or early-stage startup experience.
What We Offer
Location & Eligibility
Listing Details
- Posted: May 13, 2026
- First seen: May 14, 2026
- Last seen: May 14, 2026
Posting Health
- Days active: 0
- Repost count: 0
- Trust Level: 54%
- Scored at: May 14, 2026