About Abaka AI
Abaka AI is built on one mission: to be the world’s most trusted data partner for AI companies. More than 1,000 industry leaders across Generative AI, Embodied AI, and Automotive AI rely on us to power their data pipelines. With headquarters in Silicon Valley and teams in Paris, Singapore, and Tokyo, we support global partners with fast, reliable, and scalable data solutions.
Our offerings include a diverse catalog of off-the-shelf datasets (image, video, multimodal, reasoning, 3D, and beyond) as well as comprehensive data collection and annotation services. Whether teams need raw data, curated datasets, or full-cycle data engineering, Abaka AI provides the foundation for building high-performance AI systems.
About the Role
We’re hiring our first Data Engineer in the United States, a foundational role that will shape Abaka AI’s data engineering standards, systems, and culture from day one. This is an opportunity to take full ownership of how multimodal data is sourced, processed, cleaned, annotated, and delivered to some of the world’s most advanced AI teams.
You won’t just be building pipelines—you’ll be developing the infrastructure that powers frontier AI models. You’ll partner directly with foundation model teams to understand their data needs, translate them into scalable workflows, and deliver high-quality multimodal datasets that meaningfully impact model performance.
As an early member of our engineering team, you’ll influence everything from our long-term roadmap to our internal tooling ecosystem. If you thrive in high-ownership environments and want to shape the machine learning foundation of a fast-moving AI company, this role offers an opportunity to make an immediate and lasting impact.