featherlessai

AI Researcher – Multilingual Data

Worldwide · Remote · Full-time · Mid-level
Other · AI Researcher

Technical Tools
Python, PyTorch

About the Role

We’re looking for an AI Researcher focused on multilingual data to help us build and scale next-generation language models across diverse languages and domains. You’ll own research and execution around data sourcing, curation, evaluation, and training strategies for multilingual and low-resource languages, with a strong emphasis on publishing high-quality research and translating it into production systems.

This role is ideal for someone who enjoys working close to the frontier: balancing papers, prototypes, and real-world impact in a fast-moving startup environment.

Responsibilities

  • Design and execute research on multilingual datasets, including data collection, filtering, deduplication, and quality measurement

  • Develop strategies for low-resource and long-tail languages (sampling, augmentation, curriculum design)

  • Research and improve cross-lingual transfer, alignment, and robustness in large language models

  • Build and maintain evaluation benchmarks for multilingual performance

  • Collaborate with engineers and researchers on training pipelines and model architecture decisions

  • Publish research at top venues (e.g., ACL, EMNLP, NeurIPS, ICML, ICLR) and contribute to open source when appropriate

  • Translate research insights into practical improvements in production models

Requirements

  • Strong background in NLP / ML research, with a focus on multilingual or cross-lingual modeling

Nice to Have

  • Experience with low-resource languages or non-Latin scripts

  • Open-source contributions in NLP or data tooling

  • Experience training or evaluating large language models

  • Familiarity with multilingual benchmarks (e.g., XTREME, FLORES, TyDi QA)

What We Offer

  • Real ownership over research direction and impact

  • A team that values papers and production

  • Access to meaningful scale: large datasets, modern infrastructure, and fast iteration

  • Competitive compensation and meaningful equity at an early stage

Location & Eligibility

Where is the job: Worldwide (fully remote, anywhere in the world)
Who can apply: Same as job location

Listing Details

Posted: January 23, 2026
First seen: May 6, 2026
Last seen: May 8, 2026

Posting Health

Days active: 0
Repost count: 0
Trust level: 23%
Scored at: May 6, 2026
