AI Researcher – Multilingual Data
About the Role
We’re looking for an AI Researcher focused on multilingual data to help us build and scale next-generation language models across diverse languages and domains. You’ll own research and execution around data sourcing, curation, evaluation, and training strategies for multilingual and low-resource languages, with a strong emphasis on publishing high-quality research and translating it into production systems.
This role is ideal for someone who enjoys working close to the frontier: balancing papers, prototypes, and real-world impact in a fast-moving startup environment.
Responsibilities
- Design and execute research on multilingual datasets, including data collection, filtering, deduplication, and quality measurement
- Develop strategies for low-resource and long-tail languages (sampling, augmentation, curriculum design)
- Research and improve cross-lingual transfer, alignment, and robustness in large language models
- Build and maintain evaluation benchmarks for multilingual performance
- Collaborate with engineers and researchers on training pipelines and model architecture decisions
- Publish research at top venues (e.g., ACL, EMNLP, NeurIPS, ICML, ICLR) and contribute to open-source when appropriate
- Translate research insights into practical improvements in production models
Strong background in NLP / ML research, with a focus on multilingual or cross-lingual modeling
Nice to Have
- Experience with low-resource languages or non-Latin scripts
- Open-source contributions in NLP or data tooling
- Experience training or evaluating large language models
- Familiarity with multilingual benchmarks (e.g., XTREME, FLORES, TyDi QA)
What We Offer
Location & Eligibility
Listing Details
- Posted: January 23, 2026
- First seen: May 6, 2026
- Last seen: June 25, 2026
Posting Health
- Days active: 49
- Repost count: 0
- Trust Level: 24%
- Scored at: June 25, 2026
Please let featherlessai know you found this job on Jobera.