huaweicanada
huaweicanada~1mo ago

Researcher - Reinforcement Learning

CanadaCanada·Burnaby,Markham,Montréal+1 moremid
ResearcherRecruitment & Talent Acquisition
1 views0 saves0 applied

Quick Summary

Overview

Huawei Canada has an immediate 12-month contract opening for a Reinforcement Learning Researcher. About the team: Founded in 2012, the Noah’s Ark lab has evolved into a prominent research organization with notable achievements in academia and industry.

Technical Tools
pythonpytorchdeep-learning

Founded in 2012, the Noah’s Ark lab has evolved into a prominent research organization with notable achievements in academia and industry. The lab’s mission focuses on advancing artificial intelligence and related fields to benefit the company and society. Driven by impactful, long-term projects, the aim is to enhance state-of-the-art research while integrating innovations into the company's products and services, including LLMs, RL, NLP, computer vision, AI theory, and Autonomous driving.

About the Role

~1 min read
  • Enabling Large Language Models (LLMs) to learn from experience, interaction, and environment feedback, moving beyond static fine-tuning toward continual, agentic self-improvement.

  • LLM post-training paradigms (e.g., RLHF, GRPO, reward-free methods, etc.).

  • Agentic reinforcement learning for tool-using and browsing-based LLMs trained in interactive environments.

  • Agentic evaluation and benchmarking, including design of multi-turn, verifiable reasoning tasks.

  • Your work will involve implementing and evaluating new training and evaluation pipelines for reasoning-enhanced LLMs and tool-using agents, scaling experiments on large GPU clusters, and contributing to scientific insights and publications in this emerging area.

  • PhD degree in Computer Science or related fields or master's degree with comparable experience.

  • Strong foundation in deep learning, including architectures such as Transformers and optimization techniques for large models.

  • Practical or research experience in reinforcement learning, self-supervised learning, or language model fine-tuning.

  • Proven research record in AI by having at least one paper as the first author in top tier venues, such as NeurIPS, ICML, ICLR, CVPR, ICCV, ECCV, ICRA.

  • Solid proficiency in Python and experience with PyTorch, DeepSpeed, Megatron and other distributed training frameworks.

  • Familiarity with LLM post-training pipelines (RLHF, GRPO/PPO, SFT, LoRA, MoE, etc.) is an asset.

  • Experience with multi-agent RL, tool-use / browser/coding agents, is an asset.

  • Strong communication and writing skills; enthusiasm for open research and collaborative problem-solving.

Huawei aims to support a French-speaking work environment for its employees in Quebec. We have taken steps to avoid requiring a language other than French for this position. However, proficiency in English is essential for this role for the following reasons:

The person will be required to communicate regularly with colleagues located outside Quebec, where English is the primary language used for communication between offices. In addition, the nature of the tasks related to this position, which falls within a highly specialized field of artificial intelligence, also requires knowledge of English.

Location & Eligibility

Where is the job
Edmonton, Canada
On-site at the office
Who can apply
CA

Listing Details

First seen
May 6, 2026
Last seen
June 7, 2026

Posting Health

Days active
32
Repost count
0
Trust Level
14%
Scored at
June 7, 2026

Signal breakdown

freshnesssource trustcontent trustemployer trust
Newsletter

Stay ahead of the market

Get the latest job openings, salary trends, and hiring insights delivered to your inbox every week.

A
B
C
D
Join 12,000+ marketers

No spam. Unsubscribe at any time.

huaweicanadaResearcher - Reinforcement Learning