connecthum

Audio | Multimodal ML Engineer

Paris, full-time, mid-level
Machine Learning Engineer, Data


Technical Tools
pytorch, deep-learning, etl

We’re hiring an Audio / Multimodal ML Engineer for a fast-growing AI infrastructure startup building the safety and control layer for large-scale AI systems.

  • AI-native product company operating in the AI safety & infrastructure space

  • Backed by international investors

  • Processing large-scale AI traffic across enterprise environments

  • Training and fine-tuning proprietary models for performance & reliability

  • Small, highly technical team shipping fast

The company builds the control and evaluation layer for AI systems, helping organizations define, test, and enforce how AI behaves in real-world environments.

Key Responsibilities

  • Train and fine-tune large-scale audio & multimodal models

  • Design and run experiments (architecture, data mixtures, training strategies)

  • Build and optimize audio data pipelines

  • Improve inference speed, latency and production readiness

  • Deploy models end-to-end in low-latency environments

  • Define meaningful evaluation metrics beyond benchmark scores

  • Collaborate closely with research & engineering

This is a hands-on role where research meets production.

  • PyTorch-based training pipelines

  • Large-scale distributed training

  • Speech & audio modeling architectures

  • Multimodal model integration

  • Model optimization (quantization, distillation, streaming inference)

  • Production deployment & serving systems

(Deep technical stack shared during interviews)

  • 3+ years training deep learning models in audio / speech domains

  • Strong experience with distributed training frameworks

  • Solid understanding of audio signal processing fundamentals

  • Experience shipping models to production (latency matters, not just metrics)

  • Experience building and maintaining data pipelines

  • Strong engineering hygiene (clean code, testing, versioning)

Nice to Have

  • Experience with multimodal architectures

  • Experience with alignment / fine-tuning techniques

  • Experience working in AI infrastructure or model optimization

What We Offer

  • Competitive compensation + equity

  • Hybrid setup in Europe + relocation support

  • Comprehensive health coverage

  • Top-tier hardware & tools

  • Team off-sites

  • Budget for learning & AI tooling

Location & Eligibility

Where is the job: Paris (on-site at the office)
Who can apply: Same as job location

Listing Details

Posted: April 17, 2026
First seen: May 6, 2026
Last seen: May 8, 2026

Posting Health

Days active: 0
Repost count: 0
Trust level: 19%
Scored at: May 6, 2026

Signal breakdown

freshness, source trust, content trust, employer trust