bland
$140K – $250K/yr • Offers Equity • Offers Bonus

Machine Learning Researcher, Audio

United States · San Francisco · full-time · mid
Other · Machine Learning Researcher

Quick Summary

Overview

Machine Learning Researcher / Engineer, Multimodal LLMs
Location: San Francisco, CA or Remote (US)

About Bland
At Bland.com, our mission is to empower enterprises to build AI phone agents at scale.

Technical Tools
machine-learning

Responsibilities

  • Design and train large-scale text-to-speech models capable of expressive, controllable, human-sounding output.

  • Develop neural audio codec-based TTS architectures for efficient, high-fidelity generation.

  • Improve prosody modeling, question inflection, emotional expression, and multi-speaker robustness.

  • Optimize for real-time, low-latency inference in production.

 
  • Build and fine-tune large-scale ASR systems robust to accents, noise, telephony artifacts, and code-switching.

  • Leverage self-supervised pretraining and large-scale weak supervision.

  • Improve transcription accuracy for real-world enterprise scenarios, including structured extraction and conversational nuance.

 
  • Research and implement neural audio codecs that achieve extreme compression with minimal perceptual loss.

  • Explore discrete and continuous latent representations for scalable speech modeling.

  • Design codec architectures that enable downstream generative modeling and controllable synthesis.

 
  • Curate and process massive audio datasets across languages, speakers, and environments.

  • Design staged training curricula and data filtering strategies.

  • Scale training across distributed GPU clusters, focusing on cost, throughput, and reliability.

 
  • Design ablation studies that isolate the impact of architectural changes.

  • Measure improvements using both objective metrics and perceptual evaluations.

  • Validate ideas quickly through focused experiments that confirm or eliminate hypotheses.

Qualifications

  • Experience with self-supervised learning, multimodal modeling, or generative modeling.

  • Ability to derive new formulations and implement them efficiently.

 
  • Hands-on experience building or scaling TTS, STT, or neural audio codec systems.

  • Familiarity with large-scale speech datasets and real-world audio variability.

  • Strong intuition for audio quality, prosody, and conversational dynamics.

 
  • Experience training and serving large models on modern accelerators.

  • Knowledge of inference optimization techniques, including quantization, kernel optimization, and memory efficiency.

  • Understanding of real-time constraints in telephony or streaming environments.

 
  • Track record of designing controlled experiments and meaningful ablations.

  • Comfortable working with both offline benchmarks and live production metrics.

  • Ability to move quickly from hypothesis to validation.

 
  • Comfortable in fast-moving startup environments.

  • Strong ownership mindset from research through deployment.

  • Excited by ambiguous, unsolved problems.

 
 
  • You treat unsolved problems as opportunities to invent new paradigms.

  • You identify the single experiment that can validate an idea in days, not months.

  • You measure everything and let data drive decisions.

  • You are obsessed with making voice agents sound truly human.

  • You use AI tools aggressively to amplify your own impact and accelerate research cycles.

 
 

Nice to Have

  • Experience with large-scale distributed training.

  • Research publications or open-source contributions in speech or language AI.

  • Background in real-time speech systems or telephony.

  • PhD in ML, AI, or a related field, or equivalent research impact.

 
 

What We Offer

Healthcare, dental, vision, all the good stuff
Meaningful equity in a fast-growing company
Every tool you need to succeed
Beautiful office in Jackson Square, SF with rooftop views
Competitive salary: $160,000 to $250,000

Location & Eligibility

Where is the job
San Francisco, United States
On-site at the office
Who can apply
US

Listing Details

Posted
April 20, 2026
First seen
May 5, 2026
Last seen
May 29, 2026

Posting Health

Days active
23
Repost count
0
Trust Level
26%
Scored at
May 29, 2026

Signal breakdown

freshness · source trust · content trust · employer trust