Senior AI Inference Engineer - Model Optimization & Deployment
Quick Summary
The Perception team is pioneering the development of a multi-modality foundation model to drive the next generation of autonomous system intelligence.
As a Model Optimization & Deployment Engineer, you will focus on bringing highly efficient, production-ready large-scale models to our on-vehicle stack. We are looking for experts with hands-on experience in compressing, accelerating, and deploying complex models (LLMs, VLMs, or FMs) on power- and thermal-constrained vehicle SoCs. You will optimize ML models, write custom CUDA kernels, and build highly concurrent inference code to ensure real-time, deterministic execution on edge devices.
Architect and implement model conversion and compilation pipelines using TensorRT and TensorRT-LLM for edge deployment.
Perform rigorous parity checking, accuracy recovery, and latency benchmarking between PyTorch frameworks and compiled edge binaries.
Listing Details
- Posted: April 11, 2026
- First seen: April 11, 2026
- Last seen: April 28, 2026
Posting Health
- Days active: 17
- Repost count: 0
- Trust Level: 49%
- Scored at: April 28, 2026

Zoox, a subsidiary of Amazon, designs fully autonomous vehicles focusing on making urban transportation safer and more efficient.