Staff Machine Learning Engineer – Autonomous Driving Model Quantization & Deployment
Quick Summary
XPENG is a leading smart technology company at the forefront of innovation, integrating advanced AI and autonomous driving technologies into its vehicles, including electric vehicles (EVs), electric vertical take-off and landing (eVTOL) aircraft, and robotics.
- Lead Optimization Strategy: Own the end-to-end quantization and optimization roadmap for large-scale multimodal models (Transformers, VLMs).
- Model Compression: Apply and innovate in PTQ (Post-Training Quantization), QAT (Quantization-Aware Training), and pruning techniques to fit VLA models into strict memory and power envelopes.
- Hardware-Software Co-design: Collaborate directly with model researchers to ensure architectures are "deployment-friendly" and with platform teams to influence future hardware requirements.
- Production Excellence: Develop and maintain robust, safety-critical deployment stacks in Modern C++, ensuring 24/7 stability and deterministic performance on the road.
- Proven Track Record: 5-8 years of experience in model deployment, quantization, or high-performance computing (HPC).
- Core Technical Skills: Mastery of Modern C++ and deep experience with CUDA or other hardware acceleration libraries.
- Deep Learning Expertise: Strong familiarity with PyTorch and deep knowledge of inference engines like TensorRT, ONNX Runtime, or TVM.
- Quantization Depth: Hands-on experience with INT8/FP8/INT4 quantization and knowledge of the unique challenges in quantizing Large Language Models (LLMs) or Transformers.
- Platform Knowledge: Solid understanding of computer architecture (Cache, Memory Bandwidth, SIMD) and experience with embedded/edge compute constraints.
- Systems Thinking: Ability to debug complex performance bottlenecks across the entire software stack.
- Experience with VLA/VLM or other Foundation Model deployment.
- Background in autonomous driving, robotics, or real-time safety-critical systems.
- Contributions to open-source inference or compiler projects.
- A fun, supportive, and engaging environment
- Infrastructure and computational resources to support your ML model development and research
- Opportunity to work on cutting-edge technologies with the top talent in the field
- Opportunity to make a significant impact on the transportation revolution by advancing autonomous driving
- Competitive compensation package
- Snacks, lunches, dinners, and fun activities
Listing Details
- First seen: March 26, 2026
- Last seen: May 8, 2026
Posting Health
- Days active: 44
- Repost count: 0
- Trust Level: 34%
- Scored at: May 9, 2026