Senior Computer Vision Engineer
Quick Summary
XPENG is a leading smart technology company at the forefront of innovation, integrating advanced AI and autonomous driving technologies into its vehicles, including electric vehicles (EVs), electric vertical take-off and landing (eVTOL) aircraft, and robotics.
Responsibilities
- Research and implement training, fine-tuning, and inference optimization strategies for multi-modal large models (image-text, image-audio, etc.), continuously improving model performance, efficiency, and generalization.
- Design and optimize computer vision models and algorithms (e.g., detection, classification, segmentation, feature extraction) to support real-world applications.
- Collaborate with cross-functional teams (product, engineering, data) to translate research into scalable, reliable, production-ready solutions.
- Use C++ to implement and optimize models and systems, including deployment, performance tuning, and integration, ensuring low latency and high throughput.
- Stay up to date with advances in computer vision and multi-modal AI, and apply new methods to improve model performance and product impact.
- Contribute to technical discussions, code reviews, and knowledge sharing to improve code quality and engineering practices.
Requirements
- Master’s or Ph.D. in Computer Science or a related field, with strong expertise in computer vision and machine learning.
- 1-3 years of experience in multi-modal large model training, fine-tuning, and optimization (e.g., CLIP, Flamingo, BLIP, or self-developed multi-modal models), with a deep understanding of multi-modal fusion mechanisms.
- Strong foundation in computer vision, including object detection, image classification, feature matching, and image enhancement.
- Strong C++ development skills, with proficiency in STL, multi-threading, memory management, and performance optimization; experience in production-level implementation and deployment is required.
- Familiarity with deep learning frameworks (e.g., PyTorch, TensorFlow) and computer vision libraries (e.g., OpenCV, OpenMMLab).
- Strong problem-solving ability, self-motivation, and a passion for technological innovation; ability to work both independently and in a team.
- Experience in edge device algorithm deployment, publications at top computer vision conferences (CVPR, ICCV, ECCV), or open-source contributions in related fields.
What We Provide
- A fun, supportive, and engaging environment.
- Opportunity to make a significant impact on the transportation revolution by advancing autonomous driving.
- Opportunity to work on cutting-edge technologies with top talent in the field.
- Competitive compensation package.
- Snacks, lunches, and fun activities.
Listing Details
- First seen: March 26, 2026
- Last seen: June 2, 2026
Posting Health
- Days active: 67
- Repost count: 0
- Trust level: 34%
- Scored at: June 2, 2026