Ifm Us · 13h ago
Inference Optimization Intern – Performance Modeling
About the Institute of Foundation Models
The Institute of Foundation Models is dedicated to advancing the science and engineering of large-scale AI systems. Our researchers and engineers develop cutting-edge foundation models while pushing the limits of high-performance computing and efficient AI inference. By combining deep expertise in machine learning, systems engineering, and hardware optimization, we build scalable AI solutions that drive scientific discovery and real-world impact.
As part of the team, interns work alongside world-class researchers and performance engineers to optimize the execution of large-scale foundation models on next-generation NVIDIA GPU architectures. This internship provides hands-on experience in low-level GPU performance analysis, kernel optimization, and hardware-aware inference acceleration.
This intensive internship offers a unique opportunity to contribute to the development of a simulator and profiling framework for foundation model inference on NVIDIA GPUs.
Responsibilities include:
Develop analytical performance models for GPU kernels and inference workloads.
Build and validate a simulator to estimate theoretical hardware performance limits.
Compare measured kernel performance against architectural peak throughput.
Identify performance bottlenecks in compute, memory, communication, and scheduling.
Analyze GPU execution using NVIDIA Nsight Systems and Nsight Compute.
Investigate PTX and SASS code generation to understand low-level execution behavior.
Collaborate with researchers and engineers to optimize inference kernels for transformer-based models.
Evaluate utilization of Tensor Cores, memory bandwidth, caches, and instruction pipelines.
Design profiling methodologies for Hopper and Blackwell architectures.
Document findings and provide actionable recommendations for performance improvements.
Qualifications include:
Currently pursuing a degree in Computer Science, Computer Engineering, Electrical Engineering, Artificial Intelligence, High-Performance Computing, or a related quantitative discipline.
Experience with CUDA programming and GPU kernel development.
Understanding of NVIDIA GPU architecture and memory hierarchy.
Familiarity with performance profiling tools such as Nsight Systems and Nsight Compute.
Knowledge of PTX, SASS, and low-level GPU execution.
Experience optimizing CUDA kernels for throughput and latency.
Understanding of roofline analysis, performance modeling, and hardware utilization metrics.
Experience with deep learning frameworks such as PyTorch or TensorFlow.
Strong programming skills in C++, CUDA, and Python.
Performance engineering mindset.
Strong analytical and debugging abilities.
Interest in AI systems, inference optimization, and hardware-software co-design.
Ability to work independently on research and engineering challenges.
Excellent written and verbal communication skills.
Location & Eligibility
Where is the job
Sunnyvale, United States
On-site at the office
Who can apply
US
Listing Details
- Posted: June 24, 2026
- First seen: June 24, 2026
- Last seen: June 25, 2026
Posting Health
- Days active: 0
- Repost count: 0
- Trust Level: 67%
- Scored at: June 24, 2026