GPU Compute & MLIR Compiler Engineer — AI Workloads

mid

Software Engineering, Compiler Engineer

## Company: Qualcomm India Private Limited

## Job Area: Engineering Group, Engineering Group > Systems Engineering

General Summary:

We are looking for a highly skilled GPU compute, MLIR compiler, and kernel optimization engineer with deep expertise in GPU compute, MLIR-based code generation, and end-to-end performance optimization for AI workloads. In this role, you will design, optimize, and deploy high-performance GPU compute kernels, build and extend MLIR compiler backends, and collaborate closely with ML, runtime, and hardware teams to push the limits of performance on modern GPU architectures.

Key Responsibilities

* Develop and optimize GPU compute kernels targeting OpenCL and Vulkan compute backends for high-throughput AI/ML workloads.
* Design, build, and extend MLIR dialects across multiple abstraction levels (frontend dialects, graph-level IR, tensor IR such as Linalg, Tensor, and TOSA, and runtime/low-level dialects) to enable efficient end-to-end model compilation.
* Implement and maintain MLIR-based compiler passes and transformations, including tiling, fusion, bufferization, vectorization, and lowering pipelines targeting OpenCL and Vulkan GPU backends.
* Conduct profiling and bottleneck analysis of compiled kernels using GPU counters and vendor-specific profilers, and drive performance improvements through compiler-level optimizations.
* Build and maintain GPU runtime infrastructure for both OpenCL and Vulkan, including memory management, pipeline setup, command buffer orchestration, and resource scheduling.
* Develop and extend code generation pipelines, enabling automatic lowering from tensor IR through MLIR to efficient OpenCL and Vulkan GPU kernels.
* Implement performance-critical schedules (tiling, loop fusion, parallelism, and caching strategies) within MLIR-based backends targeting OpenCL and Vulkan runtimes.
* Collaborate with framework teams to optimize end-to-end model lowering for computer vision and LLM workloads using MLIR compilation stacks.
* Design and implement robust compiler and runtime components using modern C/C++, leveraging advanced programming paradigms for high-performance systems.

Required Qualifications

* Strong hands-on experience with the MLIR framework, including authoring and extending custom dialects, writing compiler passes, and building end-to-end lowering pipelines.
* Deep expertise across MLIR abstraction levels:
  * Frontend dialects: ingestion and representation of ML models (e.g., TOSA, StableHLO, ONNX-MLIR)
  * Graph-level IR: high-level operation fusion, shape inference, and graph transformations
  * Tensor IR level: structured op representation using Linalg, Tensor, and Vector dialects; tiling and fusion strategies
  * Runtime/low-level dialects: bufferization, MemRef, SCF, GPU, and LLVM dialects for final code generation
* Strong hands-on experience in OpenCL programming, including kernel development, memory model, work-group/work-item optimization, and OpenCL runtime management.
* Solid understanding of Vulkan compute programming, including descriptor management, compute pipelines, synchronization primitives, and Vulkan runtime internals.
* Strong understanding of GPU architecture, memory hierarchies, and async compute.
* Proficiency in C/C++ for system-level development.
* Experience with kernel profiling and bottleneck analysis on GPU platforms.
* Strong background in machine learning fundamentals, covering both CV and LLM workloads.

Good to Have

* Hands-on experience with IREE (Intermediate Representation Execution Environment) or other compiler frameworks such as TVM, XLA, or LLVM.
* Familiarity with IREE's compiler and runtime architecture, including HAL (Hardware Abstraction Layer), executable compilation, and dispatch mechanisms, particularly its OpenCL and Vulkan HAL backends.
* Experience contributing to open-source MLIR or IREE projects.
* Knowledge of quantization, mixed-precision inference, and model optimization techniques for edge and server GPU targets.
* Exposure to multi-target compilation (CPU, GPU, NPU) using MLIR-based toolchains.
* Familiarity with cross-vendor GPU profiling tools (e.g., ARM Streamline, Qualcomm Snapdragon Profiler, Intel VTune) for OpenCL and Vulkan workloads.

Minimum Qualifications:

* Bachelor's degree in Engineering, Information Systems, Computer Science, or related field and 4+ years of Systems Engineering or related work experience; OR
* Master's degree in Engineering, Information Systems, Computer Science, or related field and 3+ years of Systems Engineering or related work experience; OR
* PhD in Engineering, Information Systems, Computer Science, or related field and 2+ years of Systems Engineering or related work experience.

Applicants: Qualcomm is an equal opportunity employer. If you are an individual with a disability and need an accommodation during the application/hiring process, rest assured that Qualcomm is committed to providing an accessible process. You may e-mail disability-accomodations@qualcomm.com or call Qualcomm's toll-free number found here. Upon request, Qualcomm will provide reasonable accommodations to support individuals with disabilities to be able to participate in the hiring process. Qualcomm is also committed to making our workplace accessible for individuals with disabilities. (Keep in mind that this email address is used to provide reasonable accommodations for individuals with disabilities. We will not respond here to requests for updates on applications or resume inquiries.)
Qualcomm expects its employees to abide by all applicable policies and procedures, including but not limited to security and other requirements regarding protection of Company confidential information and other confidential and/or proprietary information, to the extent those requirements are permissible under applicable law.

To all Staffing and Recruiting Agencies: Our Careers Site is only for individuals seeking a job at Qualcomm. Staffing and recruiting agencies and individuals being represented by an agency are not authorized to use this site or to submit profiles, applications or resumes, and any such submissions will be considered unsolicited. Qualcomm does not accept unsolicited resumes or applications from agencies. Please do not forward resumes to our jobs alias, Qualcomm employees or any other company location. Qualcomm is not responsible for any fees related to unsolicited resumes/applications.

If you would like more information about this role, please contact Qualcomm Careers.