
Microsoft Senior Software Engineer -- GPU Inference Optimization
Location: Beijing, China
Job ID: 689181228
Posted: 10.12.2024


Qualifications

Required Qualifications:
• Bachelor's degree in Computer Science or a related technical field AND 4+ years of technical engineering experience coding in languages including, but not limited to, C/C++, CUDA, or ROCm
o OR equivalent experience.
• 3+ years of practical experience working on applications that use GPUs, including experience optimizing their performance
• Practical experience writing new GPU kernels, beyond running GPU workloads with existing library kernels
• Cross-team collaboration skills and the desire to work in a team of researchers and developers
Preferred Qualifications:
• Bachelor's Degree in Computer Science or a related technical field AND 8+ years of technical engineering experience coding in languages including, but not limited to, C/C++, CUDA, or ROCm
o OR Master's Degree in Computer Science or a related technical field AND 2+ years of technical engineering experience coding in languages including, but not limited to, C/C++, CUDA, or ROCm
o OR equivalent experience.
• Experience in low-level performance analysis and optimization, including proficiency with GPU profiling tools such as NVIDIA Visual Profiler and NVIDIA Nsight Compute
• Technical background and solid foundation in software engineering principles and architecture design
• Exposure to deep neural network inference and experience in one or more deep learning frameworks such as PyTorch, TensorFlow, or ONNX Runtime
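
To make the "writing new GPU kernels" requirement above concrete, here is a minimal sketch of the kind of hand-written kernel it refers to: a fused bias-add + ReLU in CUDA, a common inference optimization that replaces two separate library calls with one pass over memory. The kernel name, shapes, and launch configuration are illustrative assumptions, not part of the posting.

```cuda
#include <cuda_runtime.h>

// Fused bias-add + ReLU over a row-major (rows x cols) activation matrix.
// One thread handles one element; the bias vector (length cols) is
// broadcast across rows.
__global__ void bias_relu(const float* __restrict__ in,
                          const float* __restrict__ bias,
                          float* __restrict__ out,
                          int rows, int cols) {
    int idx = blockIdx.x * blockDim.x + threadIdx.x;
    if (idx < rows * cols) {
        float v = in[idx] + bias[idx % cols];  // idx % cols = column index
        out[idx] = v > 0.0f ? v : 0.0f;        // ReLU
    }
}

// Example launch: 256 threads per block, enough blocks to cover the matrix.
// bias_relu<<<(rows * cols + 255) / 256, 256>>>(d_in, d_bias, d_out, rows, cols);
```

Fusing the two element-wise ops avoids writing the intermediate result to global memory and reading it back, which is typically the bottleneck for memory-bound inference kernels.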


Responsibilities

• Develop software in C/C++, Python, and GPU languages such as CUDA, ROCm, or Triton.
• Work with cutting-edge hardware stacks and a fast-moving software stack to deliver best-in-class inference at optimal cost.
• Engage with key partners to understand and implement inference and training optimizations for state-of-the-art LLMs and other models.