The R&D of Search Ads aims to build an online advertising ecosystem of users, advertisers, and the search engine.
This is a lead role focused on GPU inference optimization of large and small language models. It requires hands-on software development skills and the experience to lead team efforts by applying model-coach-care practices. We’re looking for someone who has a demonstrated history of solving hard technical problems and is motivated to tackle the hardest problems in building a full end-to-end AI stack. An entrepreneurial approach and the ability to take initiative and move fast are essential.
• Bachelor's degree in computer science or a related technical field AND 5+ years of technical engineering experience coding in languages including, but not limited to, C/C++, CUDA, and ROCm, or equivalent experience
• Practical experience writing new GPU kernels, beyond running GPU workloads on existing library kernels
• Quick learning, good communication (fluent in English) and solid problem-solving skills
• Cross-team collaboration skills and the desire to collaborate in a team of researchers and developers
• Experience in low-level performance analysis and optimization, including proficiency with GPU profiling tools such as NVIDIA Visual Profiler and NVIDIA Nsight Compute, is a plus
• Familiarity with LLM inference optimization; experience developing popular inference frameworks such as TensorRT-LLM, SGLang, or vLLM is a plus