Develop and refine software solutions that accelerate the LLM software stack (whether in the inference, post-training, or pre-training phase) by harnessing the power of GPU technology.
Collaborate closely with a world-class team of engineers to implement and refine GPU-based algorithms.
Analyze and determine the most effective methods to improve performance, ensuring seamless execution across diverse computing environments.
Engage in both individual and team projects, contributing to NVIDIA's mission of leading the AI revolution.
Work in an empowering and inclusive environment to successfully implement groundbreaking AI solutions.
What we need to see:
Proven experience in software engineering, particularly in GPU programming and LLM inference.
Strong proficiency in programming languages such as Python, C++, and CUDA.
A solid understanding of deep learning frameworks and techniques.
Outstanding problem-solving skills and the ability to work collaboratively in a team setting.
An ambitious approach with a proven track record of taking initiative and delivering results.
A degree in Computer Science, Engineering, or a related field, or equivalent experience.
Experience with PyTorch, RLHF (Reinforcement Learning from Human Feedback), and LLM training frameworks such as FSDP or Megatron-LM is a plus.