

NVIDIA has been transforming computer graphics, PC gaming, and accelerated computing for more than 25 years. It’s a unique legacy of innovation that’s fueled by great technology—and amazing people.
What you'll be doing:
Design and implement the DSL and the core compiler of a tile-aware GPU programming model for emerging GPU architectures
Continuously innovate and iterate on the core architecture of the compiler to consistently optimize performance
Investigate next-generation GPU architectures and provide solutions in the DSL and compiler stack
Analyze performance on emerging AI/LLM workloads and integrate with AI/ML frameworks
Master's or PhD, or equivalent experience, in a relevant discipline (CE, CS&E, CS, AI)
4+ years of relevant work experience
Excellent C/C++ programming and software engineering skills; an ACM background is a plus
Good fundamental knowledge of computer architecture
Strong ability to abstract problems and a sound methodology for resolving them
A strong compiler background, including MLIR/TVM/Triton/LLVM, is desired
Good knowledge of GPU architecture and fast kernel programming skills are a plus
Knowledge of LLM algorithms or a specific HPC domain is a plus
Knowledge of multi-GPU distributed communication is a plus
Excellent oral communication in English is a plus

What you will be doing:
Design and implement functional and performance tests for CUDA products, such as drivers and libraries.
Automate CUDA tests, design test plans, and integrate them into the automated testing infrastructure.
Triage test results, root-cause test failures or performance drops, and drive bugs through to a fix.
Develop scripts/tools and optimize workflow to improve efficiency and productivity.
What we need to see:
MS or PhD degree from a leading university in computer science or a related field.
At least 3 years of relevant professional experience.
Excellent QA sense, knowledge, and experience in software testing.
Rich experience in test case development, test automation, and failure analysis.
Proficient programming and debugging skills in C/C++ and Python.
Comprehensive knowledge of Linux and Windows operating systems.
Experience using AI development tools for test plan creation, test case development, and test case automation.
Ways to stand out from the crowd:
Excellent English communication and collaboration skills.
Strong understanding of CUDA, HPC, Gcov, VectorCAST, Coverity.

NVIDIA has been transforming computer graphics, PC gaming, and accelerated computing for more than 25 years. It’s a unique legacy of innovation fueled by great technology—and amazing people.
What you'll be doing:
Establish integrations with NVIDIA Cloud Partners, enabling global developers to easily access GPU-optimized virtual machines.
You will craft and implement IaaS API integrations, collaborating with external engineering teams to ensure reliable, scalable, and consistent connectivity across diverse cloud environments.
Shape integration strategies, develop stateful workflow orchestration, and drive improvements in testing, observability, and automation to ensure high-quality, fault-tolerant solutions.
Be responsible for developing the two-sided marketplace, including the integration of compute providers and crafting discovery and bidding experiences to match supply with demand.
What we need to see:
5+ years of experience in developing software infrastructure for large-scale AI systems, with a proven track record of impact.
Expertise in software engineering with Kubernetes, including cluster operations, operator development, node health monitoring, and GPU resource scheduling.
Familiarity with setting up cloud infrastructure environments (VMaaS, VPCs, RDMA, shared file systems).
Proven ability to handle 3rd party API integrations, including communication with external teams, writing API clients, and improving integration reliability.
Comfort in a fast-paced environment, with the ability to collaborate and debug integrations with external engineering teams.
Strong technical knowledge, including proficiency in a systems programming language (preference for Go) and a solid understanding of software design patterns for stateful workflow orchestration.
BS in Computer Science, Engineering, Physics, Mathematics, or a comparable degree or equivalent experience.
, distributed systems, and API development.

What you’ll be doing:
Develop algorithms to exercise various parts of the GPU pipeline to verify our performance metrics.
Dive deep into the NVIDIA GPU architecture and software stack, and develop new features for NVIDIA GPU performance profiling tools.
Write unit and integration tests to verify the functionality, performance, stability, and resource usage of our products.
What we need to see:
Pursuing a Master's degree in CS/SE.
Proficiency in C/C++ and object-oriented programming.
Proficiency in written and spoken English.
Ways to stand out from the crowd:
OpenGL, GLES, Direct3D, Vulkan, CUDA, OpenCL, console graphics APIs.
Experience with driver development.
Background with software development for embedded systems.

What you'll be doing:
Developing and introducing groundbreaking reinforcement learning algorithms tailored for LLM applications.
Collaborating with a world-class team of engineers and researchers to integrate these algorithms into applied scenarios.
Using your extensive expertise in math and AI to improve the reasoning capabilities of our models.
Engaging in rigorous testing and refinement processes to ensure flawless performance and reliability.
Contributing to our collective goal of delivering industry-leading AI solutions, strictly adhering to NVIDIA's high standards.
What we need to see:
Proficient in C++/Python programming.
3+ years of work experience.
BS or MS (or equivalent experience) in CS, CE, EE, or a related field.
Proven experience in reinforcement learning and its application to large language models.
Strong background in mathematics and AI algorithms, with a focus on reinforcement learning.
Demonstrated history of applying reinforcement learning algorithms in practical scenarios.
Understanding of GPU architecture is a huge plus.
Excellent problem-solving skills and the ability to work collaboratively in a dynamic team environment.
A passion for innovation and a dedication to achieving outstanding results.

What you will be doing:
Develop and optimize the control stack, including locomotion, manipulation, and whole-body control algorithms;
Deploy and evaluate neural network models in physics simulation and on real humanoid hardware;
Design and maintain teleoperation software for controlling humanoid robots with low latency and high precision;
Implement tools and processes for regular robot maintenance, diagnostics, and troubleshooting to ensure system reliability;
Monitor teleoperators at the lab and develop quality assurance workflows to ensure high-quality data collection;
Collaborate with researchers on model training, data processing, and MLOps lifecycle.
What we need to see:
Bachelor’s degree in Computer Science, Robotics, Engineering, or a related field;
3+ years of full-time industry experience in robotics hardware or software full-stack;
Hands-on experience with deploying and debugging neural network models on robotic hardware;
Ability to implement real-time control algorithms, teleoperation stack, and sensor fusion;
Proficiency in languages such as Python and C++, and experience with robotics frameworks (ROS) and physics simulation (Gazebo, MuJoCo, Isaac, etc.);
Experience in maintaining and troubleshooting robotic systems, including mechanical, electrical, and software components;
Physically work on-site on all business days.
Ways to stand out from the crowd:
Master’s or PhD degree in Computer Science, Robotics, Engineering, or a related field;
Experience at humanoid robotics companies on real hardware deployment;
Experience in robot hardware design;
Demonstrated Tech Lead experience, coordinating a team of robotics engineers and driving projects from conception to deployment.

What you'll be doing:
• Research, design, and implement software features that help our customers meet their performance targets and build unique value
• World model and NVIDIA Cosmos related development and bug fixes
• Develop solutions for DNN model acceleration, optimization, and deployment
What we need to see:
• Willingness to be reassigned at any time to take on different tasks and challenges
• Self-motivated, with the drive to make things succeed and an eagerness to learn
• BS/MS degree in Computer Science/EE or a related field
• Proven fundamentals in C++/Python programming and in software design and debugging
• Extensive knowledge of ML/DL techniques for computer vision and autonomous driving
• Belief that experimentation, not argument or arbitrary guesswork, is the only way to find the truth
• Located in Shanghai or willing to work in Shanghai
Ways to stand out from the crowd:
• Experience in DNN development and network acceleration is highly desired
• Familiarity with GPU computing/NVIDIA CUDA/NVIDIA TensorRT
