Machine Learning Software Platform Architect jobs at Nvidia in China, Shanghai

time type: Full time

posted on: 3 Days Ago

What you’ll be doing:

Develop and test deep learning models for object detection, tracking, and behavior prediction.
Preprocess and analyze sensor data from LiDAR, radar, and camera systems.
Collaborate with cross-functional teams on algorithm integration and performance evaluation.
Support data labeling, simulation, and validation pipelines.
Document model performance and contribute to internal reports.

What we need to see:

Enrolled in a Bachelor’s, Master’s, or PhD program in Computer Science, Electrical Engineering, Robotics, or a related field.
Strong knowledge of Python and libraries such as PyTorch or TensorFlow.
Familiarity with computer vision, deep learning, and reinforcement learning concepts.
Experience with autonomous systems, robotics, or simulation tools is a plus.
Internship duration: 10–12 weeks (with potential for extension)

These jobs might be a good fit

Nvidia Machine Learning Engineer Intern - China, Shanghai

Nvidia Machine Learning Intern - China, Hong Kong, Hong Kong Island

Apple Machine Learning Engineer Intern - Shanghai China, Shanghai

24.11.2025

Nvidia Deep Learning Performance Architect - New College Grad - China, Shanghai

China, Beijing

time type: Full time

posted on: 4 Days Ago

What you’ll be doing:

Analyze state-of-the-art DL networks (LLMs, etc.), and identify and prototype performance opportunities to influence the SW and Architecture teams for NVIDIA's current and next-gen inference products.
Develop analytical models of state-of-the-art deep learning networks and algorithms to drive innovation in processor and system architecture design for performance and efficiency.
Specify hardware/software configurations and metrics to analyze performance, power, and accuracy in existing and future uni-processor and multiprocessor configurations.
Collaborate across the company to guide the direction of next-gen deep learning HW/SW by working with architecture, software, and product teams.

What we need to see:

BS or higher degree in a relevant technical field (CS, EE, CE, Math, etc.).
Strong programming skills in Python, C, C++.
Strong background in computer architecture.
Experience with performance modeling, architecture simulation, profiling, and analysis.
Prior experience with LLM or generative AI algorithms.

Ways to stand out from the crowd:

GPU Computing and parallel programming models such as CUDA and OpenCL.
Architecture of or workload analysis on other deep learning accelerators.
Deep neural network training, inference, and optimization in leading frameworks (e.g. PyTorch, TensorRT-LLM, vLLM, etc.).
Open-source AI compilers (OpenAI Triton, MLIR, TVM, XLA, etc.).

24.11.2025

Nvidia Performance Engineer Intern Deep Learning HPC - China, Shanghai

time type: Full time

posted on: 4 Days Ago

You will be part of the global Performance Lab team, improving our capacity to benchmark state-of-the-art datacenter applications and products expertly and accurately. We also develop infrastructure and solutions that enhance the team's ability to gather data, through automation and by designing efficient processes for testing a wide variety of applications and hardware. The data that we collect drives marketing and sales collateral as well as engineering studies for future products. You will have the opportunity to work with multi-functional teams in a dynamic environment where multiple projects will be active at once and priorities may shift frequently.

What you’ll be doing:

Benchmark, profile, and analyze the performance of AI workloads specifically tailored for large-scale LLM training and inference, as well as High-Performance Computing (HPC) on NVIDIA supercomputers and distributed systems.
Aggregate and produce written reports with the testing data for internal sales, marketing, SW, and HW teams.
Develop Python scripts to automate the testing of various applications.
Collaborate with internal teams to debug and improve performance issues.
Assist with the development of tools and processes that improve our ability to perform automated testing.
Set up and configure systems with appropriate hardware and software to run benchmarks.

What we need to see:

Currently pursuing a bachelor's degree (or higher) in Computer Science, Electrical Engineering, or a related field.
Experienced in programming and debugging with scripting languages such as Python or Unix shell.
Strong data analysis skills and the ability to summarize findings in a written report.
Hands-on experience with Linux-based systems. Familiarity with a container platform such as Docker or Singularity. Experience compiling and running software from source code.
Good English verbal and written skills to improve collaboration with coworkers.
Ability to learn quickly and independently.

Ways to stand out from the crowd:

Experience with CI/CD pipelines and modern DevOps practices. Familiarity with cloud provisioning and scheduling tools (Kubernetes, SLURM).
Curiosity about GPUs, TPUs, cloud, and performance benchmarking.
Familiarity with ML/DL techniques, algorithms, and frameworks like TensorFlow or PyTorch. Experience deploying AI models for inference and launching training jobs.
Background in system-level problem solving.

22.11.2025

Nvidia Graphics Tools Software Engineering Intern - China, Shanghai

time type: Full time

posted on: 2 Days Ago

What you'll be doing:

As a valued member of the team, you will be involved in the technical design and implementation of numerous features working in an agile environment. In this role you can expect to:

Create developer tools features for NVIDIA GPUs that enable developers to quickly iterate on optimizations to build fast graphics applications.
Write fast, effective, maintainable, reliable, and well-documented code.
Effectively estimate and prioritize tasks in order to build a realistic delivery schedule.
Provide peer reviews to other engineers including feedback on performance, scalability and correctness.
Drive technology discussions and provide valuable feedback about the architecture as a domain expert.
Document requirements and designs, and review documents with stakeholders.
Demonstrate growth in technical and non-technical abilities.
Meet with the QA Department to develop a test plan for new features.

What we need to see:

Pursuing a BS or MS degree in Electrical Engineering, Computer Engineering, or Computer Science.
Excellent C++ programming skills and ability to articulate key aspects of Object-Oriented Programming.
Proficient in at least one graphics programming API such as Direct3D, OpenGL, or Vulkan.
Able to work effectively with a team of engineers in a fast-paced and dynamic environment.
Excellent written and verbal communication skills.
Able to estimate effectively to ensure delivery of software on time.

Ways to stand out from the crowd:

Knowledge of 3D Graphics Algorithms and GPU Architectures.
Strong grasp of heterogeneous computing and multithreading, and a deep understanding of streaming multiprocessors, warp scheduling, etc.
Experience with GPU low-level performance tuning/optimization, including profiling and debugging.
Solid understanding of User Experience (UX) design, GUI development and the Qt framework is a huge plus.

22.11.2025

Nvidia Software Engineering Intern CUDA Test Development - China, Shanghai

time type: Full time

posted on: 3 Days Ago

What you’ll be doing:

Design and implement API tests for the CUDA driver and library.
Automate CUDA tests, design test plans, and enable them in the automated testing infrastructure.
Triage test results, isolate test failures and improve test coverage.

What we need to see:

Able to work 4 days a week for at least 1 year.
Pursuing an MS or PhD degree from a leading university in computer science.
Familiar with programming and debugging in C/C++ and Python.
Interested in test case development, test automation, and failure analysis.
Experience using AI development tools to improve quality and productivity across the end-to-end QA workflow.
Good QA sense, knowledge and experience in software testing.

Ways to stand out from the crowd:

Strong English communication and collaboration skills.
Familiarity with parallel programming, ideally CUDA C/C++, is a plus.
Experience with Bullseye, Gcov, or other developer tools is a plus.

16.11.2025

Nvidia Senior Deep Learning Compiler Engineer - CUDA - China, Shanghai

China, Beijing

time type: Full time

posted on: 5 Days Ago

NVIDIA has been transforming computer graphics, PC gaming, and accelerated computing for more than 25 years. It’s a unique legacy of innovation that’s fueled by great technology—and amazing people.

What you'll be doing:

Design and implement the DSL and the core compiler of a tile-aware GPU programming model for emerging GPU architectures
Continuously innovate and iterate on the core architecture of the compiler to consistently optimize performance
Investigate next-generation GPU architectures and provide solutions in the DSL and compiler stack
Analyze performance on emerging AI/LLM workloads and integrate with AI/ML frameworks

What we need to see:

Master's or PhD, or equivalent experience, in a relevant discipline (CE, CS&E, CS, AI)
4+ years of relevant work experience
Excellent C/C++ programming and software engineering skills, ACM background is a plus
Good fundamental knowledge of computer architecture
Strong ability to abstract problems and apply sound methodology to solve them
Strong compiler background, including MLIR/TVM/Triton/LLVM, is desired
Good knowledge of GPU architecture and fast kernel programming skills are a plus
Knowledge of LLM algorithms or a certain HPC domain is a plus
Knowledge of multi-GPU distributed communication is a plus
Excellent oral communication in English is a plus

16.11.2025

Nvidia Senior CUDA Test Development Software Engineer SDET - China, Shanghai

time type: Full time

posted on: 6 Days Ago

What you will be doing:

Design and implement functional/performance tests for CUDA products, such as the driver and library.
Automate CUDA tests, design test plans, and integrate them into the automated testing infrastructure.
Triage test results, root-cause test failures or performance drops, and drive bugs through to a fix.
Develop scripts/tools and optimize workflow to improve efficiency and productivity.

What we need to see:

MS or PhD degree from a leading university in computer science or a related field.
At least 3 years of relevant professional experience.
Excellent QA sense, knowledge, and experience in software testing.
Extensive experience in test case development, test automation, and failure analysis.
Proficient programming and debugging skills in C/C++ and Python.
Comprehensive knowledge of Linux and Windows operating systems.
Experience using AI development tools for test plan creation, test case development, and test case automation.

Ways to stand out from the crowd:

Excellent English communication and collaboration skills.
Strong understanding of CUDA, HPC, Gcov, VectorCAST, Coverity.
