

What you’ll be doing:
Analyze state-of-the-art DL networks (LLMs, etc.), and identify and prototype performance opportunities to influence the SW and architecture teams for NVIDIA's current and next-gen inference products.
Develop analytical models of state-of-the-art deep learning networks and algorithms to drive innovation in processor and system architecture design for performance and efficiency (a minimal modeling sketch follows this list).
Specify hardware/software configurations and metrics to analyze performance, power, and accuracy in existing and future uniprocessor and multiprocessor configurations.
Collaborate across the company to guide the direction of next-gen deep learning HW/SW by working with architecture, software, and product teams.
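A minimal sketch of the kind of analytical model this role involves: a roofline-style estimate of whether a single LLM decode GEMM is compute- or memory-bound. The peak-FLOP/s and bandwidth figures below are illustrative placeholders, not the specs of any particular GPU.

```python
# Minimal roofline-style estimate for one transformer projection GEMM during
# LLM decode (batch size 1). All hardware numbers are illustrative placeholders.

def gemm_flops(m, n, k):
    # A (m x k) @ B (k x n) costs 2*m*n*k floating-point operations.
    return 2 * m * n * k

def gemm_bytes(m, n, k, bytes_per_elem=2):
    # Bytes moved if A, B, and C each touch DRAM exactly once (fp16).
    return (m * k + k * n + m * n) * bytes_per_elem

def time_estimate(m, n, k, peak_flops, peak_bw):
    compute_time = gemm_flops(m, n, k) / peak_flops
    memory_time = gemm_bytes(m, n, k) / peak_bw
    # The kernel is bound by whichever term takes longer.
    bound = "compute" if compute_time > memory_time else "memory"
    return max(compute_time, memory_time), bound

# Decode step: one token (m=1) through an 8192x8192 projection.
t, bound = time_estimate(m=1, n=8192, k=8192,
                         peak_flops=1e15,   # placeholder peak FLOP/s
                         peak_bw=3e12)      # placeholder DRAM bytes/s
print(f"estimated {t * 1e6:.1f} us, {bound}-bound")
```

At batch size 1 the arithmetic intensity of such a GEMM is very low, which is why decode-phase inference is typically bound by memory bandwidth rather than compute.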
What we need to see:
BS or higher degree in a relevant technical field (CS, EE, CE, Math, etc.).
Strong programming skills in Python, C, C++.
Strong background in computer architecture.
Experience with performance modeling, architecture simulation, profiling, and analysis.
Prior experience with LLMs or generative AI algorithms.
Ways to stand out from the crowd:
GPU computing and parallel programming models such as CUDA and OpenCL.
Experience with the architecture of, or workload analysis on, other deep learning accelerators.
Deep neural network training, inference, and optimization in leading frameworks (e.g., PyTorch, TensorRT-LLM, vLLM).
Open-source AI compilers (OpenAI Triton, MLIR, TVM, XLA, etc.).

What you will be doing:
Develop and optimize the control stack, including locomotion, manipulation, and whole-body control algorithms (a minimal control-loop sketch follows this list);
Deploy and evaluate neural network models in physics simulation and on real humanoid hardware;
Design and maintain teleoperation software for controlling humanoid robots with low latency and high precision;
Implement tools and processes for regular robot maintenance, diagnostics, and troubleshooting to ensure system reliability;
Monitor teleoperators at the lab and develop quality assurance workflows to ensure high-quality data collection;
Collaborate with researchers on model training, data processing, and the MLOps lifecycle.
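A minimal sketch of a fixed-rate control loop of the kind such a stack is built on, assuming a single joint driven by a PD law; the robot I/O functions (read_joint_state, send_joint_torque), the gains, and the rate are hypothetical placeholders.

```python
import time

# Minimal fixed-rate PD control loop sketch. The robot I/O functions below
# (read_joint_state, send_joint_torque) are hypothetical placeholders.

KP, KD = 50.0, 2.0          # proportional and derivative gains (placeholders)
CONTROL_HZ = 500            # target control rate
DT = 1.0 / CONTROL_HZ

def read_joint_state():
    # Placeholder: would return (position, velocity) from the robot's sensors.
    return 0.0, 0.0

def send_joint_torque(torque):
    # Placeholder: would command the joint actuator.
    pass

def control_loop(target_position, duration_s=1.0):
    next_tick = time.perf_counter()
    end = next_tick + duration_s
    while next_tick < end:
        q, qd = read_joint_state()
        # PD law: torque proportional to position error, damped by velocity.
        torque = KP * (target_position - q) - KD * qd
        send_joint_torque(torque)
        # Sleep until the next tick to hold a steady control rate.
        next_tick += DT
        time.sleep(max(0.0, next_tick - time.perf_counter()))

control_loop(target_position=0.5)
```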
What we need to see:
Bachelor’s degree in Computer Science, Robotics, Engineering, or a related field;
3+ years of full-time industry experience in robotics hardware or software full-stack;
Hands-on experience with deploying and debugging neural network models on robotic hardware;
Ability to implement real-time control algorithms, teleoperation stack, and sensor fusion;
Proficiency in languages such as Python and C++, and experience with robotics frameworks (ROS) and physics simulators (Gazebo, MuJoCo, Isaac, etc.);
Experience in maintaining and troubleshooting robotic systems, including mechanical, electrical, and software components;
Ability to work on-site on all business days.
Ways to stand out from the crowd:
Master’s or PhD degree in Computer Science, Robotics, Engineering, or a related field;
Experience at humanoid robotics companies with real-hardware deployment;
Experience in robot hardware design;
Demonstrated Tech Lead experience, coordinating a team of robotics engineers and driving projects from conception to deployment.

What you’ll be doing:
Explore innovative GPU composition and novel system functionalities related to processing and storage.
Work with GPU architecture designers to create directed and random functional test plans that provide good coverage.
Develop and improve the infrastructure and methodology for generating tests (a test-generation sketch follows this list).
Generate, run, and debug tests at scale on various platforms, e.g., functional simulator, full-chip and unit-level RTL, emulator, and silicon.
Build innovative tools to improve efficiency.
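A minimal sketch of seed-driven random test generation of the kind such infrastructure automates; the opcode list, weights, and operand ranges are invented for illustration only.

```python
import random

# Minimal sketch of seeded random test generation.
# The opcode mix and operand ranges are invented placeholders.

OPCODES = ["LDG", "STG", "FADD", "FMUL", "BAR"]
WEIGHTS = [3, 3, 2, 2, 1]   # bias toward memory ops to stress the memory system

def generate_test(seed, length=32, num_regs=16):
    rng = random.Random(seed)          # same seed -> same test, so failures reproduce
    lines = []
    for _ in range(length):
        op = rng.choices(OPCODES, weights=WEIGHTS)[0]
        dst = rng.randrange(num_regs)
        src = rng.randrange(num_regs)
        lines.append(f"{op} R{dst}, R{src}")
    return "\n".join(lines)

# Generate a batch of tests; each seed is logged so any failure can be rerun.
for seed in range(4):
    print(f"--- test seed={seed} ---")
    print(generate_test(seed))
```

Keeping the seed with each generated test keeps failures reproducible across simulator, emulator, and silicon runs.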
What we need to see:
Knowledge of computer architecture, compilers, and assembly language.
Good understanding of GPU concepts and pipelines, in terms of parallel computing and/or the memory system.
Familiarity with Linux.
Strong C++ and Python development skills.
Bachelor's degree in CS or EE; MS, PhD, or equivalent experience is a plus.
Ways to stand out from the crowd:
Knowledge and experience of CUDA programming and debugging.
Experience developing random test-generation systems.
ASIC experience.
Compiler experience.

What you’ll be doing:
Develop highly optimized deep learning kernels for inference
Perform performance optimization, analysis, and tuning (a timing sketch follows this list)
Collaborate with teams across automotive, image understanding, and speech understanding to develop innovative solutions
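A minimal sketch of the measurement side of this work, assuming PyTorch and a CUDA-capable GPU are available: timing a GEMM with CUDA events after a warm-up and converting the result to achieved TFLOP/s.

```python
import torch

# Minimal kernel-timing sketch using CUDA events (assumes PyTorch and a CUDA GPU).
assert torch.cuda.is_available()

a = torch.randn(4096, 4096, device="cuda", dtype=torch.float16)
b = torch.randn(4096, 4096, device="cuda", dtype=torch.float16)

# Warm up so one-time costs (context init, autotuning) don't skew the timing.
for _ in range(10):
    torch.matmul(a, b)
torch.cuda.synchronize()

start = torch.cuda.Event(enable_timing=True)
end = torch.cuda.Event(enable_timing=True)
iters = 100
start.record()
for _ in range(iters):
    torch.matmul(a, b)
end.record()
torch.cuda.synchronize()

ms = start.elapsed_time(end) / iters          # elapsed_time() returns milliseconds
tflops = 2 * 4096**3 / (ms * 1e-3) / 1e12     # 2*M*N*K FLOPs per GEMM
print(f"{ms:.3f} ms/iter, ~{tflops:.1f} TFLOP/s")
```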
What we need to see:
Master's or PhD, or equivalent experience, in a relevant discipline (CE, CS&E, CS, AI)
SW Agile skills helpful
Excellent C/C++ programming and software design skills
Python experience a plus
Performance modeling, profiling, debugging, and code optimization, or architectural knowledge of CPUs and GPUs
GPU programming experience (CUDA or OpenCL) desired

What you’ll be doing:
Develop highly optimized deep learning kernels for inference
Perform performance optimization, analysis, and tuning
Collaborate with teams across automotive, image understanding, and speech understanding to develop innovative solutions
Occasionally travel to conferences and customers for technical consultation and training
What we need to see:
Master's or PhD, or equivalent experience, in a relevant discipline (CE, CS&E, CS, AI)
SW Agile skills helpful
Excellent C/C++ programming and software design skills
Python experience a plus
Performance modeling, profiling, debugging, and code optimization, or architectural knowledge of CPUs and GPUs
GPU programming experience (CUDA or OpenCL) desired
5 years of relevant work experience

What you’ll be doing:
Design and develop the architecture, interfaces, and features of the GPU kernel library
Continuously improve the quality and performance of the library and its GPU kernels
Explore and push the boundaries of technologies such as GPU code generation and kernel fusion (a fusion sketch follows this list)
Contribute to NVIDIA's AI business by collaborating closely with DL product teams as well as kernel development teams
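A minimal sketch of the kind of fusion such a library targets, written in Python with the OpenAI Triton DSL (one of the compilers named elsewhere in these listings): an elementwise add and ReLU fused into one kernel so the intermediate sum never round-trips through DRAM.

```python
import torch
import triton
import triton.language as tl

# Minimal fusion sketch: add + ReLU fused into a single Triton kernel.

@triton.jit
def fused_add_relu_kernel(x_ptr, y_ptr, out_ptr, n_elements, BLOCK_SIZE: tl.constexpr):
    pid = tl.program_id(axis=0)
    offsets = pid * BLOCK_SIZE + tl.arange(0, BLOCK_SIZE)
    mask = offsets < n_elements              # guard the tail of the tensor
    x = tl.load(x_ptr + offsets, mask=mask)
    y = tl.load(y_ptr + offsets, mask=mask)
    out = tl.maximum(x + y, 0.0)             # fused elementwise add + ReLU
    tl.store(out_ptr + offsets, out, mask=mask)

def fused_add_relu(x: torch.Tensor, y: torch.Tensor) -> torch.Tensor:
    out = torch.empty_like(x)
    n = x.numel()
    grid = (triton.cdiv(n, 1024),)           # one program per 1024-element block
    fused_add_relu_kernel[grid](x, y, out, n, BLOCK_SIZE=1024)
    return out

x = torch.randn(1 << 20, device="cuda")
y = torch.randn(1 << 20, device="cuda")
assert torch.allclose(fused_add_relu(x, y), torch.relu(x + y))
```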
What we need to see:
MS, PhD, or equivalent experience in a relevant field (CS, EE, Math)
2+ years of relevant work or research experience
Strong programming skills in C, C++, and Python
Excellent problem-solving skills and ability to learn quickly
Experience with designing software architecture, interfaces, and building testing infrastructures
Good communication skills and a great teammate
Familiar with CUDA programming and GPU architecture
Background in DL fundamentals, frameworks, graph compilers, LLVM, MLIR, etc.
Hands-on experience developing on Linux and Windows platforms, with C++ build tools like CMake and DevOps tools including Docker, Jenkins, Kubernetes, etc.
Track record of mentoring junior engineers and leading a project and a team