Nvidia Senior Infrastructure Software Engineer 
China, Shanghai 
719818503

20.10.2025
China, Shanghai
China, Beijing
Time type: Full time
Posted: 2 days ago
What you'll be doing:
  • Architect, develop, and maintain Python-based tools and services to efficiently run a performance-focused, multi-tenant Linux cluster, including embedded, desktop, and server systems

  • Work with industry-standard tools (Kubernetes, Slurm, Ansible, GitLab, Artifactory, Jira)

  • Actively support users doing development, functional testing, and performance testing on current and pre-production GPU cluster systems

  • Work with various teams at NVIDIA across different time zones to incorporate and influence the latest tools for operating GPU clusters

  • Collaborate with users and system administrators to seek out ways to improve UX and operational efficiency

  • Become an expert on the entire AI infrastructure stack

What we need to see:
  • BS or higher degree in computer science with 4+ years of relevant experience

  • Strong programming skills in multiple languages, including Python

  • In-depth experience with distributed systems and cluster management stacks (logging, monitoring, scheduling, etc.)

  • Hands-on experience with continuous integration and deployment tools (e.g., GitLab CI)

  • Outstanding ability to understand users, prioritize among many contending requests, and build consensus

  • Passion for “it just works” automation, eliminating repetitive tasks, and enabling team members

  • Deep understanding of Linux system administration and container technologies

  • Proficient English communication skills

Ways to stand out from the crowd:
  • Experience automating operations for bare-metal clusters

  • Experience with GPU computing systems

  • Track record of identifying useful new technologies or methods and incorporating them into software development flows

  • Experience as an active contributor to a software project involving many developers or as a maintainer of open-source software