Nvidia Software Engineering Manager, Enterprise AI
Job ID: 531705266
Location: China, Shanghai
Time type: Full time
Posted 3 Days Ago

What you'll be doing:

  • Lead and manage a high-performing team of software engineers and site reliability engineers (SREs), guiding their professional growth and project execution while fostering a culture of innovation and excellence.

  • Oversee factory automation initiatives that streamline the development, deployment, and management of inference microservices across distributed environments.

  • Coordinate the development of infrastructure that ensures consistency, quality, and security for inference workload deployments at scale.

  • Collaborate with cross-functional teams to integrate the infrastructure into CI/CD pipelines, enabling seamless and efficient microservices delivery.

  • Build foundational distributed computing systems supporting the full lifecycle of inference microservices for NVIDIA's AI strategy.

  • Establish and enforce standards for infrastructure and application deployment, eliminating manual, ad-hoc processes through automation.

  • Work closely with security teams to ensure the platform's design and implementation are robust and secure, with a focus on authentication, authorization, and data protection.

  • Drive recruitment and mentorship efforts to build and maintain a top-tier engineering team.

What we need to see:

  • BS, MS, or PhD in Computer Science, Electrical/Computer Engineering, Physics, Mathematics, other Engineering or related fields (or equivalent experience).

  • 8+ years of overall software engineering experience with a focus on distributed systems, cloud infrastructure, or large-scale platform development.

  • 3+ years of experience leading and managing high-performing teams of software engineers and site reliability engineers.

  • Proven expertise with distributed computing technologies including Kubernetes, container orchestration, and experience with multi-cloud or hybrid-cloud environments.

  • Strong understanding of microservices architecture and experience building scalable, fault-tolerant distributed systems.

  • Experience with factory automation and infrastructure-as-code principles, including Temporal and automated deployment pipelines.

  • Excellent communication, leadership, and problem-solving skills with the ability to operate in a fast-paced, collaborative environment.

  • Proven ability to work effectively in remote and cross-functional teams.

Ways to stand out from the crowd:

  • Experience building platforms that support the full lifecycle of AI inference applications and microservices.

  • Deep understanding of inference workloads and their unique infrastructure requirements for low-latency, high-throughput processing.

  • Experience with NVIDIA hardware including GPUs, DPUs, and networking technologies for AI workloads.

  • Background in AI/ML infrastructure and understanding of model serving, inference optimization, and GPU utilization.

  • Experience in a large-scale, high-growth technology company with a proven track record of delivering software products.