Expoint - all jobs in one place

NVIDIA NIM Solution Architect
China, Shanghai 
949887673

06.05.2025
China, Shanghai
China, Guangzhou
China, Beijing
China, Shenzhen
Time type: Full time
Posted: 5 days ago

What you’ll be doing:

  • Drive the implementation and deployment of NVIDIA Inference Microservice (NIM) solutions

  • Use the NVIDIA NIM Factory Pipeline to package optimized models (including LLM, VLM, Retriever, CV, OCR, etc.) into containers that provide standardized API access for on-prem or cloud deployment

  • Refine NIM tools for the community and help community members build performant NIMs of their own

  • Design and implement agentic AI tailored to customer business scenarios using NIMs

  • Deliver technical projects, demos and client support tasks as directed by the Solution Architecture Leadership

  • Provide technical support and guidance to customers, facilitating the adoption and implementation of NVIDIA technologies and products

  • Collaborate with cross-functional teams to enhance and expand our AI solutions portfolio

  • Be an internal champion for NVIDIA software and total solutions in the technical community

  • Be an industry thought leader on integrating NVIDIA technology, especially inference services, into LHA, business partners, and the wider community

  • Assist in supporting the NVAIE team and driving NVAIE business in China

What we need to see:

  • Bachelor's or Master's degree in Computer Science, Artificial Intelligence, or a related field, with 3+ years of working experience

  • Proven experience in deploying and optimizing large language models

  • Proficiency in at least one inference framework (e.g., TensorRT, ONNX Runtime, PyTorch)

  • Strong programming skills in Python or C++

  • Familiarity with mainstream inference engines (e.g., vLLM, SGLang)

  • Experience with DevOps/MLOps tools and practices such as Docker, Git, and CI/CD

  • Excellent problem-solving skills and ability to troubleshoot complex technical issues

  • Demonstrated ability to collaborate effectively across diverse, global teams, adapting communication styles while maintaining clear, constructive professional interactions

Ways to stand out from the crowd:

  • Experience in architectural design for field LLM projects

  • Expertise in model optimization techniques, particularly using TensorRT

  • Knowledge of AI workflow design and implementation, experience with cluster resource management tools, and familiarity with agile development methodologies

  • CUDA optimization experience and extensive experience designing and deploying large-scale HPC and enterprise computing systems