Expoint – all jobs in one place

NVIDIA Systems Software Engineer - AI Cloud
United States, California 
568937553

26.08.2025
US, CA, Santa Clara
Time type: Full time
Posted on: 3 days ago
job requisition id

What you'll be doing:

  • Evaluate cloud-native, full-stack applications built on a microservices architecture to power AI use cases, drawing on NVIDIA frameworks, SDKs, and microservices.

  • Design and implement agentic workflows with advanced techniques like Retrieval-Augmented Generation (RAG) and the latest AI models.

  • Evaluate user experiences and analyze the technical performance of AI solutions, compiling findings into comprehensive reports. Offer practical suggestions for product improvement to senior executives and engineering management.

  • Engage with various teams across NVIDIA such as product, marketing, hardware, software engineering, and QA to improve NVIDIA's product offerings.

  • Develop developer-focused content, including detailed tutorials and code samples, to demonstrate the latest features in NVIDIA’s tools and libraries.

  • Write technical whitepapers and product briefs, and run technical demos of our products at prominent industry conferences.

What we need to see:

  • A Bachelor’s or Master’s degree in Software Engineering, Computer Science, Computer Engineering, Electrical Engineering, or a related field (or equivalent experience).

  • 3+ years of experience.

  • Proficiency in Python and JavaScript for programming and debugging, with a strong foundation in data structures, algorithms, and software design principles.

  • Basic familiarity with C++ programming and its application in high-performance computing environments.

  • Experience in crafting cloud-native systems optimized for Kubernetes deployment, using inference frameworks such as vLLM and NVIDIA Triton Inference Server.

  • A solid understanding of API design principles for building scalable, production-ready inference systems.

Ways to stand out from the crowd:

  • Advanced knowledge of LLMs, modern AI software architecture, and cloud APIs.

  • Contributions to public-facing technical content and open-source projects.

  • Expertise in deploying LLM inference frameworks like Triton Inference Server, vLLM, or TensorRT, including on Kubernetes or edge devices, to improve performance.

You will also be eligible for equity and .