
Nvidia Senior Staff Machine Learning Engineer — Enterprise AI 
United States, California 
Job ID: 678728585
Posted: 28.07.2025 (2 days ago)
Location: US, CA, Santa Clara
Time type: Full time

What You’ll Be Doing:

  • Develop Intelligent AI Solutions – Leverage NVIDIA AI technologies and GPUs to build pioneering NLP and Generative AI solutions—such as Retrieval-Augmented Generation (RAG) pipelines and agentic workflows—that solve real-world enterprise and supply-chain problems.

  • Own Key AI Features – Drive the end-to-end development of LLM-powered applications, chatbots, and optimization engines that improve organizational efficiency and resilience.

  • Design Robust ML Architectures – Create robust, production-ready machine-learning and combinatorial-optimization architectures.

  • Collaborate Across NVIDIA – Work closely with product, research, and engineering teams to translate requirements into ML solutions and deliver measurable business outcomes.

  • Mentor & Share Best Practices – Guide junior engineers and peers on ML design patterns, code quality, and experiment methodology.

What We Need to See:

  • Master’s or Ph.D. in Computer Science, Operations Research, Industrial Engineering, or a related field, or equivalent experience.

  • 12+ years of industry experience, including 10+ years designing, building, and deploying machine-learning models and systems in production.

  • Solid understanding of transformers, attention mechanisms, and modern NLP / LLM techniques; experience fine-tuning or prompting large language models.

  • Strong Python plus deep-learning frameworks such as PyTorch or TensorFlow; familiarity with CUDA-accelerated libraries (e.g., TensorRT-LLM) is a plus.

  • Proven track record of taking a significant ML component or feature from concept to production and collaborating effectively with multi-functional teams.

Ways to Stand Out from the Crowd:

  • Agentic AI Mastery – Practical experience with frameworks such as LangChain or LangGraph and a deep understanding of multi-step reasoning and planning.

  • LLM Inference Optimization – Expertise in accelerating LLM inference (e.g., KV caching, quantization) to achieve low latency at scale.

  • End-to-End ML Systems Ownership – A portfolio showing full lifecycle ownership, from data ingestion to monitoring and continuous improvement.

  • Research Impact – Publications or patents that advance NLP or enterprise AI.

You will also be eligible for equity and benefits.