Expoint – all jobs in one place

NVIDIA Manager, LLM Accuracy Evaluation
Switzerland, Vaud 
97068182

16.09.2025
Switzerland, Zurich
Poland, Remote
Netherlands, Remote
UK, Remote
Spain, Remote
Time type: Full time
Posted on: 6 days ago

What you’ll be doing:

  • Lead and mentor a team of highly skilled engineers, fostering their growth while solving the most ambitious challenges in AI evaluation.

  • Drive the accuracy evaluation of flagship AI models, coordinating efforts across internal teams and external partners to ensure timely, high-quality results.

  • Collaborate with stakeholders across NVIDIA to balance speed of delivery with rigorous engineering practices.

  • Develop and implement new methodologies for evaluating LLMs, multimodal systems, and agent frameworks at scale.

  • Build a culture of innovation and excellence, encouraging continuous improvement and adoption of best practices in AI evaluation and deployment.

What we need to see:

  • BS, MS, or PhD in Computer Science, AI, Applied Math, or related field, or equivalent experience, with 7+ years of industry experience, including 3+ years in leadership.

  • Proven success leading engineering teams and delivering complex AI/deep learning projects.

  • Deep understanding of modern AI technologies—LLMs, multimodal models, retrieval-augmented generation, and agent frameworks—with the ability to guide technical strategy.

  • Outstanding communication skills and the ability to partner effectively across organizations and with external collaborators.

  • Demonstrated ability to mentor and grow engineering talent, fostering collaboration and technical excellence.


Ways to stand out from the crowd:

  • Experience managing teams that shipped AI products or services using LLMs, RAG, or multimodal/agent models.

  • Hands-on expertise in deploying and optimizing AI models in production, with platforms such as TensorRT, Triton, or ONNX.

  • Strong background in MLOps/DevOps, with a focus on scaling deep learning workloads.

  • Proven ability to manage large-scale AI evaluations and training workloads on HPC clusters, ensuring efficiency and reproducibility.

  • Deep understanding of cloud infrastructure, containerization (Docker), and orchestration (Kubernetes), with an emphasis on scalability and reliability.