Expoint - all jobs in one place

Finding a high-tech job at the best companies has never been easier


NVIDIA Senior DL Algorithms Engineer - Inference Optimizations
United States, Texas 
634788816

01.12.2024

What you will be doing:

  • Deliver hyper-optimized recipes for LLM inference as part of NVIDIA Inference Microservices (NIMs).

  • Analyze, validate and debug performance and accuracy characteristics of optimized models.

  • Benchmark state-of-the-art offerings in LLM inference and perform competitive analysis for the NVIDIA SW/HW stack.

  • Develop software, tooling and processes across multiple layers of the stack to streamline and scale the delivery of hundreds of optimized LLM models.

  • Collaborate heavily with other SW/HW co-design teams to enable the creation of the next generation of AI-powered services.

What we want to see:

  • PhD in CS, EE or CSEE or equivalent experience.

  • 5+ years of experience.

  • Experience with delivering results under tight timelines and rapidly changing requirements.

  • Strong background in deep learning and neural networks, in particular inference.

  • Deep understanding of computer architecture, and familiarity with the fundamentals of GPU architecture.

  • Programming skills in C++ and Python.

Ways to stand out from the crowd:

  • Strong fundamentals in algorithms.

  • Experience with, and a good understanding of, LLMs and/or VLMs.

  • Proven experience with processor and system-level performance modelling.

  • Experience with MLOps and DLOps, and with building CI/CD pipelines.

  • GPU programming experience (CUDA or OpenCL) is a strong plus but not required.

You will also be eligible for equity.