Expoint – all jobs in one place

Nvidia Senior System Software Engineer - Dynamo Triton Inference Server 
United States, Texas 
525037177

02.07.2025
US, CA, Remote
US, WA, Remote
US, CA, Santa Clara
Time type: Full time
Posted: 25 days ago

What you'll be doing:

In this role, you will develop open-source software to serve inference of trained AI models running on GPUs. You will balance a variety of objectives: build robust, scalable, high-performance software components to support our distributed inference workloads; work with team leads to prioritize features and capabilities; load-balance asynchronous requests across available resources; optimize prediction throughput under latency constraints; and integrate the latest open-source technology.

What we need to see:

  • Master's or PhD, or equivalent experience

  • 8+ years of experience in Computer Science, Computer Engineering, or a related field

  • Ability to work in a fast-paced, agile team environment

  • Excellent Rust/Python/C++ programming and software design skills, including debugging, performance analysis, and test design.

  • Experience with high-scale distributed systems and ML systems.

Ways to stand out from the crowd:

  • Prior work experience improving performance of AI inference systems.

  • Background with deep learning algorithms and frameworks, especially experience with Large Language Models and frameworks such as PyTorch, TensorRT, and ONNX Runtime.

  • Experience building and deploying cloud services using HTTP REST, gRPC, protobuf, JSON and related technologies.

  • Familiarity with the latest AI research and working knowledge of how these systems are efficiently implemented.

You will also be eligible for equity.