Nvidia Senior System Software Engineer - Dynamo Triton Inference Server 
United States, Texas 
171418243

Locations: US, CA, Remote; US, WA, Remote; US, OR, Remote
Time type: Full time
Posted on: Posted 30+ Days Ago

What you'll be doing:

In this role, you will develop open-source software to serve inference of trained AI models running on GPUs. You will balance a variety of objectives: build robust, scalable, high-performance software components to support our distributed inference workloads; work with team leads to prioritize features and capabilities; load-balance asynchronous requests across available resources; optimize prediction throughput under latency constraints; and integrate the latest open-source technology.

What we need to see:

  • Master's or PhD, or equivalent experience

  • 6+ years of experience in Computer Science, Computer Engineering, or a related field

  • Ability to work in a fast-paced, agile team environment

  • Excellent Rust/Python/C++ programming and software design skills, including debugging, performance analysis, and test design.

  • Experience with high-scale distributed systems and ML systems

Ways to stand out from the crowd:

  • Prior work experience improving performance of AI inference systems.

  • Background with deep learning algorithms and frameworks, especially experience with Large Language Models and frameworks such as PyTorch, TensorFlow, TensorRT, and ONNX Runtime.

  • Experience building and deploying cloud services using HTTP REST, gRPC, protobuf, JSON and related technologies.

  • Experience with container technologies such as Docker and container orchestrators such as Kubernetes.

  • Familiarity with the latest AI research and working knowledge of how these systems are efficiently implemented.

You will also be eligible for equity and .