
Nvidia Senior Software Engineer - Distributed Inference 
United States, Texas 
230991932

31.08.2025
US, CA, Remote
US, TX, Remote
US, CO, Remote
US, WA, Remote
US, AZ, Remote
Time type: Full time
Posted on: Posted 5 Days Ago

What you’ll be doing:

  • Build and maintain distributed model management systems, including Rust-based runtime components, for large-scale AI inference workloads.

  • Implement inference scheduling and deployment solutions on Kubernetes and Slurm, while driving advances in scaling, orchestration, and resource management.

  • Collaborate with infrastructure engineers and researchers to develop scalable APIs, services, and end-to-end inference workflows.

  • Create monitoring, benchmarking, automation, and documentation processes to ensure low-latency, robust, and production-ready inference systems on GPU clusters.

What we need to see:

  • Bachelor’s, Master’s, or PhD in Computer Science, ECE, or related field (or equivalent experience).

  • 6+ years of professional systems software development experience.

  • Strong programming expertise in Rust (C++ and Python are a plus).

  • Deep knowledge of distributed systems, runtime orchestration, and cluster-scale services.

  • Hands-on experience with Kubernetes, container-based microservices, and integration with Slurm.

  • Proven ability to excel in fast-paced R&D environments and collaborate across functions.

Ways to stand out from the crowd:

  • Experience with inference-serving frameworks (e.g., Dynamo Inference Server, TensorRT, ONNX Runtime) and deploying/managing LLM inference pipelines at scale.

  • Contributions to large-scale, low-latency distributed systems (open-source preferred) with proven expertise in high-availability infrastructure.

  • Strong background in GPU inference performance tuning, CUDA-based systems, and operating across cloud-native and hybrid environments (AWS, GCP, Azure).

You will also be eligible for equity and benefits.