NVIDIA Senior Software Engineer AI Platform - Robotics
United States, California 
68046424

Yesterday
US, CA, Santa Clara
Time type: Full time
Posted on: 13 Days Ago


What you’ll be doing:

  • Architect, develop, and deploy backend services supporting NVIDIA GR00T using Kubernetes and cloud-native technologies.

  • Collaborate with ML, simulation, and robotics engineers to deploy scalable, reproducible, and observable multi-node training and inference workflows.

  • Extend and maintain OSMO’s orchestration layers to support heterogeneous compute backends and robotic data pipelines.

  • Develop Helm charts, controllers, CRDs, and service mesh integrations to support secure and fault-tolerant system operation.

  • Implement microservices written in Go or Python that power GR00T task execution, metadata tracking, and artifact delivery.

  • Optimize job scheduling, storage access, and networking across hybrid and multi-cloud Kubernetes environments (e.g., OCI, Azure, on-prem).

  • Build tooling that simplifies deployment, debugging, and scaling of robotics workloads.


What we need to see:

  • BS, MS, or PhD degree in Computer Science, Electrical Engineering, Computer Engineering, or a related field (or equivalent experience).

  • 5+ years of work experience in DevOps, backend, or cloud infrastructure engineering.

  • Hands-on experience building and deploying microservices in Kubernetes-native environments.

  • Proficiency in Golang or Python, especially for backend systems and operators.

  • Experience with Helm or other Kubernetes templating and config management tools.

  • Familiarity with GitOps workflows, observability stacks (e.g., Prometheus, Grafana), and container CI/CD pipelines.

  • Strong understanding of container networking, storage (e.g., PVCs, ephemeral), and scheduling.


Ways to stand out from the crowd:

  • Experience with ML training workflows and distributed job orchestration (e.g., MPI, Ray, Triton Inference Server).

  • Knowledge of robotics frameworks (e.g., ROS2) or simulation tools (e.g., Isaac Sim, Omniverse).

  • Background with GPU cluster management and scheduling across cloud providers.

  • Contributions to open-source Kubernetes projects or custom operators/controllers.

You will also be eligible for equity and benefits.