Nvidia Principal Engineer Distributed Systems - NIM Factory 
United States, California 
394537195

01.12.2024

What you'll be doing:

  • Architect and build a software factory that will take an AI model and create deployable services across Cloud and On-prem Kubernetes environments

  • Design scalable services with resource efficiency to build and deploy distributed applications

  • Define and deliver rapid iterations of the group's technical strategies and roadmaps for a scalable NIM factory system

  • Deliver a highly scalable and reliable factory architecture that operates with very high uptime while remaining performant

  • Collaborate with multiple AI model teams to understand their requirements and build an efficient infrastructure that improves productivity

  • Define metrics and drive improvements based on user feedback

  • Partner with our NIM leadership to deliver a cohesive product to customers

What we need to see:

  • You possess advanced programming skills to build distributed and compute systems, backend services, microservices and cloud technologies

  • Experience in designing and implementing highly scalable Cloud Services with well-defined APIs

  • Ability to work optimally with multi-functional teams, principals and architects, across organizational boundaries

  • Deep technical expertise in Microservices, K8s, Cloud Endpoints, Temporal, Helm, Prometheus, Kafka

  • Passion for building rich microservice applications with automated build and test pipelines

  • Excellent interpersonal skills and the ability to lead multi-functional efforts

  • BS or MS in Computer Science, Computer Engineering or related field (or equivalent experience)

  • 12+ years of proven experience in performant microservice and/or cloud technical leadership roles in an agile software development environment

  • 12+ years of demonstrated ability in building, debugging, performance analysis and optimization

Ways to stand out from the crowd:

  • Highly proficient in building distributed systems, Microservices, Cloud and On-prem deployments, and CI/CD pipelines

  • Prior experience with large-scale, full-stack development

You will also be eligible for equity and benefits.