
Nvidia Model-as-a-Service Tech Lead 
United States, California 
312743753

Location: US, CA, Santa Clara
Time type: Full time
Posted on: 6 days ago

What you'll be doing:

  • Serve as the primary, high-impact contributor on complex features. Dedicate significant time to writing production code across the full stack, including UI, APIs, services, and infrastructure.

  • Code Review Leadership & Quality Assurance: Lead the code review process, setting and enforcing rigorous coding standards, performance benchmarks, and architectural integrity so that all merged code is high-quality, maintainable, and robust.

  • Architectural Ownership & Portability: Define and own the long-term technical roadmap, architecture, and design. This includes ensuring that the deployment pipelines and services are platform-agnostic and easily deployable across the broader NVIDIA ecosystem, deliberately avoiding internal infrastructure dependencies.

  • Foundation Model Deployment Strategy: Lead the strategic implementation of web services and efficient batch processing queues to seamlessly integrate and operationalize our world foundation models into the customer-facing platform.

  • System Performance & Reliability: Implement and enforce standards for production-grade performance, monitoring, and fault tolerance across all services. Proactively identify and resolve systemic technical debt and scalability bottlenecks.

  • Deployment & Operational Excellence: Take ultimate ownership of the CI/CD pipelines, container orchestration strategy (Kubernetes/Helm), and operational readiness, ensuring seamless scalability and reliability in production.

  • Team Mentorship & Guidance: Mentor and guide the engineering team on advanced practices in full-stack development, distributed systems design, performance optimization, and clean, portable code architecture.

  • Multi-functional Partnership: Act as the key technical liaison, translating complex requirements from Product Managers, ML Engineers, and Data Scientists into robust, portable, and implementable designs.

What we need to see:

This role requires a proven track record of significant experience and technical mastery:

  • 12+ years of hands-on experience developing and deploying scalable full-stack web services in a cloud environment.

  • Proven Tech Lead or equivalent Senior/Staff level experience with demonstrated ability to define system architecture, mentor engineers, and take end-to-end technical ownership of a major platform while remaining deeply active in coding and code reviews.

  • Expert-level proficiency in designing and scaling distributed microservices architectures using gRPC and REST APIs.

  • Deep expertise in modern frontend frameworks and building highly responsive, data-intensive UIs capable of managing high-frequency data flows.

  • Direct experience designing and deploying containerized applications that use a GPU (e.g., NVIDIA Container Toolkit).

  • Experience with MaaS (Model-as-a-Service) patterns and serving large machine learning models as high-throughput endpoints.

  • Mastery of container orchestration, including Kubernetes and Helm for sophisticated, portable, multi-service production deployments.

  • Proficiency in backend languages such as Python and/or Go, and TypeScript for the frontend.

  • Strong practical experience with cloud infrastructure (AWS S3) and managing complex data storage/access patterns (SQL, key-value stores).

  • Expertise in CI/CD practices (GitLab, Jenkins) with a focus on automation, testing, and improving deployment velocity and stability.

  • Bachelor's degree (B.S.) or equivalent experience in Computer Science, Software Engineering, Electrical Engineering, or a closely related technical field; Master's degree (M.S.) preferred.

Ways to stand out from the crowd:

These skills represent a strong alignment with our specific domain challenges:

  • Experience in data querying platforms such as Apache Druid, ClickHouse, or Elasticsearch.

  • Familiarity with autonomous vehicle simulation environments (e.g., CARLA) and synthetic data generation pipelines using foundation models.

You will also be eligible for equity and .