Expoint - all jobs in one place


NVIDIA Senior Product Manager NIM – Factory Observability Automation
United States, California 
825976124

01.09.2024

We are looking for a Senior Product Manager to help scale the NVIDIA NIM initiative across the company. You will equip researchers and engineers with the infrastructure, tools, services, and workflows that shorten time-to-market for new NIMs and guarantee quality and consistency of development processes across 10+ vertical teams (LLMs, VLMs, Speech, Computer Vision, Healthcare, Genomics, Weather Forecasting, Digital Humans, etc.). You will search for bottlenecks in the NIM production processes and automate away inefficiencies by developing new NIM Factory capabilities (e.g. Container Building, NIM Validation, Cloud Readiness Testing, Artifact Publishing). You will also make NIM Factory operations more effective and transparent with observability dashboards, and extend the Factory to provide confidential and secure processing of partner models across the NIM lifecycle.

What you'll be doing:

  • Define and drive the Factory Automation vision, metrics, and execution strategy. Design dashboards and reporting metrics for NIM Factory operations.

  • Identify bottlenecks and inefficiencies in the existing NIM Factory operational processes.

  • Define product personas. Collect and prioritize requirements from a diverse pool of external model providers and internal teams working on various AI verticals. The sheer number of stakeholders makes finding a scalable solution challenging.

  • Drive product adoption, analyze usage of individual Factory capabilities and their combinations, and improve the Factory based on customer feedback gathered through log analytics, interviews, surveys, and NPS, among other channels.

  • Perform computing capacity forecasting and HW bring-up process for Factory needs.

  • Collaborate with the UI/UX, Engineering, and Design teams on delightful CLI, SDK, API, and Web experiences to expose Factory capabilities and visualize its operations.

  • Coordinate with TPMs to align roadmaps, respond to market trends, build new NIM Factory capabilities, and extend existing ones.

  • Author product requirement documents (PRDs) and software design docs (SDDs). Design for ease of use, extensibility, and modularity. Focus on scalability and tool adaptability to a diverse set of verticals and use cases.

What we need to see:

  • MBA or BS/MS in Computer Science, Electrical Engineering, Operations Research or equivalent experience.

  • 12+ years of experience in product management at a technology company, as a co-founder or in a related technical role at a startup, or equivalent experience.

  • 3+ years of experience working on sophisticated software build systems, developer platforms (e.g. DevOps, MLOps), and infrastructure.

  • 2+ years of experience shipping AI/ML solutions for enterprises.

  • Teamwork and influencing skills to navigate a highly matrixed environment. At NVIDIA, your entire company is on your team!

  • Positive energy, attention to detail, drive for high performance and personal growth, and deep care for customers to build products people love.

  • Pragmatic and data-driven project management skills to navigate the software development lifecycle, including prioritization of diverse customer requirements and product releases while delivering high quality software on time and with a lean team.

  • Strong time management skills and personal flexibility – very organized, with the ability to multitask, prioritize, and switch context between strategy and focused execution.

Ways to stand out from the crowd:

  • 3+ years of experience managing or developing a complex ERP installation.

  • 3+ years of experience driving operations for a complex supply chain or factory.

  • Solid understanding of MLOps, Cloud Computing, and software automation technologies, including Docker, K8s, GitHub/GitLab, Ansible, Redash, Grafana, CI/CD, Jenkins, CLI, Shell scripting, and workflow systems (e.g. Kubeflow, Airflow), among others.

  • Key role in the development of a Cloud/SRE/MLOps enterprise platform and understanding of the solution stack from infra to services and everything in-between.

  • PhD in Computer Science, Operations Research, Economics or an equivalent.

You will also be eligible for equity and .