Expoint – all jobs in one place

NVIDIA Solutions Architect, DGX Cloud
United States, Texas 
88924166

Locations: US, CA, Santa Clara; US, Remote
Time type: Full time
Posted: 7 days ago

What you’ll be doing:

  • Work closely with DGX Cloud Partners, become their trusted technical advisor, advocate for their needs, and ensure they are successful in accomplishing their business goals with the platform.

  • Accelerate NVIDIA Cloud Partner onboarding time, cluster manageability and reliability.

  • Scale knowledge, reach, and opportunities by building and educating vertical teams and communities on DGX Cloud and NVIDIA Reference Architectures.

  • Communicate findings gathered from the field to our Reference Architecture teams.

  • Provide technical education and facilitate field product feedback to improve DGX Cloud.

  • Enable partners to participate in the DGX Cloud Ecosystem with the goal of end-user satisfaction and increased sales.

What we need to see:

  • Strong foundational expertise, from a BS, MS, or Ph.D. degree in Engineering, Mathematics, Physics, Computer Science, or Data Science (or equivalent experience).

  • 5+ years of proven experience with one or more Cloud Service Providers (AWS, Azure, GCP, or OCI), NVIDIA Cloud Partners (CoreWeave, Lambda Labs, Crusoe, etc.), and cloud-native architectures and software.

  • Demonstrated experience in technical leadership, strong understanding of NVIDIA technologies, and success in working with customers.

  • Expertise with parallel filesystems (e.g. Lustre, GPFS, BeeGFS, WekaIO) and high-speed interconnects (InfiniBand, Omni-Path, RoCE, and Gig-E).

  • Strong coding and debugging skills, and demonstrated expertise in one or more of the following areas: Machine Learning, Deep Learning, Slurm, Kubernetes, MPI, MLOps, LLMOps, Ansible, Terraform, and other high-performance AI cluster solutions.

  • Proficient in deploying GPU applications with Slurm, Kubernetes, Docker, Helm, and registries.

  • Linux-based configuration management and monitoring solutions, system administration, OS installation, configuration, and troubleshooting

  • Networking technologies (e.g. router, firewall, load balancer, DNS, VPN) for complex infrastructure configuration

Ways to stand out from the crowd:

  • Experience using DGX Cloud and NVIDIA AI Enterprise software, including Base Command Manager, NeMo, and NVIDIA's Inference Microservices.

  • Experience with AI application development and deployment

  • Background in deploying and configuring observability tooling, including Grafana, Prometheus, W&B, Nagios, and Zabbix.

  • Experience with high performance or large-scale computing environments.

You will also be eligible for equity.