NVIDIA Solutions Architect, HPC Systems Engineer
United States, Texas 
865605531

Yesterday
US, CA, Santa Clara
US, GA, Remote
US, TX, Remote
US, TN, Remote
US, CO, Remote
Time type: Full time
Posted on: 4 days ago
What you will be doing:
  • Work with NVIDIA AI Native, Consumer Internet, and IT Services customers on large data center GPU server and networking system deployments as a Solutions Architect Engineer. Guide customer discussions on network design and compute/storage, and support bring-up of server/network/cluster deployments. You will need to visit customer data centers during the bring-up phase.

  • Demonstrate subject matter expertise in advanced GPU & network systems and be a trusted technical advisor to NVIDIA's strategic customers. Bring customer-specific requirements to product teams to guide product roadmap features.

  • Identify new project opportunities for NVIDIA products and technology solutions in data center and artificial intelligence applications. Work closely with the GPU/Network Systems Engineering, Product Management, and Sales teams.

  • Serve as a trusted customer advisor, conducting regular technical meetings on product roadmaps, cluster issue debugging, feature discussions, and introductions to new technology solutions.

  • Build custom product demonstrations and POCs for solutions that address critical business needs of our customers

  • Analyze and debug compute/network configuration and performance issues to deliver performant clusters.

What we need to see:
  • BS/MS/PhD in Electrical/Computer Engineering, Computer Science, Physics, or another engineering field, or equivalent experience.

  • This role is for an individual with the motivation and skills to drive the data center engineering process. The ideal candidate has 5+ years of experience in Systems/Solutions Engineering or similar engineering roles.

  • System-level expertise in CPU/GPU server architecture, NICs, Linux, system software, and kernel drivers

  • Experience with networking switches for Ethernet/InfiniBand and with data center infrastructure (power/cooling)

  • Knowledge of DevOps/MLOps technologies such as Docker/containers, Kubernetes

  • Effective time management and the ability to balance multiple tasks

  • Strong verbal and written communication skills, with the ability to share ideas and code clearly through documents, presentations, etc.

Ways to stand out from the crowd:
  • External customer-facing background

  • Experience with bring-up and deployment of large clusters

  • Systems engineering, coding, and debugging skills including experience with C/C++, Linux kernel and drivers

  • Hands-on experience with NVIDIA GPU systems/SDKs (e.g. CUDA), NVIDIA Networking technologies (e.g. NICs, RoCE, InfiniBand), and/or ARM CPU solutions

  • Familiarity with virtualization technology concepts

You will also be eligible for equity and benefits.