Expoint - all jobs in one place

Nvidia Senior HPC Architect 
United States, Texas 
309462414

01.09.2024

We are looking for an outstanding hands-on architect/engineer for a Senior HPC Architect role to support deployment and bring-up of large-scale GPU compute clusters. Be a key player in enabling the most exciting computing hardware and software and contribute to the latest breakthroughs in artificial intelligence and GPU computing. Provide insights on and implement at-scale system administration and tuning mechanisms for large-scale compute runs. You will work with the latest accelerated computing and Deep Learning software and hardware platforms, and with many scientific researchers, developers, and customers to craft improved workflows and develop new, leading differentiated solutions. You will interact with HPC, OS, GPU compute, and systems specialists to architect, develop, and bring up large-scale performance platforms.

What we need to see:

  • 5+ years of experience in accelerated computing for datacenter/HPC solutions.

  • Solid understanding of accelerated computing scheduling and I/O stacks.

  • Experience using and handling HPC-based Enterprise computing architectures.

  • C/C++/Python/Bash programming/scripting experience.

  • Experience working with engineering or academic research communities supporting high performance computing or deep learning.

  • Background with scheduling and resource management systems.

  • Experience with parallel filesystems.

  • Strong verbal and written communication skills.

  • Strong teamwork and communication skills.

  • Ability to multitask effectively in a dynamic environment.

  • Action driven with strong analytical skills.

  • Desire to be involved in multiple diverse and innovative projects.

  • BS (or equivalent experience) in Engineering, Mathematics, Physics, or Computer Science. MS or PhD desirable.

Ways to stand out from the crowd:

  • Deep Learning framework skills.

  • Exposure to using and deploying telemetry and visualization pipelines.

  • Exposure to container technology and Linux performance tools.

You will also be eligible for equity and benefits.