Expoint - all jobs in one place

Where the best experts and the best companies meet

Nvidia Senior HPC Engineer Professional Services 
Japan 
73379464

24.06.2024

What you will be doing:

  • Primary responsibilities will include deploying, managing, and maintaining AI/HPC infrastructure in Linux-based environments for new and existing customers.

  • Act as the domain expert for customers, from planning calls through implementation.

  • Hand over related documentation and perform the knowledge transfers required to support customers as they begin rolling out some of the most sophisticated systems in the world!

  • Provide feedback to internal teams, such as opening bugs, documenting workarounds, and suggesting improvements.

What we need to see:

  • 5+ years of experience providing in-depth support and deployment services and solving problems for hardware and software products.

  • Knowledge and experience with Linux system administration, process management, package management, task scheduling, kernel management, boot procedures/troubleshooting, performance reporting/optimization/logging, network routing/advanced networking (tuning and monitoring).

  • Experience with cluster management technologies (Bright, xCAT, etc.).

  • Minimum of a four-year degree from an accredited university or college in computer science, electrical engineering, or computer engineering, or equivalent experience.

  • Scripting proficiency (Bash, Ansible, etc.).

  • Good interpersonal skills with the ability to maintain and deliver resolutions for customer-blocking issues as they arise.

  • Strong organizational skills and the ability to prioritize and multi-task easily with limited supervision.

  • Experience with schedulers such as SLURM, LSF, and UGE.

Ways to stand out from the crowd:

  • Experience with Ethernet and storage technologies.

  • InfiniBand experience.

  • Experience with GPU-focused hardware/software.

  • Experience with MPI.

  • Automation tooling background (Ansible, Salt, Puppet, etc.).