Expoint - all jobs in one place


Nvidia Senior Architect Large Scale Distributed Training 
Israel, North District 
258886560

24.06.2024

to be part of the technology seeding phase. In this position, you will invent, run proof of concepts,

What you'll be doing:

  • Learn our architecture with a focus on the technology that we drive

  • Optimize AI/ML model training time at large scale

  • Code and build proof-of-concept prototypes.

  • Design and define protocols and APIs for leveraging our technology in a datacenter

  • Research and evaluate algorithms currently used in related applications

  • Participate in defining hardware and system features, and assist software and hardware groups in enabling new technologies

What we need to see:

  • M.Sc. or equivalent experience in Electrical Engineering or Computer Science from a leading university

  • 3-5 years of proven industry experience, specifically in SW engineering and distributed AI system training

  • Familiarity with networking concepts, terms, and software stack

  • Passion for problem-solving and algorithms research and development

  • Background in distributed AI/ML model training on GPUs


Ways to stand out from the crowd:

  • Background in data center architecture

  • Experience with a Collective Communications Library such as NCCL

  • Good understanding of OS, driver, and performance aspects of a system

  • Background in network synchronization protocols such as IEEE 1588 PTP

  • Good command of Python, C/C++

in the job application or interview process, to perform essential job functions, and to receive other benefits and privileges of employment. Please contact us to request