
Nvidia Deep Learning Solutions Architect – Inference Optimization 
United Kingdom, England, Southampton 

UK, Remote
Poland, Remote
Spain, Remote
Switzerland, Remote
Germany, Remote
Time type: Full time
Posted: 5 Days Ago
Job requisition ID: 413501350

What you will be doing:

  • Work directly with key customers to understand their technology and provide the best AI solutions.

  • Perform in-depth analysis and optimization to ensure the best performance on GPU-based systems (in particular Grace/ARM-based systems). This includes supporting the optimization of large-scale inference pipelines.

  • Partner with Engineering, Product, and Sales teams to develop and plan the most suitable solutions for customers. Enable the development and growth of product features through customer feedback and proof-of-concept evaluations.

What we need to see:

  • Excellent verbal, written communication, and technical presentation skills in English.

  • MS/PhD or equivalent experience in Computer Science, Data Science, Electrical/Computer Engineering, Physics, Mathematics, or other Engineering fields.

  • 5+ years of work or research experience with Python, C++, or other software development.

  • Work experience and knowledge of modern NLP, including a good understanding of transformer, state-space, diffusion, and MoE model architectures. This can include expertise in either training or optimization/compression/operation of DNNs.

  • Understanding of key libraries used for NLP/LLM training (such as Megatron-LM, NeMo, DeepSpeed, etc.) and/or deployment (e.g. TensorRT-LLM, vLLM, Triton Inference Server).

  • Enthusiasm for collaborating with various teams and departments, such as Engineering, Product, Sales, and Marketing, together with the ability to thrive in dynamic environments and stay focused amid constant change.

  • Self-starter with a growth-oriented demeanor and a passion for continuous learning and sharing findings across the team.


Ways to Stand Out from the Crowd:

  • Demonstrated experience in running and debugging large-scale distributed deep learning training or inference processes.

  • Experience working with large transformer-based architectures for NLP, CV, ASR, or other domains.

  • Applied NLP technology in production environments.

  • Proficient with DevOps tools including Docker, Kubernetes, and Singularity.

  • Understanding of HPC systems: data center design, high-speed interconnects such as InfiniBand, and cluster storage and scheduling, including related design and/or management experience.