
Nvidia Deep Learning Solutions Architect – Inference Optimization 
United Kingdom, England, Southampton 

UK, Remote
Poland, Remote
Spain, Remote
Switzerland, Remote
Germany, Remote
Time type: Full time
Posted: 5 Days Ago
Job requisition ID: 413501350

What you will be doing:

  • Work directly with key customers to understand their technology and provide the best AI solutions.

  • Perform in-depth analysis and optimization to ensure the best performance on GPU-based systems (in particular Grace/ARM-based systems). This includes supporting the optimization of large-scale inference pipelines.

  • Partner with Engineering, Product, and Sales teams to develop and plan the most suitable solutions for customers. Enable the development and growth of product features through customer feedback and proof-of-concept evaluations.

What we need to see:

  • Excellent verbal, written communication, and technical presentation skills in English.

  • MS/PhD or equivalent experience in Computer Science, Data Science, Electrical/Computer Engineering, Physics, Mathematics, or other Engineering fields.

  • 5+ years of work or research experience with Python, C++, or other software development.

  • Work experience and knowledge of modern NLP, including a good understanding of transformer, state-space, diffusion, and MoE model architectures. This can include expertise in either training or optimization/compression/operation of DNNs.

  • Understanding of key libraries used for NLP/LLM training (such as Megatron-LM, NeMo, DeepSpeed, etc.) and/or deployment (e.g. TensorRT-LLM, vLLM, Triton Inference Server).

  • Enthusiasm for collaborating with various teams and departments, such as Engineering, Product, Sales, and Marketing, together with the ability to thrive in dynamic environments and stay focused amid constant change.

  • Self-starter with a growth-oriented demeanor and a passion for continuous learning and sharing findings across the team.


Ways to Stand Out from the Crowd:

  • Demonstrated experience in running and debugging large-scale distributed deep learning training or inference processes.

  • Experience working with large transformer-based architectures for NLP, CV, ASR, or other domains.

  • Applied NLP technology in production environments.

  • Proficient with DevOps tools including Docker, Kubernetes, and Singularity.

  • Understanding of HPC systems: data center design, high-speed interconnects such as InfiniBand, and cluster storage and scheduling, including related design and/or management experience.