What you'll be doing:
Analyze state-of-the-art DL networks (LLMs etc.), and identify and prototype performance opportunities to influence the SW and Architecture teams for NVIDIA's current and next-gen inference products.
Develop analytical models of state-of-the-art deep learning networks and algorithms to drive innovation in processor and system architecture design for performance and efficiency.
Specify hardware/software configurations and metrics to analyze performance, power, and accuracy in existing and future uniprocessor and multiprocessor configurations.
Collaborate across the company to guide the direction of next-gen deep learning HW/SW by working with architecture, software, and product teams.
What we need to see:
BS, MS or PhD in relevant discipline (CS, EE, Math, etc.) or equivalent experience.
5+ years’ work experience.
Experience with popular AI models (e.g., LLM and AIGC models).
Familiarity with typical deep learning SW frameworks (e.g., Torch/JAX/TensorFlow/TensorRT).
Knowledge of and experience with hardware architectures for deep learning applications.
Ways to stand out from the crowd:
Background with CUDA and GPU computing systems.
Experience with performance modeling or optimization of DL workloads.