Expoint – all jobs in one place
The point where experts and best companies meet

Deep Learning Performance Architect jobs at Nvidia in India, Bengaluru

Discover your perfect match with Expoint. Search for job opportunities as a Deep Learning Performance Architect in India, Bengaluru and join the network of leading companies in the high-tech industry, like Nvidia. Sign up now and find your dream job with Expoint.
27 jobs found
23.11.2025

Nvidia Senior GPU System Architect India, Karnataka, Bengaluru

Limitless High-tech career opportunities - Expoint
Description:
Location: India, Bengaluru
Time type: Full time
Posted: 4 days ago

What you will be doing:

  • Architect multi-GPU system topologies for scale-up and scale-out configurations, balancing AI throughput, scalability, and resilience.

  • Define, modify and evaluate future architectures for high-speed interconnects such as NVLink and Ethernet co-designed with the GPU memory system.

  • Collaborate with other teams to architect RDMA-capable hardware and define transport layer optimizations for GPU-based large scale AI workload deployments.

  • Use and modify system models, perform simulations and bottleneck analyses to guide design trade-offs.

  • Work with GPU ASIC, compiler, library and software stack teams to enable efficient hardware-software co-design across compute, memory, and communication layers.

  • Contribute to interposer, package, PCB and switch co-design for novel high-density multi-die, multi-package, multi-node rack-scale systems consisting of hundreds of GPUs.

What we need to see:

  • BS/MS/PhD in Electrical Engineering, Computer Engineering, or equivalent area.

  • 8 years or more of relevant experience in system design and/or ASIC/SoC architecture for GPU, CPU or networking products.

  • Deep understanding of communication interconnect protocols such as NVLink, Ethernet, InfiniBand, CXL and PCIe.

  • Experience with RDMA/RoCE or InfiniBand transport offload architectures.

  • Proven ability to architect multi-GPU/multi-CPU topologies, with awareness of bandwidth scaling, NUMA, memory models, coherency and resilience.

  • Experience with hardware-software interaction, drivers and runtimes, and performance tuning for modern distributed computing systems.

  • Strong analytical and system modeling skills (Python, SystemC, or similar).

  • Excellent cross-functional collaboration skills with silicon, packaging, board, and software teams.

Ways to stand out from the crowd:

  • Background in system design for AI and HPC.

  • Experience with NICs or DPU architecture and other transport offload engines.

  • Expertise in chiplet interconnect architectures or multi-node fabrics and protocols for distributed computing.

  • Hands-on experience with interposer or 2.5D/3D package co-design.

15.11.2025

Nvidia Senior Architect GPU SoC Modelling India, Karnataka, Bengaluru

Description:
Location: India, Bengaluru
Time type: Full time
Posted: Today

What you'll be doing:

  • Modeling and analysis of graphics and / or SOC algorithms and features

  • Work in a matrixed environment across the different modelling teams to document, design, and develop tools that analyze, simulate, validate, and verify models.

  • Familiarize yourself with the different models (functional and performance) that are used at Nvidia and help with feature implementation as required.

  • Develop tests, test plans, and testing infrastructure for new architectures/features.

  • Mentor younger engineers and help sustain good coding practices.

  • Learn about newer modelling techniques and frameworks, evaluate the best solution for our needs and work with your manager to drive the change

  • Help develop AI based tools to increase efficiency.

What we need to see:

  • Bachelors Degree (or equivalent experience) in a relevant discipline (Computer Science, Electrical Engineering or Computer Engineering)

  • 8+ years of relevant work experience or MS with 5+ years of experience or PhD with 2+ years of experience

  • Strong programming ability: C++ and C, along with a good understanding of build systems (CMake, make), toolchains (GCC, MSVC) and libraries (STL, Boost)

  • Computer Architecture background with experience in performance modeling with C++ and SystemC preferred

  • Familiarity with Docker, Jenkins, Python, Perl

  • Excellent communication and interpersonal skills and ability to work in a distributed team environment.


10.11.2025

Nvidia Senior Verification Engineer Performance India, Karnataka, Bengaluru

Description:
Location: India, Bengaluru
Time type: Full time
Posted: Today

What you will be doing:

  • Develop test plans and strategies, build simulation environments, and carry out system bring-up, validation, and automation to deliver best-in-class CPUs.

  • Develop and maintain CPU simulator infrastructure and hardware CPU test and performance infrastructure.

  • Analyze and validate CPU and fabric performance, helping to understand current CPU products and guide the development of future ones.

  • Define and develop the toolchain and workflows that enable full-system performance alignment.

  • Perform silicon-based competitive analysis of NVIDIA CPUs.

What we need to see:

  • Master's or Bachelor's degree in EE/CS or equivalent experience

  • 5+ years of experience preferably in the areas of CPU / SOC Performance Verification and Analysis

  • Strong understanding of computer system architecture and operating system fundamentals.

  • Hands-on experience with HDLs such as Verilog/SystemVerilog.

  • Knowledge of verification methodologies and tools for IP and SoC level verification.

  • Experience with SystemVerilog, C/C++, and Python, and with relevant frameworks.

  • Background in silicon debug.

Ways to stand out from the crowd:

  • Detailed knowledge of the ARM and/or x86 architecture.

  • Prior experience with performance analysis of CPUs.

  • Experience with analysis and characterization of CPU workloads.


09.11.2025

Nvidia DGX Cloud Performance Engineer India, Karnataka, Bengaluru

Description:
Locations: India, Bengaluru; India, Hyderabad; India, Pune
Time type: Full time
Posted: 16 days ago

What you will be doing:

  • Develop benchmarks and end-to-end customer applications running at scale, instrumented for performance measurement, tracking, and sampling, to measure and optimize the performance of important applications and services.

  • Construct carefully designed experiments to analyze, study, and develop critical insights into performance bottlenecks and dependencies from an end-to-end perspective.

  • Develop ideas on how to improve end-to-end system performance and usability by driving changes in the HW or SW (or both).

  • Collaborate with AI researchers, developers, and application service providers to understand internal developer and external customer pain points and requirements, project future needs, and share best practices.

  • Develop the necessary modeling framework and the TCO (total cost of ownership) analysis to enable efficient exploration and sweep of the architecture and design space.

  • Develop the methodology needed to drive the engineering analysis that informs the architecture, design, and roadmap of DGX Cloud.

What we need to see:

  • Expertise in working with large-scale parallel and distributed accelerator-based systems

  • Expertise in optimizing performance and AI workloads on large-scale systems

  • Experience with performance modeling and benchmarking at scale

  • Strong background in Computer Architecture, Networking, Storage systems, Accelerators

  • Familiarity with popular AI frameworks (PyTorch, TensorFlow, JAX, Megatron-LM, TensorRT-LLM, vLLM, among others)

  • Experience with AI/ML models and workloads, in particular LLMs, as well as an understanding of DNNs and their use in emerging AI/ML applications and services

  • Bachelors/Masters in Engineering or equivalent experience (preferably Electrical Engineering, Computer Engineering, or Computer Science)

  • 10 years of experience in the above areas

  • Proficiency in Python, C/C++

  • Expertise with at least one public CSP infrastructure (GCP, AWS, Azure, OCI, …)

Ways to stand out from the crowd:

  • PhD in the relevant areas

  • Very high intellectual curiosity; confidence to dig in as needed; not afraid of confronting complexity; able to pick up new areas quickly

  • Proficiency in CUDA, XLA

  • Excellent interpersonal skills


08.11.2025

Nvidia Architect - Performance Verification Analysis India, Karnataka, Bengaluru

Description:
Location: India, Bengaluru
Time type: Full time
Posted: 2 days ago

What you'll be doing:

  • Performance analysis/bottleneck analysis of complex, high-performance GPUs and System-on-Chips (SoCs).

  • Work on hardware models at different levels of abstraction, including performance models, RTL test benches and emulators, to find performance bottlenecks in the system.

  • Work closely with the architecture and design teams to explore architecture trade-offs related to system performance, area, and power consumption.

  • Understand key performance use cases for the product. Develop workloads and test suites targeting graphics, machine learning, automotive, video, and computer vision applications running on these products.

  • Drive methodologies for improving turnaround time, finding representative data-sets and enabling performance analysis early in the product development cycle.

  • Develop required infrastructure including performance simulators, testbench components and analysis tools.

What we need to see:

  • BE/BTech or MS/MTech, or equivalent experience in relevant area, PhD is a plus.

  • 3+ years of relevant experience dealing with system level architecture and performance issues.

  • Strong understanding of System-on-Chip (SoC) architecture, graphics pipeline, CPU architecture, memory subsystem architecture and Network-on-Chip (NoC)/Interconnect architecture.

  • Solid programming (C/C++) and scripting (Bash/Perl/Python) skills. Exposure to Verilog/SystemVerilog, SystemC/TLM is a strong plus.

  • Strong debugging and analysis (including data and statistical analysis) skills, including use of RTL dumps to debug failures.

  • Exposure to performance simulators, cycle accurate/approximate models or emulators for pre-silicon performance analysis is a plus.

  • Excellent communication and organization skills.

  • Ability to work in a global team environment.

Ways to stand out from the crowd:

  • Strong background in System Level Performance aspects for Graphics and High Performance Computing.

  • Exposure to GPU application programming interfaces like CUDA, OpenGL, DirectX.

  • Expertise in data analysis and visualization.


26.10.2025

Nvidia Senior DGX AI Cloud Performance Analysis Tools Engineer India, Karnataka, Bengaluru

Description:
Locations: India, Bengaluru; India, Hyderabad; India, Pune
Time type: Full time
Posted: 5 days ago

What you'll be doing:

  • Develop AI performance tools for large scale AI systems providing real time insight into applications performance and system bottlenecks.

  • Conduct in-depth hardware-software performance studies

  • Define performance and efficiency evaluation methodologies

  • Automate performance data analysis and visualization to convert profiling data into actionable optimizations

  • Support deep learning software engineers and GPU architects in their performance analysis efforts

  • Work with various teams at NVIDIA to incorporate and influence the latest technologies for GPU performance analysis

What we need to see:

  • 8+ years of experience in software infrastructure and tools

  • BS or higher degree in computer science or similar (or equivalent experience)

  • Adept programming skills in multiple languages including C++ and Python

  • Solid foundation in operating systems and computer architecture

  • Outstanding ability to understand users, prioritize among many contending requests, and build consensus

  • Passion for “it just works” automation, eliminating repetitive tasks, and enabling team members

Ways to stand out from the crowd:

  • Experience working with large-scale AI clusters

  • Experience with CUDA and GPU computing systems

  • Hands-on experience with deep learning frameworks (TensorFlow, PyTorch, JAX/XLA etc.)

  • Deep understanding of the software performance analysis and optimization process
