
NVIDIA Senior Test Development Engineer – Datacenter GPU Systems
United States, Texas 
121526767

16.09.2025
US, CA, Santa Clara
US, TX, Austin
US, OR, Hillsboro
Time type: Full time
Posted: 8 days ago

What You'll Be Doing:

  • Innovate & Build – Design and implement novel test plans, tools, and automation frameworks to validate GPU functionality, performance, and reliability in complex datacenter environments.

  • Safeguard Data Integrity – Develop groundbreaking stress tests and methodologies to detect, characterize, and eliminate silent data errors.

  • Build the Future of Hardware – Partner with architecture and silicon design teams to influence system and chip-level features that improve diagnostics, debuggability, and root-cause analysis.

  • Deep Dive Debugging – Analyze test results, investigate complex failures, and drive solutions in close collaboration with design, firmware, and software teams.

  • Lead & Mentor – Provide technical leadership, guide junior engineers, and shape validation strategy across datacenter product lines.

What We Need to See:

  • BS/MS in Electrical Engineering, Computer Engineering, Computer Science, or a related field (or equivalent experience).

  • 8+ years of experience in hardware validation, test development, or datacenter hardware engineering.

  • Expert programming skills in Python and/or C/C++ for automation and tool development.

  • Deep Linux/Unix expertise, including advanced shell scripting.

  • Strong knowledge of server architecture: CPUs, GPUs, PCIe, networking, and storage.

  • A hard-working, proactive approach with a proven ability to own and deliver complex projects.

Ways to Stand Out From the Crowd:

  • Hands-on experience with NVIDIA GPU architecture (Hopper, Ampere) and software stack (CUDA, NCCL).

  • Experience testing high-speed interconnects such as NVLink or InfiniBand.

  • Familiarity with AI/ML or HPC benchmarking and stress-testing tools.

  • Proven track record of identifying and resolving critical bugs in pre-production hardware.

You will also be eligible for equity and .