Role and Responsibilities
We are seeking a highly skilled GPU Performance Architect (PPA) to analyze, verify, and optimize the performance of our GPU system by identifying bottlenecks and by prototyping and proposing solutions that improve end-to-end performance. You will be responsible for executing and analyzing the performance of our GPU, including, but not limited to, shader-level analysis, pipeline optimization, system-level performance characterization, verification, and prototyping. You will use programming languages such as C/C++ and Python and leverage tools such as waveform-level debuggers (e.g., Synopsys Verdi) and cycle-approximate performance models.
- You have a strong background in GPU architecture, performance analysis, and verification, with experience in waveform-level debugging and RTL debugging.
- You have a good understanding of the entire GPU system, with a focus on performance analysis, verification, and optimization.
- You develop tools to analyze performance data, identify bottlenecks, and propose solutions to improve end-to-end performance of the GPU system
- You prototype optimizations using performance models and RTL, analyze the results, and propose GPU hardware optimizations.
- You debug RTL code, including simulation, waveform analysis, and identifying and correcting functional errors, to ensure correct implementation of the GPU architecture.
- You verify and validate the performance of a GPU core, ensuring it meets the required specifications
- You analyze system performance using various metrics, such as latency, throughput, and power consumption, to identify bottlenecks and optimization opportunities
- You develop and run benchmarks to characterize system performance and identify areas for improvement
- You stay up-to-date with the latest developments in graphics technology, including new APIs, tools, and methodologies
Skills and Qualifications
Minimum Requirements:
- 5+ years of experience with a Bachelor’s degree in Computer Science/Computer Engineering/relevant technical field, or 3+ years of experience with a Master’s degree, or 1+ years of experience with a PhD
- Strong background in GPU system level architecture (not limited to a specific block), performance analysis, verification, and optimization
- Basic to intermediate understanding of RTL (SystemVerilog/Verilog) is required
- Experience with prototyping GPU optimizations is preferred.
- Experience with waveform-level debugging tools (e.g., Synopsys Verdi) and RTL debugging
- Proficiency in C/C++ and Python programming languages
- Ability to work on the execution side, with a focus on performance analysis and optimization
- Excellent analytical and problem-solving skills, with the ability to identify bottlenecks and propose solutions to improve performance.
- You are a team player, with excellent communication and collaboration skills, and experience working with cross-functional teams.
Nice to Have:
- Experience with GPU profiling tools (e.g., RenderDoc, PIX, AMD RGP, NVIDIA Nsight) to analyze and optimize graphics performance, power consumption, and system-level interactions
- Knowledge of OpenGL, Vulkan, and DirectX 11/12
- Familiarity with mobile platforms
U.S. Export Control
This position requires the ability to access information subject to U.S. export control restrictions. Applicants must have the ability to access export-controlled information or be eligible to receive a government authorization to access export-controlled information.