Key Responsibilities:
Design and prototype advanced compute arithmetic units (e.g., MAC arrays, ALUs) for GPUs targeting AI and deep learning workloads.
Develop and optimize GPU microarchitectures to enhance performance, energy efficiency, and scalability for AI-specific applications.
Create and refine RTL implementations to validate and benchmark new design concepts.
Conduct detailed performance modelling and analysis to identify bottlenecks and propose innovative solutions for next-generation GPU designs.
Produce comprehensive power and area estimates for proposed designs, enabling informed trade-off analysis and decision-making.
Collaborate with cross-functional teams, including software, hardware, and machine learning experts, to align architecture design with application requirements.
Research and integrate emerging technologies and methodologies in GPU compute design for AI workloads.
Lead the evaluation of design trade-offs in terms of performance, area, and power metrics.
Drive innovation in custom compute unit design, ensuring compatibility with broader GPU pipeline architecture.
Required:
Master's or Ph.D. in Electrical Engineering, Computer Engineering, Computer Science, or a related field.
Demonstrated experience in GPU/ASIC architecture design with a focus on compute arithmetic, gained through coursework or relevant projects.
Expertise in microarchitecture design and RTL coding (e.g., SystemVerilog).
Strong understanding of GPU pipelines, parallel computing concepts, and AI/ML workloads.
Proven experience in designing and optimizing MAC arrays, ALUs, or similar compute units.
Solid knowledge of hardware modelling and simulation tools (e.g., VCS, Synopsys, ModelSim).
Experience in producing and interpreting power and area estimates for complex hardware designs.
Proficiency in performance analysis tools and techniques.
Strong problem-solving skills, with the ability to innovate and think outside the box.
Preferred:
Familiarity with high-level synthesis (HLS) tools and methodologies.
Background in machine learning algorithms and their hardware acceleration.
Understanding of power optimization techniques and methodologies for compute-intensive hardware.
The requirements listed may be met through a combination of industry-relevant job experience, internship experience, and/or schoolwork, classes, or research.