Expoint - all jobs in one place

Finding a high-tech job at the best companies has never been easier

Limitless High-tech career opportunities - Expoint

Microsoft Senior AI Hardware Quality Engineer 
United States, Washington 
411838537

Yesterday

your technical [...] by working in a [...] to discover, define, and deliver storage innovation at cloud scale


Required Qualifications:

  • Master's Degree in Electrical Engineering, Computer Engineering, or related field AND 3+ years technical engineering experience

    o OR Bachelor's Degree in Electrical Engineering, Computer Engineering, or related field AND 5+ years technical engineering experience

    o OR equivalent experience.

  • 5+ years of work experience managing manufacturing quality in the electronics industry.
  • 5+ years of direct engineering experience in hardware system issue resolution for GPU servers.
  • 5+ years of experience debugging data, i.e., telemetry and logs, to identify and investigate hardware failure signatures.

Other Requirements:

Ability to meet Microsoft, customer, and/or government security screening requirements is required for this role. These requirements include, but are not limited to, the following specialized security screenings: Microsoft Cloud Background Check: This position will be required to pass the Microsoft Cloud Background Check upon hire/transfer and every two years thereafter.

Preferred Qualifications:

  • Bachelor's Degree in Electrical Engineering, Computer Engineering, or related field AND 7+ years experience in technical engineering
    • OR 9+ years equivalent experience.
  • Patents or a track record of engineering excellence.
  • 12+ years of experience working with modern server architectures, including an understanding of GPU and CPU methods for failure analysis, debugging, or validation.
  • 8+ years of system-level server debugging with an understanding of platform, power, system, and network environments.
  • 3+ years of direct GPU-related engineering experience in issue debugging and test log review.
  • Leadership skills and ability to collaborate with diverse teams and drive a call to action.
  • Expertise in root cause analysis and corrective action methods to identify contributing factors of production defects.
  • Ability to analyze large data sets, extract key insights, and effectively present and communicate the results.
  • Proficiency in communication and project management.


Microsoft will accept applications for the role until January 29, 2025.

Responsibilities:
  • Develop and implement a robust supplier quality management strategy to ensure the data center hardware is manufactured to the highest quality standards.
  • Lead the quality issue and improvement task force to contain, mitigate, and resolve the top quality issues impacting global data centers.
  • Conduct debug and failure analysis for GPU subsystems in the Azure fleet and drive resolution with partners and suppliers.
  • Drive the continuous improvement process based on Root Cause Analysis (RCA) and identified opportunities.
  • Deliver quality readouts based on your telemetry data analysis to bring clarity across the organization on status, actions, and next steps for issue resolution.
  • Establish Critical-to-Quality performance metrics to measure and improve product quality.
  • Act as the voice of quality in the hardware change management process, ensuring quality requirements are considered, met, and improved.
  • Embody our