Expoint - all jobs in one place

Microsoft Director, Reliability Engineering
United States, Washington 
449248443

Yesterday

Required Qualifications

  • Doctorate Degree in Mechanical Engineering, Materials Engineering, Reliability Engineering, Electrical Engineering, or related field AND 5+ years technical engineering experience
    • OR Master's Degree in Mechanical Engineering, Materials Engineering, Reliability Engineering, Electrical Engineering, or related field AND 7+ years technical engineering experience
    • OR Bachelor's Degree in Mechanical Engineering, Materials Engineering, Reliability Engineering, Electrical Engineering, or related field AND 8+ years technical engineering experience.
  • 5+ years of management experience, including resource planning, career development, and performance management.

Other Requirements

The ability to meet Microsoft, customer, and/or government security screening requirements is required for this role. These requirements include, but are not limited to, the following specialized security screening: Microsoft Cloud Background Check. This position requires passing the Microsoft Cloud Background Check upon hire/transfer and every two years thereafter.

Preferred Qualifications

  • MBA in engineering management or operations.
  • Experience with cloud fleet management, telemetry, diagnostics, and troubleshooting of IT systems.
  • Experience with and knowledge of the server industry's product development process.
  • Experience leading system engineering teams in both new product introduction (NPI) and sustaining lifecycles, and managing suppliers.
  • Experience developing design specifications and/or product requirement documents.
  • Experience with system reliability, manufacturing processes, and datacenter operations, including leading continuous improvement through automation.
  • Experience with liquid cooling infrastructure for IT racks.

Certain roles may be eligible for benefits and other compensation. Find additional benefits and pay information here:

Responsibilities

As a Director, Reliability Engineering, you will be responsible for the following:

  • Lead the Cloud System and Components Reliability Engineering organization, operating in a fast-paced environment and transforming ambiguity into clarity.
  • Lead strategic innovations and develop processes that integrate industry practices, ensuring the scalability and efficiency needed to achieve high reliability and quality performance.
  • Lead by example and coach team members, inspiring them to grow and develop in the field of System and Components Reliability Engineering.
  • Lead retrospectives and deep dives to drive root-cause analysis and corrective actions that prevent future escapes.
  • Combine technical and process expertise with an in-depth understanding of cloud operations to optimize reliability solutions for future server and storage products.
  • Define, facilitate, and manage the integration of architecture, design, manufacturing, operation, troubleshooting, and diagnostic methods to optimize cloud infrastructure reliability.
  • Participate in and approve mechanical, thermal, electrical, telemetry, and diagnostic design reviews to ensure system reliability requirements are properly implemented.
  • Drive System Reliability Readiness of new cloud platforms landing in Microsoft Datacenters.
  • Support the Hardware Systems Group development, deployment, and sustaining teams from system concept to decommission. Work with cross-functional strategic teams on process optimizations and interrelated strategic initiatives.
  • Develop key metrics to evaluate the system reliability program's performance, and build implementation plans to confirm performance and compliance against program metrics and internal company requirements.