Expoint – all jobs in one place
Where the best experts and companies meet

Microsoft Site Reliability Engineer 
Taiwan, Taoyuan City 
151692491

16.10.2025

Firmware Deployment

The Cloud Hardware Infrastructure Engineering (CHIE) organization is responsible for building and

We are seeking a Site Reliability Engineer. Within the Firmware Deployment team, you will be instrumental in shaping the future of the Azure Fleet. Your primary responsibility will involve developing and applying stable firmware releases across the GPU fleet, as well as potentially supporting other related environments. This work is essential to an outstanding

Your efforts in deploying and managing firmware updates will ensure the reliability and efficiency of Azure’s hardware infrastructure. By focusing on stability and operational excellence, you will help safeguard system health and contribute to the ongoing success and growth of Azure’s global infrastructure.

Required/minimum qualifications:

  • Master's Degree in Computer Science, Information Technology, or related field AND 2+ years technical experience in software engineering, network engineering, or systems administration OR Bachelor's Degree in Computer Science, Information Technology, or related field AND 4+ years technical experience in software engineering, network engineering, or systems administration
    • OR equivalent experience.
  • 3+ years of experience in software engineering or operations for large-scale distributed systems.
  • Ability to support a 24x7 data center environment, including participation in an on-call rotation and availability during non-standard business hours (evenings, nights, weekends, or holidays) as operational needs require.
  • Proficiency in one or more programming languages (C#, Python, Go, or similar).
  • Understanding of cloud infrastructure (Azure preferred), networking, and system design.
  • Familiarity with monitoring tools, incident management frameworks, and DevOps practices.
  • Problem-solving and debugging skills.

Other Requirements:

Ability to meet Microsoft, customer, and/or government security screening requirements is required for this role. These requirements include, but are not limited to, the following specialized security screenings:

This position will be required to pass the Microsoft Cloud Background Check upon hire/transfer and every two years thereafter.


Certain roles may be eligible for benefits and other compensation. Find additional benefits and pay information here:

Responsibilities:

  • Build and bring specialized knowledge across multiple production aspects (monitoring, release engineering, testing, live site excellence, buildout, performance optimization, capacity management).
  • Analyze large-scale telemetry and operational data to uncover insights and drive data-informed decisions.
  • Apply proven principles and practices such as safe deployment, testing for reliability, elimination of single points of failure, disaster recovery, SLO-based monitoring, throttling, infrastructure management automation, post-mortem excellence, and adoption of common systems.
  • Respond to alerts and incidents.
  • Build and follow playbooks to drive root cause analysis and reviews.
  • Identify opportunities for predictive analytics.
  • Participate in an on-call rotation, be available during non-standard business hours, and contribute to service reliability and incident resolution.