Expoint – all jobs in one place

Microsoft Senior Software Engineer - CTJ Top Secret 
Taiwan, Taoyuan City 
338099955

25.09.2025

The Site Reliability Engineering (SRE) team provides leadership, direction and accountability for application architecture, system design, and end-to-end implementation. In this role, you will identify and deliver software improvements using your expertise in software development, complexity analysis, and scalable system design. Collaboration skills will be required to work closely with other engineering teams to ensure services/systems are highly stable and performant, meeting the expectations of our government customers and users. The right candidate for this job is excited about making better software and continuously improving the development, integration, and deployment processes.

Required/Minimum Qualifications:

  • Master's Degree in Computer Science, Information Technology, or related field AND 2+ years technical experience in software engineering, network engineering, or systems administration OR Bachelor's Degree in Computer Science, Information Technology, or related field AND 4+ years technical experience in software engineering, network engineering, or systems administration OR equivalent experience.

Other Requirements:

  • Security Clearance Requirements: Candidates must be able to meet the Microsoft, customer, and/or government security screening requirements for this role. These requirements include, but are not limited to, the following specialized security screenings:
    • Candidates must have an active TS and be willing to upgrade to TS/SCI (with polygraph) or have an active TS/SCI and be willing to upgrade to TS/SCI (with polygraph). This role will require candidates to maintain the TS/SCI (with polygraph) clearance. The ability to meet Microsoft, customer, and/or government security screening requirements is required for this role. Failure to maintain or obtain the appropriate clearance and/or customer screening requirements may result in employment action up to and including termination.
    • Clearance Verification: This position requires successful verification of the stated security clearance to meet federal government customer requirements. You will be asked to provide clearance verification information prior to an offer of employment.
    • Microsoft Cloud Background Check: This position will be required to pass the Microsoft Cloud background check upon hire/transfer and every two years thereafter.
  • Citizenship & Citizenship Verification: This position requires verification of U.S. citizenship due to citizenship-based legal restrictions. Specifically, this position supports United States federal, state, and/or local government agency customers and is subject to certain citizenship-based restrictions where required or permitted by applicable law. To meet this legal requirement, citizenship will be verified via a valid passport, other approved documents, or a verified U.S. government clearance.

Preferred/Additional Qualifications:

  • Doctorate Degree in Computer Science, Information Technology, or related field AND 3+ years technical experience in software engineering, network engineering, or systems administration OR Master's Degree in Computer Science, Information Technology, or related field AND 6+ years technical experience in software engineering, network engineering, or systems administration OR Bachelor's Degree in Computer Science, Information Technology, or related field AND 8+ years technical experience in software engineering, network engineering, or systems administration OR equivalent experience.
  • 3+ years technical experience working with large-scale cloud or distributed systems.

Certain roles may be eligible for benefits and other compensation. Find additional benefits and pay information here:

Technical Knowledge and Domain-Specific Expertise

  • Demonstrates end-to-end expertise in distributed systems design, interactions between cloud technology layers and components, functions of physical network devices, and dependencies at scale. Drives efforts within an organization to identify and recommend optimal configurations of cloud technology solutions and develops or modifies the code base that defines infrastructures to improve the reliability and operability of supported products.
  • Develops end-to-end technical expertise in the architecture, code, features, and operations of specific products as required to implement improvements in product availability, reliability, efficiency, observability, and/or performance. Drives code/design reviews with the engineering teams that develop and/or manage those products and shares learnings and recommendations across engineering teams working on related products within their organization.
  • Researches and maintains deep knowledge of industry trends as well as advances in large-scale distributed systems and cloud technologies; identifies opportunities to create, implement, and/or optimally utilize new tools, technologies, and/or processes to solve ambiguous problems and improve product availability, reliability, efficiency, observability, and/or performance. Drives the adoption of new solutions.

Contributions to Development and Design

  • Leverages technical expertise in the infrastructure of large-scale distributed systems and specific products, as well as objective insights drawn from analyses of production telemetry data, to advocate for, or directly contribute to, changes to the code base to improve the availability, reliability, efficiency, observability, and performance of related sets of products developed and supported by teams within an organization.
  • Develops, tests, and implements changes to optimize code and improve the observability, reliability and operability of platforms, systems, and products at scale. Reviews the effect of these changes to document and share development insights within their team.
  • Engages with product engineering teams within an organization by driving code/design reviews, hosting regular meetings, and participating in on-call rotations and incident responses throughout product development and operations cycles; leverages end-to-end technical expertise on underlying systems/platforms and insights from engagements with product engineering teams and telemetry analyses to propose scalable improvements in code and designs with attention to customer/business objectives and incident prevention.

Driving Operational Excellence

  • Develops code, scripts, systems, or platforms that automate moderately complex but repetitive operations processes (e.g., monitoring, alerting, deploying products and updates, debugging) at scale; reviews existing automation code and scripts to evaluate reusability, extendibility, and scalability within an organization.
  • Leverages end-to-end technical expertise and telemetry analysis to identify patterns and opportunities to implement configuration and data changes for related sets of platforms, systems, or products in production using code, tooling, and automation; identifies cases where teams lack the tools and/or capability to manage platforms, systems, or products using code and drives efforts within an organization to expand capabilities and/or tooling accordingly.
  • Leverages existing tools and automation to enable product engineering teams within their organization to increase the velocity with which they can reliably and safely implement changes in production; monitors the effects of changes across platforms or systems.
  • Analyzes data from telemetry pipelines and monitoring tools that detail operations metrics (e.g., availability, reliability, performance, efficiency) of systems, platforms, or products operating at scale. Contributes to the development of new tooling and/or predictive models to identify and test potential improvements in product development and/or operations, and monitors the impact of changes on operations metrics (e.g., Time-to-X) within an organization.
  • Identifies optimal uses for existing tools and/or models to identify contributing factors or points of failure that are affecting the availability, reliability, performance, and/or efficiency of systems, platforms, or products; proposes and implements solutions that resolve root cause(s) and prevent issues from occurring in related products by working with product engineering teams within an organization to test and deploy them to production.
  • Responds to incidents during regular on-call rotations by identifying the level of impact, troubleshooting complex issues, and deploying appropriate fixes to resolve root cause(s); alerts product teams, owners, and leadership to issues with major customer/business impact and escalates resolution of the highly complex, ambiguous, and impactful issues to include other engineering teams and/or subject matter experts as needed. Shares details related to incidents and their resolution through post-mortem reports and during regular review meetings.
  • Develops, maintains, and leverages capacity planning models and monitoring tools to forecast product capacity and resource demands; models the predicted effect of changes to capacity plans to optimize code bases to better manage resources in response to dynamic capacity demands. May contribute to the development of automated resource utilization tools or processes that can dynamically scale compute resources up or down to adjust to capacity demands.
  • Draws insights from performance and resource monitoring across products within their organization to identify whether there is a need to optimize code, infrastructure, or architecture - or if changes to compute resources are required; uses advanced models to forecast and verify the efficacy of changes at scale and proposes solutions that are aligned with customer/business needs.
  • Shares insights and best practices that can be applied to improve development and operations across related sets of systems, platforms, and/or products. Continues to develop their understanding of insights and best practices through interactions with more experienced SREs and members of product engineering teams. Mentors and coaches other engineers to help them identify and propose relevant solutions.


Additional Responsibilities

  • Design, develop, and deliver engineering solutions that serve and protect M365 government clouds.
  • Own deployment, availability, reliability, performance and customer escalation targets for sovereign environments.
  • Proactively identify and reduce issues through design, testing, and implementation of software-based solutions.
  • Collaborate with Engineering and Program Management partners to translate customer, business, and technical requirements into architectural designs and feature releases.
  • Drive efficiencies through software improvement and root cause analysis resulting in service delivery, maturity, and scalability.
  • Develop, test, and implement changes to optimize code and improve platforms. You leverage end-to-end technical expertise and telemetry analysis to identify patterns and opportunities to implement configuration and data changes. You review the effect of changes to document and share development insights within your team. You drive code/design reviews, host regular meetings, and participate in on-call rotations and incident responses throughout product development and operations cycles.
  • In addition, you respond to incidents during regular on-call rotations and share details related to incidents and their resolution through post-mortem reports and regular review meetings.
  • Embody our