Expoint - all jobs in one place


Microsoft Senior Site Reliability Engineer 
United States, Massachusetts 
522218020

30.07.2024

As a Site Reliability Engineer, you will be an integral member of a team within HLS Solutions that is working to empower clinicians to achieve more with groundbreaking healthcare-oriented copilots and to provide a secure, scalable, reliable solution. The candidate will be excited about waking up every morning to apply their skills in automation, Continuous Integration/Continuous Deployment (CI/CD), and Infrastructure as Code (IaC) to develop and deploy new technologies and experiences centered around driving positive healthcare outcomes.

Minimum Qualifications:

  • 6+ years technical experience in software engineering, network engineering, or systems administration
    • OR Bachelor's Degree in Computer Science, Information Technology, or related field AND 3+ years technical experience in software engineering, network engineering, or systems administration
    • OR Master's Degree in Computer Science, Information Technology, or related field AND 2+ years technical experience in software engineering, network engineering, or systems administration.
  • 2+ years of experience with Azure Cloud.

Other Requirements:

The ability to meet Microsoft, customer, and/or government security screening requirements is required for this role. These requirements include, but are not limited to, the following specialized security screenings:

  • Microsoft Cloud Background Check: This position will be required to pass the Microsoft Cloud background check upon hire/transfer and every two years thereafter.

Preferred Qualifications:

  • 7+ years technical experience in software engineering, network engineering, or systems administration
    • OR Bachelor's Degree in Computer Science, Information Technology, or related field AND 4+ years technical experience in software engineering, network engineering, or systems administration
    • OR Master's Degree in Computer Science, Information Technology, or related field AND 3+ years technical experience in software engineering, network engineering, or systems administration
    • OR Doctorate Degree in Computer Science, Information Technology, or related field.

Certain roles may be eligible for benefits and other compensation.

Microsoft will accept applications for the role until August 12, 2024.


Responsibilities:

  • Demonstrates expertise in distributed systems design, interactions between cloud technology layers and components, common dependencies at scale, and the code that defines infrastructures. Can identify and recommend optimal configurations of cloud technology solutions and modify the code base that defines systems or cloud technologies to improve the reliability and operability of supported products with minimal guidance from other engineers.
  • Develops an understanding of the code, features, and operations of specific products at scale as required to contribute to incremental improvements in product availability, reliability, efficiency, observability, and/or performance; participates in on-boarding, code/design reviews, and regular meetings with the engineering teams that develop and/or manage those products.
  • Researches and maintains an awareness of industry trends, advances in distributed systems and cloud technologies, new tools, and/or processes for maintaining and improving product availability, reliability, efficiency, observability, and/or performance. Contributes to the implementation of new solutions within their team by identifying ways they can be applied to solve persistent problems.
  • Leverages technical expertise in large-scale distributed systems and specific products, as well as objective insights drawn from analyses of production telemetry data, to suggest changes or add-ons to product features or code to improve the availability, reliability, efficiency, observability, and performance of product components or features supported by their team.
  • Independently develops code or scripts that automate the performance of repetitive and easily scalable operations processes (e.g., monitoring, alerting, deploying products and updates) across components and features of products operating at scale.
  • Independently uses existing tools and/or models to troubleshoot problems or flaws affecting the availability, reliability, performance, and/or efficiency of components and features; proposes solutions that will resolve and prevent recurring issues and brings them to the attention of their Site Reliability Engineering (SRE) and/or product engineering teams.
  • Embody our and