Expoint - all jobs in one place

Microsoft Senior Site Reliability Engineer 
India, Karnataka, Bengaluru 
703426567

16.07.2024

This is a world of more innovation, more openness, and the sky is the limit in a cloud-enabled world. The mission is a data platform for the age of AI, powering a new class of data-first applications and driving a data culture.

As a Senior Site Reliability Engineer, you will identify and deliver software improvements using your expertise in software development, complexity analysis, and scalable system design to ensure that services and systems are highly stable, performant, and meet the expectations of our customers. You will work closely with other engineering teams and provide a holistic view of our cloud service.

Qualifications

  • Bachelor's degree in Computer Science, Engineering, or a related field, or equivalent industry experience.
  • 6+ years of experience writing tools, automation/scripting (PowerShell, Python or similar), programming (C++, C# or equivalent), and making enhancements in subcomponents within and around services/products to deliver and manage software in production.
  • 6+ years of troubleshooting/debugging experience: telemetry-based analysis (KQL or equivalent preferred), troubleshooting skills across network, hardware, and distributed service layers, with demonstrated ability to debug, fix, and optimize code.
  • Good communication skills, both verbal and written.

Other Requirements

Ability to meet Microsoft, customer and/or government security screening requirements is required for this role. These requirements include, but are not limited to, the following specialized security screenings: Microsoft Cloud Background Check:

  • This position will be required to pass the Microsoft Cloud background check upon hire/transfer and every two years thereafter.


Preferred/Additional Qualifications

Experience with, and understanding of, distributed systems and networking is preferred.

Responsibilities

  • Identify opportunities and drive the design and implementation of end-to-end telemetry, alerting, self-healing and automation capabilities to improve service health, manageability, and reliability.
  • Participate in on-call rotations and own, triage, investigate and resolve service issues with an emphasis on broad communications, learning & teaching throughout the process.
  • Own availability, performance, and supportability targets for the service.
  • Author functional and technical documentation and remain current on relevant technologies and procedures.

Embody our