
Microsoft Customer Experience Engineering IC3 
Taiwan, Taoyuan City 
68671148

Qualifications
  • We are looking for a customer-obsessed Site Reliability Engineer with extensive experience implementing Service Level Objective (SLO) monitoring solutions for top Azure customers.
  • Experience: 6+ years of experience driving platform reliability and customer satisfaction through proactive engagement, technical resolution, and cross-functional collaboration. Skilled in observability, automation, and translating operational insights into meaningful customer outcomes.
    • 3+ years of experience designing observability and monitoring solutions in Azure (or AWS/GCP). SLO/SLI implementation is a plus.
    • 3+ years of experience in an external client-facing or customer-handling role.

  • Degree: Bachelor’s or master’s degree in computer engineering (or equivalent)
  • Customer Obsession: Passion for customers and a focus on delivering the right customer experience.
  • Growth Mindset: Openness and ability to learn new skills and technologies in a fast-paced environment.
  • Excellent Communication: Must be able to empathize with customers and convey confidence, explain highly technical issues to varied audiences, prioritize customers’ needs and advocate for them through the proper channels, and take ownership and work toward a resolution.
  • Technical Skills:
    • Proven expertise in implementing and managing Service Level Objectives (SLOs) and Service Level Indicators (SLIs) for cloud customers.
    • Proven experience in designing and implementing monitoring solutions for customers.
    • Extensive experience with monitoring tools and platforms.
    • Advanced certifications in SRE or related fields.
    • Experience in observability, SRE, OpenTelemetry, Prometheus, Grafana, Dynatrace, Datadog, Azure Monitor, AI, and ML.
Responsibilities
  • Collaborate with customers to jointly define and establish SLOs and SLIs that align with their business goals and expectations.
  • Instrument code to measure SLOs and develop solutions to detect SLO breaches.
  • Develop automated solutions and troubleshooting guides to remediate or mitigate SLO breaches.
  • Collaborate closely with service engineering teams to develop solutions for correlating customer-defined SLOs with relevant platform SLOs and signals, in order to pinpoint, address, and resolve customer-impacting issues effectively.
  • Ensure customer-centric SLOs are consistently exceeded through cross-functional collaboration.
  • Analyze SLO data for trends, improvements, and reliability risks, proposing remediation plans.
  • Proactively engage customers on SLO performance, addressing concerns and offering insights.
  • Lead optimization efforts for system performance, scalability, and efficiency to exceed SLOs.
  • Develop and maintain documentation related to customer-specific SLOs, SLIs, and monitoring processes.