Expoint - all jobs in one place

Finding a high-tech job at the best companies has never been easier

Nike Principal Software Engineer Site Reliability 
United States, Oregon, Beaverton 
480666674

27.03.2025

As a Principal Site Reliability Engineer you will:

  • Partner with leaders in product, engineering, business, and operations to identify and address risks, vulnerabilities, and limits in our end-to-end systems

  • Technically lead and mentor the SRE team with a focus on improving the availability, reliability, and observability of Nike’s digital platforms while reducing the burden of toil through tooling, automation, or process change

  • Use your technical expertise to identify training and up-skilling opportunities, monitor industry trends, and define new reliability patterns for the broader organization

  • Influence systems design decisions and patterns across business-value engineering teams, infrastructure teams, and architecture

  • Make the life of on-call engineers safe by delivering deep observability, actionable alerts and runbooks, and iterative Service Level Objectives that truly align with consumer experience

  • Strategically define a multi-year roadmap in collaboration with peer engineering teams, geo partners, and product management teams

  • Identify, curate, implement, and adapt key metrics for end-to-end system health and performance

WHO YOU WILL WORK WITH

The Principal Site Reliability Engineer will work alongside a talented team of Site Reliability Engineers focused on delivering reliable and observable software used by millions of athletes* around the world. You will be part of the Resilience Engineering organization, which includes Site Reliability Engineering, Quality & Release Engineering, Accessibility Engineering, and High Availability/Disaster Recovery. This role reports to the Senior Director, Reliability Engineering.

To deliver Reliability Engineering goals, you will partner with and influence people at multiple levels, not only within Global Technology (Director up to CTO) but also across business units and geographical locations.

WHAT YOU BRING

  • 12+ years combined work experience as a software engineer, team lead/principal engineer, or manager leading distributed teams

  • Deep understanding of how to deliver large scale software with modern reliability and resilience concepts (multi-region, multi-cloud, active/active, canary deploys, synthetic testing, containers, etc.)

  • Hands-on experience architecting, deploying, and operating software using modern cloud-based distributed system techniques, micro-service architecture patterns, and DevOps processes

  • Expertise in data structures, algorithms, and complexity analysis. Experience with AIOps or AI/ML is a plus

  • Ability to build strong relationships with partners/stakeholders and use technical credibility and influence to drive positive outcomes

  • Demonstrated experience implementing Service Level Objectives, error budgets, and the associated cultural change

  • A history of finding and reducing toil within complex systems and processes

  • Experience with modern observability tooling, processes, and mindset – Splunk, SignalFx, New Relic, Catchpoint, etc. Bonus points for experience with open-source observability stacks

  • A passion for learning, teaching, and mentoring

  • A strong desire for building and motivating teams focused on data-driven continuous improvement