Bachelor's degree in Computer Science or equivalent hands-on experience in Software Engineering.
You are a subject matter expert on multiple aspects of engineering, including designing for cloud resiliency, computer engineering, optimization, and performance.
You hold our team accountable for high-quality designs through periodic reviews (including APIs, operational plans, and data models).
You demonstrate a deep understanding of deployment architecture and consistently design for low MTTD/MTTR and fast recovery/self-healing.
You have experience transforming metrics and raw data into dashboards and actionable alerts using tools like Grafana, Wavefront, Prometheus, and Splunk.
You have experience defining Service Level Indicators (SLIs) and Service Level Objectives (SLOs) to help teams balance reliability with engineering velocity.
You’re curious and comfortable with troubleshooting production incidents and debugging across interconnected systems and layers (including network, command line, and application).
You have experience writing well-tested, observable, maintainable code in one or more programming languages. We primarily use PHP, Python, and Golang, but prior experience with these languages is not a requirement.
You have strong collaboration and communication skills, with the ability to work effectively in cross-functional teams.
You are comfortable working in complex production environments and seek out ways to reduce ambiguity. You seek to understand before making changes, and actively facilitate communication to better understand other approaches to problems.