Required qualifications, capabilities, and skills
- Bachelor’s degree in Computer Science, Engineering, Mathematics, or a related discipline
- At least 5 years of site reliability engineering or related experience
- At least 3 years of network engineering or related experience
- Ability to contribute to large and collaborative teams by presenting information in a logical and timely manner with compelling language and limited supervision
- Ability to proactively recognize roadblocks, and a demonstrated interest in learning technologies that facilitate innovation
- Experience with observability practices such as white-box and black-box monitoring, service level objective (SLO) alerting, and telemetry collection, using tools such as Grafana, Dynatrace, Prometheus, Datadog, and Splunk
- Familiarity with troubleshooting common networking technologies and issues
- Experience with one or more application performance management technologies (AppDynamics, Dynatrace, Riverbed SteelCentral, Prometheus)
- Ability to initiate and implement ideas to solve business problems
- Experience triaging and diagnosing issues in complex distributed architectures leveraging infrastructure and application telemetry
- Experience with one or more infrastructure automation technologies (Ansible, Terraform, Puppet) or with building APIs and services using REST, SOAP, etc.