Strong understanding of the Linux operating system and the TCP/IP suite of networking protocols
Ability to design, author, and release code in languages like Go or Python
Hands-on experience managing large numbers of diverse systems with configuration management or software delivery platforms (such as Puppet, Chef, or Ansible)
Familiarity with microservices architecture and container orchestration with Kubernetes
Preferred Qualifications
Excellent troubleshooting and problem-solving skills
Bare-metal management experience
Experience with deploying, supporting, and monitoring new and existing services, platforms, and application stacks
Acute drive to automate manual operations and to improve them through repeated iteration
Experience with scale testing, disaster recovery, and capacity planning
Strong sense of ownership and integrity demonstrated through clear communication and collaboration
Experience managing and scaling distributed systems in a public, private, or hybrid cloud environment
Experience with the Prometheus ecosystem
Good understanding of infrastructure observability principles
Education & Experience
BS/MS in Computer Science or equivalent (5+ years of software development or production operations experience in a large-scale environment)