Required qualifications, capabilities, and skills
- Formal training or certification in infrastructure discipline concepts and 3+ years of applied experience
- Strong knowledge of one or more infrastructure disciplines such as hardware, networking terminology, databases, storage engineering, deployment practices, integration, automation, scaling, resilience, and performance assessments
- Experience with multiple cloud technologies with the ability to operate in and migrate across public and private clouds
- Drive to develop infrastructure engineering knowledge in additional domains, along with data fluency and automation knowledge
- AWS exposure (understanding of and working experience with AWS applications, and an understanding of resiliency, scalability, observability, monitoring, etc.)
- Experience in provisioning AWS infrastructure through Terraform
- Experience as an SRE on complex, mission-critical applications involving a multitude of components of varying technical generations
- Deep proficiency in reliability, scalability, performance, security, enterprise system architecture, toil reduction, and other site reliability best practices with the ability to implement these practices within an application or platform
- Strong knowledge of site reliability culture and principles with demonstrated ability to implement site reliability within an application or platform
- Strong knowledge and experience in observability, monitoring, alerting, and telemetry collection using tools such as CloudWatch, Grafana, Dynatrace, Prometheus, Splunk, etc.
- Fluency in at least one programming language (e.g., Python, Terraform, Ansible, Java Spring Boot, shell scripting, .NET)
Preferred qualifications, capabilities, and skills