Required qualifications, capabilities, and skills
- Formal training or certification in SRE concepts and 5+ years of applied experience.
- Strong organizational skills to manage multiple releases simultaneously.
- Excellent communication and teamwork abilities.
- Ability to collaborate across various levels and stakeholder groups.
- Expertise in site reliability engineering practices, covering reliability, scalability, and security.
- Proficiency in programming languages, especially Python and Java.
- Skilled in continuous integration and delivery and infrastructure-as-code tools such as Jenkins and Terraform.
Preferred qualifications, capabilities, and skills
- Knowledge of software applications and technical processes, with depth in AWS, Kubernetes, and data engineering technologies such as Spark and Python.
- Familiarity with containers and container orchestration (e.g., Docker, Kubernetes).
- Familiarity with Apache Iceberg and its integration with data engineering pipelines.
- Analytical and problem-solving skills with a focus on root cause analysis.