Develop, maintain, and enhance software-based solutions that improve service stability, reliability, and operations
Act as a technical expert during incidents: investigate and resolve them at a deep technical level. Perform troubleshooting and log analysis to identify and resolve issues in accordance with internal and external SLAs
Drive root cause analysis and follow-up improvements to prevent recurring issues
Learn new technologies and keep up to date with the latest product releases
Work with tools like Concourse, GitHub & GitHub Actions, Grafana, Prometheus, and Prow
Use programming languages like Go, Python and Bash
Understanding Kubernetes is a must! Certified Kubernetes Administrator (CKA) certification is a plus; otherwise, you are expected to obtain it within a year
Voluntary weekend on-call duties
Experience with Istio, Linux, and security hardening procedures is a plus
What you bring
A degree in computer science, software engineering or similar (Bachelor/Master)
Good understanding of modern cloud architectures. Experience with cloud platforms such as AWS, GCP (especially GKE), and Azure is a plus
Enthusiasm for automation and software development in general
Ability to work efficiently in emergency situations and to quickly analyze and solve problems in a globally distributed team
Excellent team player who is passionate about the work
Excellent communication skills: precise and fact-based
Being our colleague, you will…
Help the Gardener service meet its SLAs
Engage with the Gardener and CNCF/Kubernetes communities, and report or follow up on upstream issues
Regularly share your knowledge and achievements with colleagues