EXPECTATIONS AND TASKS
As an SRE in the SAP BTP SRE Team, you will have the opportunity to operate and support business-critical cloud services. As part of your daily job, you will proactively monitor service behavior and identify areas for improvement. You will participate in the development of tools for monitoring and troubleshooting cloud services built on the latest open-source and SAP technologies, following SRE principles.
What you’ll do
As a Site Reliability Engineer (SRE), you will:
- Build software-based solutions to address improvements in service stability and reliability
- Enhance infrastructure and platform monitoring by gathering system metrics (the four golden signals of SRE: latency, traffic, errors, and saturation) and implementing tools for recovery
- Strive for and enable a proactive approach to incident management, alerting, and recommended actions to reduce the risk of failure, and apply emerging SRE practices
- Integrate and collaborate closely with development teams and work with them on outputs from Postmortems and product improvements
- Participate in the weekend and overnight on-call rotation for reacting to major incidents. It comes with a special compensation package!
- Act as a technical expert during live-site incidents (downtimes of supported services in scope), and investigate and solve incidents on a deep technical level
- Contribute to root cause analysis and follow-up improvements to prevent issues from reoccurring
- Perform in-depth troubleshooting and log analysis to identify and solve complex issues in accordance with internal and external SLAs
- Learn new technologies and keep up to date with the latest development increments
- Be able to leverage artificial intelligence (AI) and machine learning (ML) technologies
What you bring
The following skills and competencies will bring you closer to meeting with us:
- BSc degree in Computer Science, Software Engineering, Telecommunications or related technical area
- Good understanding of Cloud Platforms such as AWS, Azure, GCP, AliCloud
- Experience with Linux/Unix and Bash
- Enthusiasm for automation - make the computers do the work for you
- Ability to work efficiently in critical situations and an aptitude for quickly analyzing and solving problems
- Excellent team player, passionate about their work, self-motivated and driven
- Excellent communication skills - precise, based on facts
- Fluency in English
If you have knowledge and experience in any of the following areas, it will be considered an advantage:
- Programming languages such as Python, Java, or Go
- Monitoring, logging, and alerting tools (Dynatrace, Grafana, Kibana)
- CI/CD tools (e.g. Jenkins, Concourse)
- Experience working with virtual machines and container technologies (e.g. Cloud Foundry, Kubernetes, Docker)
- Git, GitHub, Terraform, Databases
- Networking knowledge
- Experience working in an Agile environment
- Experience with OpenStack
- Database (e.g. PostgreSQL) administration and support