IBM Site Reliability Engineer 
United Kingdom, England, London 
835041924

24.06.2024

Your Role and Responsibilities

What you’ll do

  • You’ll be responsible for supporting development teams by implementing SRE practices, standards and processes across Intelligent Automation.
  • You’ll continuously improve the service levels provided to our end users, focusing on application stability and customer satisfaction, and strengthen incident and problem management practices.
  • You’ll ensure our products and services are monitored proactively, so we know about any issues before our customers do.
  • You’ll use metrics to identify where improvements, for example automation and simplification, will deliver the most value.

As a Site Reliability Engineer you’ll:

  • Work closely with application teams to ensure their products, services and tools follow SRE best practices relating to observability, reliability, scalability and resilience.
  • Proactively look to improve day-to-day SRE tasks through automation and collaboration.
  • Take ownership of incident management, participate in post-incident reviews and carry out any remediation or mitigation work.
  • Participate in Change Management, including engagement with stakeholders and senior level engineers/developers.
  • Have a clear understanding of CI/CD processes and route-to-live implementation.


Required Technical and Professional Expertise

  • Among your strengths will be your passion for SRE practices, including SLIs, SLOs, error budgets and toil removal.
  • You will be proficient with CI/CD tools (e.g. Jenkins, Spinnaker, Azure DevOps and GCP).
  • You should be able to define and monitor success metrics and build dashboards using tools such as Dynatrace, Kibana and Splunk.
  • You will have proficiency in one or more programming languages (e.g. Java, JavaScript or Python) to help with automation.
  • You will possess good communication skills, verbal and written, to ensure documentation is clear and concise.


Preferred Technical and Professional Expertise

  • Have broad technical knowledge across a range of platforms, cloud computing (Azure, Google Cloud) and networking.
  • Have a keen interest in keeping up to date on emerging technologies and methodologies.
  • Have the ability to translate business and technical requirements or problems into solutions.