Expoint - all jobs in one place

Finding a high-tech job at the best companies has never been easier

Limitless High-tech career opportunities - Expoint

Red Hat Senior Site Reliability Engineer
Canada, Ontario 
660206244

26.06.2024

What You'll Do

The day-to-day responsibilities of an SRE involve working with live systems and coding automation. As an SRE you will be expected to:

  • Contribute code to increase the scalability and reliability of the service

  • Contribute software tests and participate in peer review to increase the quality of our codebase

  • Help develop peers’ capabilities through knowledge sharing, mentoring, and collaboration

  • Participate in a regular on-call schedule, including occasional paid weekends and holidays

  • Practice sustainable incident response and blameless postmortems

  • Resolve customer issues escalated from the Red Hat Global Support team

  • Work within a small agile team to develop and improve SRE software, support your peers, plan and self-improve

What You'll Bring

  • Bachelor's degree in Computer Science or related technical field; or equivalent experience

  • Programming experience in at least one of the following languages: Python, Golang, Java, C, C++ or another object-oriented language

  • Experience working with public clouds such as AWS, GCP, or Azure

  • Ability to collaboratively troubleshoot and solve problems in a team setting

  • Experience troubleshooting an as-a-service offering (SaaS, PaaS, etc.)

  • Experience working with complex distributed systems. Direct experience with Kubernetes or OpenShift is a plus. We like to see a demonstrated ability to debug, optimize code and automate routine tasks. We are Red Hat, so you need a basic understanding of Unix/Linux operating systems.

Desired skills

  • Demonstrated ability to debug, optimize code and automate routine tasks

  • 2+ years of experience programming with at least one object-oriented language; Golang, Java, or Python are preferred

  • 2+ years of experience delivering a hosted service

  • 2+ years of experience managing Linux servers running Red Hat Enterprise Linux (RHEL), CentOS, or Fedora hosted at a cloud provider such as Amazon Web Services (AWS), Google Compute Engine (GCE), or Microsoft Azure

  • 3+ years of experience with enterprise systems monitoring; knowledge of Prometheus is a plus

  • 3+ years of experience with enterprise configuration management software like Ansible by Red Hat, Puppet, or Chef

  • Demonstrated ability to quickly and accurately troubleshoot system issues

  • Solid understanding of standard TCP/IP networking and common protocols like DNS and HTTP

  • Solid communication skills and experience working directly with and presenting to customers

  • 1+ year(s) of experience with Kubernetes is a plus

  • 1+ year(s) of experience with Docker-based containers is a plus