Expoint - all jobs in one place

Red Hat Site Reliability Engineer - OpenShift
India, Karnataka, Bengaluru 
617810750

07.07.2024

What will you do:

  • Applies software engineering principles to the operations domain.

  • Contributes to a service's codebase, writes automation that aids in the management of a service, and performs operational engineering work to support a service's Service Level Objectives (SLO).

  • Ensures service reliability meets users’ needs, including internally critical and externally visible services

  • Uses software & systems engineering to design, build, and run large-scale, distributed, fault-tolerant systems

  • Focuses on iterative improvement through toil reduction and error-budget enforcement

  • Interfaces with both cloud IaaS and SaaS providers and internal stakeholders, including Support, IT, and Product Engineering, to achieve desired outcomes.

  • Participates in an on-call rotation within a geographically distributed team to provide 24x7x365 production support, with responsibility to respond to urgent customer issues

  • Practices sustainable incident response and blameless postmortems

  • Works within a small agile team to develop and improve SRE methodologies, support peers, plan, and self-improve

  • Provides feedback on bugs and feature improvements to the various Red Hat Product Engineering teams

What will you bring:

  • Bachelor's degree in computer science or a related technical field involving software or systems engineering, or practical experience demonstrating interest in SRE

  • 2+ years of experience using cloud providers and technologies (Google, Azure, Amazon, OpenStack, etc.)

  • 1+ years of experience administering a Kubernetes-based production environment

  • 2+ years of experience programming with at least one object-oriented language; experience with Golang or Python is a big plus

  • Ability to collaboratively troubleshoot and solve problems in a team setting

  • Basic understanding of UNIX or Linux operating systems

The following will be considered a plus:

  • Demonstrated comfort with collaboration, open communication, and reaching across functional boundaries

  • Passion for understanding users’ needs and delivering outstanding user experiences

Additional Skills:

  • Demonstrated ability to quickly and accurately troubleshoot system issues

  • Solid understanding of standard TCP/IP networking and common protocols like DNS and HTTP

  • 2+ years of experience managing Linux servers running Red Hat Enterprise Linux (RHEL), CentOS, or Fedora hosted at a cloud provider such as Amazon Web Services (AWS), Google Compute Engine (GCE), or Microsoft Azure

  • 1+ years of experience with enterprise systems monitoring

  • 2+ years of experience with enterprise configuration management software like Red Hat Ansible Automation Platform (AAP)

  • Experience with static code analysis tools

  • Some experience with code deployment across cloud-based environments

  • Some experience with continuous integration and continuous deployment approaches

  • Some experience working with complex distributed systems

  • Demonstrated ability to debug and optimize code and to automate routine tasks

  • Problem-solving skills and the ability to work with minimal supervision as part of a global team

  • Experience working with agile development methodologies