Expoint - all jobs in one place

Red Hat Site Reliability Engineer
United States, Massachusetts, Boston 
510409701

05.01.2025

The SPRE (Software Production Resilience) team is seeking a Site Reliability Engineer (SRE) with a passion for maintaining highly reliable cloud-based services. In this role, you will support Red Hat’s software manufacturing services on our hybrid cloud infrastructure. You will partner with development, quality engineering, and release engineering colleagues to support the health and well-being of the infrastructure hosting Software Production services. Maintaining service monitoring, improving automation, and upholding security best practices will be your daily work. You will participate in communities of practice to coordinate and influence the design of our hybrid cloud platform. You will be co-responsible for defining Service Level Indicators (SLIs) and Service Level Objectives (SLOs) for the services the team needs to support stakeholders, and for executing remediation plans if the SLOs are not met.

In this role, you are expected to respond in a timely manner during a critical outage and to participate in learning events that identify improvements to make our services more resilient. Join us and share our passion for helping Red Hat produce world-class open source software.

What you will do:

  • Design, build, and revise CI/CD systems

  • Work in a geographically distributed team

  • Configure and maintain service infrastructure

  • Write automation and documentation to make service maintenance faster, easier, and less error-prone

  • Coordinate your actions with other Red Hat teams, such as IT Platforms, Storage, and Network, to ensure our internal cloud deployment meets expectations

  • Provide consulting on infrastructure health, status, stability, and enhancements to other internal teams

  • Contribute to, and advocate for, the enforcement of best practices and change management for the infrastructure supporting software production services

  • Migrate software production services from legacy environments to our hybrid cloud infrastructure

  • Assess and champion opportunities to use Red Hat's emerging solutions in our engineering pipeline

  • Develop best practices around next-generation deployment patterns like service mesh and serverless; migrate advanced projects to those patterns

  • Implement monitoring, alerting, and escalation plans in the event of an infrastructure outage or performance problem

  • Work with service owners to co-define SLIs and SLOs for the services your team relies on, ensure they are met, and execute remediation plans if they are not

What you will bring:

  • Linux administration experience

  • Working knowledge of AWS technologies like S3, DynamoDB, Lambda, CloudFront, CloudFormation, IAM, KMS and Kinesis

  • Ability to work on a hybrid schedule in Raleigh, NC; Durham, NC; Boston, MA; or Lowell, MA

  • Experience with container-related technologies like Kubernetes

  • Experience with CI/CD platforms like GitHub Actions and Jenkins

  • Experience with automation services like Ansible or Terraform

  • Ability to understand graphically represented concepts and architectures in documentation

  • Excellent written and verbal communication skills in English, as you'll be working in a globally distributed team

The following skills will be considered a plus:

  • Previous experience with the SRE model

  • Experience with OpenTelemetry or Prometheus

  • Experience with software development using Python or Go

  • Advanced understanding of networking and security practices

The salary range for this position is $74,900.00 - $119,830.00. Actual offer will be based on your qualifications.

Pay Transparency

● Comprehensive medical, dental, and vision coverage

● Flexible Spending Account - healthcare and dependent care

● Health Savings Account - high deductible medical plan

● Retirement 401(k) with employer match

● Paid time off and holidays

● Paid parental leave plans for all new parents

● Leave benefits including disability, paid family medical leave, and paid military leave