Expoint - all jobs in one place

Red Hat Principal Software Engineer, Site Reliability
Australia
980350348

26.06.2024

What you will do

  • Manage, deploy, and operate cloud solutions at scale using the principles of Site Reliability Engineering

  • Participate in the design and development of new features to enable OpenShift 'as-a-service' across multiple public clouds

  • Design and write automation software to provision, upgrade, monitor, and heal a large global fleet of OpenShift clusters deployed across multiple public clouds

  • Identify single points of failure and other high-risk architectural issues; propose and implement more resilient solutions

  • Interact with multiple teams within Red Hat and with the open source community to contribute to both the upstream and downstream projects to deliver functionality

  • Participate in product release cycles: deploying code to integration, staging, and production environments, and integrating with CI/CD tooling, monitoring, and change management

  • Perform software updates, peer code reviews, testing, and CVE analysis; respond to security threats

  • Interact with automated monitoring and healing infrastructure to ensure healthy environments

  • Provide engineering support to Red Hat's global technical support team to resolve customer issues

  • Help and develop peers through knowledge sharing, mentoring and collaboration

  • Create and maintain standard operating procedures (SOPs) for performing maintenance tasks, applying configuration changes and remediating problems in our environment

  • Participate in a follow-the-sun on-call rotation

What you will bring

  • 10+ years of software engineering experience using object-oriented languages; Go is preferred

  • 5+ years of experience managing Linux-based systems in a public cloud such as AWS, GCP, or Azure

  • 5+ years of experience with enterprise systems monitoring; knowledge of Prometheus is preferred

  • 5+ years of experience with enterprise configuration management tools such as Ansible, Puppet, or Chef

  • 3+ years of experience delivering hosted cloud services

  • 1+ year of experience with Kubernetes

  • 1+ year of experience with containers on Linux

  • Superior communication skills and experience working directly with and presenting to customers

  • Ability to quickly learn new technologies and follow industry trends

  • Demonstrated ability to quickly and accurately troubleshoot systems issues

  • Solid understanding of standard TCP/IP networking and common protocols like DNS and HTTP