IBM Site Reliability Engineer
Ireland
141238669

04.09.2024

Primary Roles & Responsibilities:
In this Site Reliability Engineer role, you will work closely with several data centers, the entire Cloud organization and IBM vendors to support, maintain and operationally improve the IBM Cloud infrastructure. This is a shift position: the shift runs from 4:00 pm to 12:30 am, Sunday to Thursday or Tuesday to Saturday. You will focus on the following key responsibilities:

Monitor the health of production and test systems 24×7
Respond promptly to production issues and alerts 24×7
Execute changes in the production environment through automation
Partner with other SRE teams and program managers to deliver mission-critical services to the market
Support development of new and existing capabilities for our compute, storage and network infrastructure services
Implement and automate infrastructure solutions that support IBM Cloud products and infrastructure
Support the compliance and security integrity of the environment
Work with Engineering to:
- Provide an initial assessment of production issues and possible workarounds
- Troubleshoot and resolve production issues
Work with Support and Development teams to:
- Identify and resolve issues
- Discuss and plan integration tasks
Provide technical escalation support for other Infrastructure Operations teams

Required Technical and Professional Expertise

Excellent written and verbal communication skills
Hands-on experience administering large production systems and environments
Experience establishing and improving procedures within a mission-critical environment
Must be efficient at writing and debugging scripts
Must be extremely comfortable using and navigating within a Linux environment
Ability to do low-level debugging and problem analysis by examining logs and running Unix commands
Knowledge of monitoring technologies, virtualization technologies, and automation/configuration management
- Monitoring technologies: Zabbix (preferred), Grafana, Nagios, ELK, Splunk, etc. (at least one)
- Virtualization technologies: Citrix Xen Hypervisor (preferred), KVM (also preferred), libvirt, VMware vSphere, etc. (at least one)
- Automation and configuration management tools/solutions: Ansible, Salt, Chef, Python, Bash, Puppet, Rundeck, etc. (at least one)
Working knowledge of ServiceNow, JIRA, Confluence, and GitHub
Working knowledge of container technologies: Kubernetes (preferred), Docker, etc.

Preferred Technical and Professional Expertise

• 2+ years of experience with GitHub, Perl and Python
• 2+ years of experience in virtualization environments such as AWS, SoftLayer, Xen, or VMware