Responsibilities:
- The Senior Linux & Cloud Administrator is responsible for 24x7 availability of SAP systems running in SAP ECS;
- Respond to, troubleshoot, and resolve alerts and incidents in the Linux OS or infrastructure layer (VMware, Azure, AWS, GCP, bare metal);
- Act as an L3/L4 escalation point for production issues;
- Follow change management processes during service request execution;
- Seek opportunities to streamline standard operating procedures through automation;
- Be comfortable working in a fast-paced, dynamic, and flexible environment;
- Operate effectively in a global, 24x7 environment.
- Take on-call, on-duty, and weekend tasks according to the shift schedule.
- Work closely with team members to troubleshoot and resolve escalated incidents promptly.
- Apply ITIL incident management processes to ensure timely resolution.
What you bring (Role Requirements):
- Experience: 8+ years of professional experience in Linux system administration, with a demonstrated ability to troubleshoot and handle incidents.
- Technical Skills:
  - Expert-level Linux knowledge: Deep understanding of Linux internals, kernel architecture, process and memory management, filesystems, and system calls.
  - Troubleshooting and Diagnostics: Mastery of tools such as top, htop, vmstat, iostat, sar, ps, netstat, ss, journalctl, rsyslog, dmesg, strace, lsof, tcpdump, Wireshark, perf, and systemd-analyze.
  - Networking: Advanced understanding of TCP/IP, network interfaces, routing, DNS, DHCP, firewalls, and network diagnostic tools.
  - Security: Strong knowledge of security principles, OS hardening, compliance, and tools for vulnerability scanning and intrusion detection.
  - Scripting and Automation: Proficiency in shell scripting, Python, and/or other scripting languages. Experience with Infrastructure-as-Code and configuration management tools such as Ansible, Puppet, Chef, or Terraform.
  - Cloud Infrastructure: Experience with AWS, Azure, or GCP, including their core services.
  - Virtualization and Containers: Familiarity with Docker, Kubernetes, and other container and virtualization technologies.
- Soft Skills:
  - Analytical and Problem-Solving: Exceptional ability to analyze complex issues, identify root causes, and implement effective solutions.
  - Communication: Excellent written and verbal communication skills, with the ability to explain technical findings clearly to both technical and non-technical audiences.
  - Documentation: Ability to create clear, concise, and comprehensive technical documentation.
  - Incident Management: Experience with ITIL or similar incident management frameworks.
  - Continuous Learning: A strong commitment to ongoing learning and professional development.
Language Skills:
- Fluency in English is essential.
Tools and Technologies:
- Monitoring Tools: Prometheus, Grafana
- Log Management: Splunk
- Diagnostic Tools: As listed under Technical Skills
Job Segment: Cloud, Linux, ERP, Operations Manager, System Administrator, Technology, Operations