Responsibilities:
- The Senior Linux & Cloud Administrator is responsible for 24x7 availability of SAP systems running in SAP ECS.
- Respond to, troubleshoot, and resolve alerts and incidents in the Linux OS or infrastructure layer on the Azure hyperscaler.
- Manage Azure services and resources, including virtual machines, storage, and network.
- Monitor and manage Azure infrastructure to ensure optimal performance, scalability, and security.
- Manage Azure virtual networks, subnets, routing, and network security groups.
- Implement monitoring solutions and configure alerts to proactively monitor the Azure environment.
- Develop and maintain automation scripts (e.g., PowerShell) to streamline routine tasks and optimize processes.
- Collaborate with internal teams to understand and address their Azure requirements and challenges.
- Maintain detailed documentation of Azure configurations, procedures, and best practices.
- Follow change management processes during service request execution.
- Seek opportunities to streamline standard operating procedures through automation.
- Work comfortably in a fast-paced, dynamic, and flexible environment.
- Support the 24/7 operations model with on-call, on-duty, and weekend tasks based on the shift schedule.
- Apply ITIL incident management processes to ensure timely resolution.
Experience (Role Requirements):
- 8+ years of professional experience in Linux system administration, with demonstrated ability in troubleshooting and incident handling.
Technical Skills:
- Expert-level Linux knowledge: Deep understanding of Linux internals, kernel architecture, process and memory management, filesystems, and system calls.
- Cloud Infrastructure: Experience with Azure Cloud.
- Troubleshooting and Diagnostics: Mastery of tools like top, htop, vmstat, iostat, sar, ps, netstat, ss, journalctl, rsyslog, dmesg, strace, lsof, tcpdump, Wireshark, perf, systemd-analyze.
- Networking: Advanced understanding of TCP/IP, network interfaces, routing, DNS, DHCP, firewalls, and diagnostic tools.
- Backup and Restore:
- Infrastructure Services: Knowledge of infrastructure services such as DNS and LDAP.
- Security: Strong knowledge of security principles, OS hardening, compliance, and tools for vulnerability scanning and intrusion detection.
- Scripting and Automation: Proficiency in Shell scripting, Python, and/or other scripting languages. Experience with Infrastructure-as-Code tools like Ansible, Puppet, Chef, or Terraform.
- Virtualization and Containers: Familiarity with Docker, Kubernetes, and other container and virtualization technologies.
Soft Skills:
- Analytical and Problem-Solving: Exceptional ability to analyze complex issues, identify root causes, and implement effective solutions.
- Communication: Excellent written and verbal communication skills, with the ability to explain technical findings clearly to both technical and non-technical audiences.
- Documentation: Ability to create clear, concise, and comprehensive technical documentation.
- Incident Management: Experience with ITIL or similar incident management frameworks.
- Continuous Learning: A strong commitment to ongoing learning and professional development.
Language Skills:
- Fluency in English is essential.
Tools and Technologies:
- Monitoring Tools: Prometheus, Grafana
- Log Management: Splunk
- Diagnostic Tools: As listed under Technical Skills
Job Segment: Linux, Cloud, ERP, Operations Manager, System Administrator, Technology, Operations