• Perform thorough Root Cause Analysis (RCA) to identify, analyze, and resolve complex issues within Linux server infrastructure.
• Monitor, troubleshoot, and optimize the performance of Linux-based systems.
• Collaborate with cross-functional teams to gather data, replicate issues, and implement solutions.
• Create comprehensive RCA reports, system documentation, and knowledge base articles.
• Implement automation through scripting and configuration management tools to streamline diagnostic processes.
• Maintain security, compliance, and OS hardening across the infrastructure.
• Stay current with industry trends, technologies, and best practices to continuously improve systems and processes.
• Provide mentorship and detailed documentation to assist junior colleagues in implementing technical plans and adhering to best practices.
• 10+ years of related professional experience with a focus on system diagnostics and Root Cause Analysis (RCA).
• Linux Systems: In-depth knowledge of Linux system internals, kernel architecture, process and memory management, filesystems, and system calls.
• Monitoring Tools: Proficiency with tools such as top, htop, vmstat, iostat, sar, ps, netstat, ss, etc.
• Logs and Tracing: Experience with journalctl, rsyslog, syslog-ng, dmesg, strace, lsof, etc.
• Networking: Advanced understanding of TCP/IP, network interfaces, routing, DNS, DHCP, firewalls, and diagnostic tools like ping, traceroute, tcpdump, wireshark, iftop, netcat, nmap, etc.
• Performance Analysis: Proficiency with tools like perf, systemd-analyze, iotop, blktrace, ioping, and benchmarks.
• Security Incident Management: Knowledge of security principles, OS hardening, compliance, and tools for vulnerability scanning and intrusion detection.
• Scripting and Automation: Strong knowledge of Shell scripting, Python, Perl, or other scripting languages, and Infrastructure-as-Code tools like Ansible, Puppet, Chef, or Terraform.
• Cloud Infrastructure: Experience with AWS, Azure, GCP, including services such as EC2, S3, IAM, VPC, security groups, and load balancers.
• Virtualization Technologies: Familiarity with Docker, Kubernetes, VMware, KVM, and other virtualization or containerization technologies.
• Analytical and Problem-Solving: Strong ability to analyze issues, identify root causes, and implement effective solutions systematically.
• Documentation: Ability to create clear and detailed RCA reports and technical documentation.
• Communication: Excellent communication and networking skills, with the ability to articulate findings and solutions to technical and non-technical stakeholders.
• Incident Management: Experience with ITIL or similar frameworks for incident management.
• Continuous Learning: Proactive in acquiring new knowledge and staying updated with the latest trends and technologies.
• Fluency in English, with the ability to explain complex RCA findings clearly.
• Monitoring Tools: Prometheus, Grafana.
• Log Management: Splunk.
• Diagnostic Tools: top, htop, vmstat, iostat, sar, ps, netstat, ss, tcpdump, wireshark, strace, lsof.
Job Segment: Cloud, ERP, Linux, System Administrator, Virtualization, Technology