Perform daily system monitoring, verifying the integrity and availability of all hardware, resources, systems, and key processes, reviewing system and application logs, and verifying completion of scheduled jobs.
Provide support per request from various constituencies. Investigate and troubleshoot issues.
Repair and recover from hardware or software failures. Coordinate and communicate with the owners of impacted infrastructure.
Perform preventative maintenance (and upgrades, as required) on devices and related peripherals to meet IT specifications.
Essential Requirements:
12+ years of industry experience, with strong experience in IBM Spectrum Scale and Red Hat Linux.
Experience in observability and log collection (Prometheus and Grafana).
Experience with Red Hat Ansible and Python.
Knowledge of DevOps practices.
Desirable Requirements:
Knowledge of observability and log collection (Prometheus and Grafana).