Monitor, analyze, and troubleshoot performance and reliability issues across servers, networks, and applications to ensure high availability and optimal performance.
Serve as a key technical point of contact for both internal and external customers, handling service-related queries via calls and emails with professionalism and efficiency.
Manage support tickets, coordinate with L2/L3 teams and vendors, and ensure timely resolutions to maintain uninterrupted service and high customer satisfaction.
Perform incident, problem, and change management activities in alignment with ITSM, QMS, and ISMS frameworks; contribute to accurate and timely reporting.
Investigate and resolve alerts from tools like Zabbix, Icinga, Grafana, CloudWatch, and Azure Monitor, ensuring rapid diagnosis and appropriate escalation when necessary.
Oversee SLA compliance related to service uptime, response, and resolution timelines; conduct service transitions and gap analysis during handovers from project teams.
Troubleshoot infrastructure issues using strong foundational knowledge of Linux, networking, and databases.
Leverage experience in Azure and AWS environments to support cloud infrastructure operations and monitoring.
Collaborate with cross-functional teams to drive operational excellence and continuous improvement in system support and service delivery.
Who You Are
Bachelor's degree in Computer Science, Engineering, or a related field (or equivalent experience).
The ideal candidate has at least two years of experience and is proficient in cloud environments (AWS and Azure), Linux server management, Python scripting, database administration, and DevOps tools and technologies, with a good understanding of Kubernetes and containerized applications.
Key Skills:
Skilled in identifying and troubleshooting system issues through alert analysis using tools such as Zabbix, Icinga, Grafana, Azure Monitor, and CloudWatch.
Solid understanding of Linux system administration and networking concepts.
Familiar with relational and NoSQL databases such as MySQL and MongoDB.
Knowledge of cloud platforms, including provisioning and monitoring on Microsoft Azure and Amazon Web Services (AWS).
Working knowledge of ITSM tools (e.g., ServiceNow, Jira) and experience managing SLAs and service reports.
Strong communication skills with a focus on customer service and technical troubleshooting.
Experience in enterprise support environments and an understanding of change, incident, and problem management processes.
Ability to analyze data, identify recurring issues, and contribute to process improvements and automation.
Flexibility to work in rotational shifts to ensure comprehensive support coverage.