Provide business-facing technology application support to business and operations groups across the Asia Pacific region.
Manage production technology incidents through to resolution, ensuring timely engagement, escalation, and effective communication with business, technology, and vendor partners
Troubleshoot data and reporting issues at the database level.
Drive automation initiatives to reduce manual tasks
Improve support documentation and procedures
Review and implement technology application and infrastructure releases, disaster recovery exercises, and patch management.
Work within a follow-the-sun support model with global counterparts
Act as Subject Matter Expert (SME) for key applications, responsible for maintaining global best practice and hygiene standards
Perform root cause analysis and post-incident analysis, identifying, tracking, and implementing preventative measures
Act as a key contributor in the continued development of tools, frameworks, and techniques to improve the productivity and quality of production support, adopting SRE principles to manage and support the environment
Develop and support automation tooling to improve the reliability of the platform and to increase the productivity of the team
Required qualifications, capabilities and skills:
Bachelor's degree in Engineering, Computer Science, or Information Technology.
Minimum of 10 years of experience in application support, production management, or Site Reliability Engineering.
Experience with Unix/Linux and Windows operating systems, and with scripting languages such as Perl, Shell, and Python.
Strong database background with SQL, Oracle, MS-SQL, PostgreSQL, and Cassandra.
Support management skills: designing and using monitoring dashboards for day-to-day support, generating service KPIs, reporting on service stability and performance, and monitoring logs and message buses
A proven track record of running Incident and Problem Management calls for business-impacting outages, performing post-incident analysis, and identifying and implementing preventative measures and lessons learned following outages
Experience with Control-M for job scheduling; Splunk, AppDynamics, Dynatrace, Grafana, and ITRS Geneos for monitoring; and Kubernetes for container orchestration.