AIOps, Observability & Continuous Service Improvement (Primary Focus):
- Design and implement enterprise-grade AIOps and observability solutions using IBM Watson AIOps, Splunk ITSI, Dynatrace, and Datadog
- Integrate AIOps platforms with ITSM tools (ServiceNow, BMC), CMDB, and DevOps pipelines to enable intelligent operations and automation
- Develop ML-driven monitoring frameworks for logs, metrics, and traces using OpenTelemetry, Prometheus, and Grafana
- Implement predictive incident management with anomaly detection, event correlation, and automated root cause analysis
- Design real-time operational dashboards and alerting systems to improve MTTR and service reliability
- Lead Continuous Service Improvement (CSI) initiatives to enhance service quality, performance, and reliability.
- Analyse incident trends, service metrics, and customer feedback to identify and implement improvement opportunities.
- Establish KPIs, SLOs, and SLAs to measure and improve service performance
- Collaborate with stakeholders to implement process improvements and automation across IT operations.
FinOps & Cloud Cost Optimization (Secondary Focus):
- Develop and implement FinOps frameworks for AWS, Azure, and GCP environments
- Collaborate with finance, engineering, and operations teams to align cloud usage with business and budgetary goals.
- Provide insights and recommendations for cost-saving opportunities across multi-cloud environments.
- Develop dashboards and reports for cloud spend visibility, forecasting, and anomaly detection.
- Assist in budgeting, chargeback/showback models, and cost allocation strategies.
- Advise clients on cloud financial governance and cost-efficient architecture design
Client Engagement, Advisory & Consulting Competencies:
- Act as a trusted advisor to C-level executives on AIOps and FinOps transformation roadmaps
- Lead solution presentations, workshops, and proof-of-concepts for client engagements
- Develop business cases and ROI analyses for AIOps and FinOps initiatives
- Translate technical capabilities into measurable business value propositions
- Proven experience in client-facing advisory roles
- Excellent stakeholder management and executive communication skills
- Ability to translate technical concepts into business value
- Strong problem-solving and analytical thinking
Thought Leadership & Practice Development:
- Create whitepapers, case studies, and best practice guides for AIOps and FinOps
- Mentor team members and contribute to practice capability development
- Stay current with emerging trends in Generative AI for IT operations (GenAIOps) and MLOps
Required Skills & Experience:
- 12–15 years of experience in IT Operations, Monitoring, or Cloud Infrastructure.
- Hands-on expertise in:
  - AIOps Platforms: IBM Watson AIOps, Splunk (Core & ITSI), Dynatrace (Davis AI), Datadog
  - Observability Stack: OpenTelemetry, Prometheus, Grafana, ELK
  - Cloud & FinOps Tools: AWS Cost Explorer, Azure Cost Management, CloudHealth, Kubecost
- Strong understanding of AIOps principles: event correlation, anomaly detection, RCA, automation.
- Solid experience in FinOps practices, cloud cost optimization, and financial governance.
- Familiarity with cloud platforms (AWS, Azure, GCP) and hybrid environments.
- Proficiency in scripting (Python, Shell, PowerShell) for automation and integration.
- Knowledge of ITIL, DevOps, and Infrastructure as Code (IaC).
Preferred Certifications (Minimum 2):
- IBM Certified Specialist – Watson AIOps
- Splunk Core Certified Power User / Admin / ITSI
- Dynatrace Associate / Professional Certification
- Datadog Certified Monitoring Professional
- FinOps Certified Practitioner
- ITIL 4 Foundation
Education & Requirements:
- Bachelor's or Master's degree in Computer Science, Data Science, or a related field
- MBA preferred for consulting roles
- Willingness to travel for client engagements