SplunkLive! London 2019: Allied Irish Banks
1. Allied Irish Banks: Monitoring Payments with real time insights using Splunk and ITSI
June 2019
2. 2
Background to AIB Payments Landscape
Responsible for the Assurance of Payment Services and Platforms
Transformation to New Payments Platforms
Highly Complex Services & Architectures
Critically Important to our Customers & Irish Economy
Changing Regulatory Environment / Open Banking
Perfect Payments Outcomes
24x7 “Always On”
Time Critical
3. 3
AIB Splunk Journey (2015 - 2017)
Payments Business Activity Monitoring (BAM) using Splunk
Assurance Objectives
Business Activity Monitoring of critical Payment Services using Splunk
Time
Critical Payment Services BAM
2015-2017
E2E Integrity for Payments
Performance & Volume Trends
Automated & Independent
Improved SLA Performance
Reduced Incidents & MTTR
Grow software intelligence capability to automate monitoring of complex payments processes
2015-2019
E2E Payment Service Performance
Reduce Operational Risk
Reduce Transformational Risk
Reduce Regulatory Risks
Become Proactive & Predictive
Improve Customer Outcomes
4. 4
Payments Business Activity Monitoring (BAM) with Splunk
AIB Incoming SEPA Credits Integrity End to End (SCF -> Posting)
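The end-to-end integrity idea on this slide (every incoming SEPA credit received at SCF should eventually reach posting) can be sketched as a simple reconciliation. This is an illustrative Python sketch only, not AIB's Splunk implementation: the function name `find_unposted`, the payment IDs and the 30-minute threshold are all assumptions.

```python
from datetime import datetime, timedelta


def find_unposted(received, posted, now, max_delay):
    """Return IDs received more than max_delay ago that have no posting record.

    received: dict of payment ID -> time it was received (hypothetical data)
    posted:   iterable of payment IDs that reached posting
    """
    posted_ids = set(posted)
    return sorted(
        pid
        for pid, received_at in received.items()
        if pid not in posted_ids and now - received_at > max_delay
    )


# Hypothetical sample data, for illustration only.
now = datetime(2019, 6, 1, 12, 0)
received = {
    "PAY-001": datetime(2019, 6, 1, 11, 0),
    "PAY-002": datetime(2019, 6, 1, 11, 5),
    "PAY-003": datetime(2019, 6, 1, 11, 55),  # still inside the allowed delay
}
posted = ["PAY-001"]

overdue = find_unposted(received, posted, now, timedelta(minutes=30))
print(overdue)
```

Here only `PAY-002` is reported: it was received well over 30 minutes ago and has no posting record, while `PAY-003` is still within the allowed window.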
5. 5
Assurance Objectives
AIB Splunk Journey (2017) – Service Insights Pilot
Mobile App & Payments using Splunk ITSI
Critical Payments BAM
Mobile App Splunk ITSI Pilot
Capture Mobile Service Architecture
360 Health View of Business, App & Infra
Acquire Data Sources (App, OS etc.)
Define KPIs & Train using ITSI Machine Learning
Derive Service Health Quality Scores
Prove ITSI Anomaly Detection
Reduce Incidents and Problem MTTR
Proactive monitoring of Mobile App using Splunk ITSI
Business Activity Monitoring of critical Payment Services using Splunk
2015-2017
2017
Time
E2E Integrity for SEPA Payments
Payments Volume Trends
Processing Performance Trends
Automated “4 eyes” Monitoring
Reduced Incidents & MTTR
Grow combined analytics capability to assure digital and payments transformation strategies
2015-2019
E2E Payment Service Performance
Reduce Operational Risk
Reduce Transformational Risk
Reduce Regulatory Risks
Become Proactive & Predictive
Improve Customer Outcomes
6. 6
Pilot with AIB Mobile Payments Journey using Splunk ITSI
Step 1 & 2. Understand the Service / Define the Service Topology & KPIs
SME workshops to decompose the service and capture “metrics that matter” as KPIs.
Highly agile process over 2-4 weeks depending on data availability.
[Service topology diagram]
Mobile Payment Service Health
Tech. Health, Cust. Exp. Health
Login, Auth., Load A/Cs, Pay
Resp. Time, # Errors, Vol. Traffic
Network, APIs, Payment Engine, Database
ITSI ML learns normal data patterns
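One way to picture how KPI scores might roll up into a service health score is a weighted average over a small tree. The sketch below is an assumption for illustration and is not ITSI's actual health score calculation: the function `health_score`, the KPI scores and the weights are all hypothetical.

```python
def health_score(kpis):
    """Weighted average of scores on a 0-100 scale.

    kpis: dict of name -> (score, weight). Weights are hypothetical
    importance values, not values taken from the deck.
    """
    total_weight = sum(weight for _, weight in kpis.values())
    if total_weight == 0:
        raise ValueError("at least one KPI needs a non-zero weight")
    return sum(score * weight for score, weight in kpis.values()) / total_weight


# Hypothetical KPI scores for the two branches of the service tree.
customer_experience = {"login": (100, 5), "load_accounts": (80, 3), "pay": (60, 2)}
technical = {"network": (90, 1), "apis": (70, 1)}

cust_score = health_score(customer_experience)
tech_score = health_score(technical)

# The parent service score combines the two branch scores equally.
service_score = health_score({
    "customer_experience": (cust_score, 1),
    "technical": (tech_score, 1),
})
print(cust_score, tech_score, service_score)
```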
7. 7
Developing the solution using Splunk ITSI
Step 3 & 4. Build the Glass Tables & Tune the KPIs
Troubleshooting Deep Dive
Customer Service Health “Live Glass Table” View
“3 Clicks to Root Cause”
8. 8
Pilot Solution to evaluate Effectiveness - Anomaly Detection
Step 5. Splunk ITSI Glass Table detects serious issue with the Mobile Channel
Splunk ITSI has the ability to look back at previous incidents
9. 9
Test the Effectiveness of KPIs – Deep Dive Views
Step 5/6. Demonstrate KPI Effectiveness to detect Anomalies ahead of Incidents
Incident time
Compare to 1 week previous
Proposed ITSI Response Time KPI detected anomaly 3.5 hours ahead of the Incident
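The "compare to 1 week previous" view can be approximated with a simple week-over-week check. The sketch below is illustrative only and is not the algorithm ITSI uses: the function `first_anomaly`, the 1.5x tolerance and the sample response times are assumptions.

```python
def first_anomaly(current, previous_week, tolerance=1.5):
    """Return the index of the first sample that exceeds tolerance times
    the value in the same time slot one week earlier, or None if none does."""
    for i, (now_val, then_val) in enumerate(zip(current, previous_week)):
        if now_val > tolerance * then_val:
            return i
    return None


# Hypothetical response times in ms, one sample per time slot.
previous_week = [200, 210, 205, 198, 202, 207]
current = [205, 208, 320, 410, 650, 900]

idx = first_anomaly(current, previous_week)
print(idx)
```

With this sample data the check first fires at index 2, several slots before the values peak, which is the kind of early warning the slide describes.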
10. 10
2018-2019
Mobile Service Architecture
360 view of Business, App & Infra
Acquired Data Sources (App, OS)
Trained ITSI ML Models
“Live” Health Glass Tables
Reduced Incidents & MTTR
Anomalies Detected
Assurance Objectives
AIB Splunk Journey (2018-2019) – Roll Out Service Insights
Mobile App ITSI Success
Business, App & Infra “Live” service health
Channels & Payments Services
Trusted 24x7 Service “Radars”
Reduced Incidents & Problem MTTR
Prevented critical Incidents
ITSM, IT Operations & ADM Support teams
Very Positive Stakeholder feedback
Service Insights Rollout with ITSI
Proactive monitoring of Mobile App using Splunk ITSI
Business Activity Monitoring of Critical Payment Services using Splunk
Predictive Business Service Monitoring
2015-2016
2017
Time
Critical Payments BAM
Integrity for SEPA Payments
Payments Volume Trends
Processing Performance Trends
Automated “4 eyes” Monitoring
Reduced Incidents & MTTR
Grow service intelligence capability to enable real time monitoring of Channels and Payments
2015-2019
E2E Payment Service Performance
Reduce Operational Risk
Reduce Transformational Risk
Reduce Regulatory Risks
Become Proactive & Predictive
Improve Customer Outcomes
11. 11
AIB Splunk ITSI “Live” Service Health Glass Table
Personal Channels, Business Banking, Payment Services, Applications and Infrastructure monitored in Real Time
Friday March 2nd 2018: ITSI ML detects lower than normal Business Payments & Bulk File Volumes ????
12. 12
RED WEATHER WARNING IN IRELAND! – Businesses Closed
AIB ITSI solution can predict a lot of things but not the Weather (Yet!)
13. Timely Business Process & Data Integrity
Automated “4 eyes” Monitoring & Alerting
Improved Incident, Change & Problem
Trusted Health Platform for AI Operations
Reduced SME dependencies
Reduced Risk of missing critical cut off times
Reduced Operational Risks
Reduced Transformational Risks
Improved SLA Performance
Reduced Incidents & MTTR
One minute Change Verification
Business and IT Service alignment
Continually Improving Customer and Service Outcomes
Improved Customer Experience
24x7 Real Time Service Radar
Risk
Improving Customer Outcomes
Process
Customer
Data
Service
Single Source of Truth
Machine Learning & Analytics
Capacity Planning
AIB has realised significant benefits from using Splunk and ITSI to date
The next step in our journey is only beginning
Benefits Realised
14. 14
AIB Splunk Journey (2019-2020) – Next Steps
Payments BAM Splunk
360 Service Insights ITSI
Splunk Enabler for AI Ops
Powerful Enabler for AI Operations
First Payment use cases Live
New Data Sources (MQ) & Business Services
New Splunk Apps (AIOps)
Splunk COE enabled
Improve Prediction times & Seasonality
Predictive Business Service Monitoring using Splunk ITSI
Business Activity Monitoring of Critical Payment Services using core Splunk
Automated Service Intelligence Platform
2015-2017
2017-2018
Time
Splunk/ITSI Business Value Delivered
Grow intelligent automation capability to optimise payments and customer outcomes
2015-2019
Business, App & Infra “Live” service health
Channels & Payments Services
Trusted 24x7 Service “Radars”
Reduced Incidents & Problem MTTR
Prevented critical Incidents
ITSM, IT Operations & ADM Support teams
Positive Stakeholder feedback
Integrity for SEPA Payments
Payments Volume Trends
Processing Performance Trends
Automated “4 eyes” Monitoring
Reduced Incidents & MTTR
2019-2020
17. 17
IT Support Tool
Log File Aggregation
Basic Alerts & Reports
Ad hoc Data Analysis
Standalone Splunk Server
AIB Splunk Infrastructure Journey (2011-2019): Standalone Splunk VM to resilient Multi-Site Splunk Cluster
Log File Aggregation & Analysis using core Splunk
Splunk ITSI app, OS app, H/F and ML components
Splunk Dual Cluster to support critical Splunk BAM Integrity Alerts
Splunk Multi-Site Resilient Cluster
2011-2014
2015-2016
Time
Splunk Infrastructure Maturity
Splunk Enterprise Cluster
Splunk Multi-Site Resilient Cluster
6 Search Heads, 4 Indexers
Powerful Physical Servers
Deployment & Cluster Master
Heavy Forwarder Load Mgmt.
250GB Daily Ingestion
250 Forwarders
Replica Test VM Cluster
Prioritise Scheduled Searches
User Access Control Framework
2018
2017
Splunk Dual Cluster
Splunk Multi-Site Cluster
3 Search Heads, 2 Indexers
Physical Servers
Splunk Enterprise
150GB-200GB Daily Ingestion
120 Forwarders
Standalone Test VM
Splunk ITSI
Splunk ITSI App
Splunk Heavy Forwarder
Linux/Unix OS Add On
Machine Learning Component
Common Information Model
DB Connect
18. 18
Case Study: Platform detects server TCP/IP connections rising above normal values, triggering alerts and a drop in health scores
SI Platform detected an anomaly on a Mobile Server, allowing a resolution that prevented a Channels outage later that evening
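A rough way to express "rising above normal values" is a baseline-plus-spread threshold over recent history. This is a minimal sketch under stated assumptions, not the platform's actual detection logic: the function `is_above_normal`, the factor `k=3.0` and the connection counts are hypothetical.

```python
from statistics import mean, stdev


def is_above_normal(history, value, k=3.0):
    """Return True if value exceeds the historical mean by more than
    k sample standard deviations."""
    baseline = mean(history)
    spread = stdev(history)
    return value > baseline + k * spread


# Hypothetical TCP/IP connection counts from a quiet period.
history = [100, 104, 98, 101, 97, 103, 99, 102]

normal_reading = is_above_normal(history, 105)
rising_reading = is_above_normal(history, 140)
print(normal_reading, rising_reading)
```

A reading of 105 stays inside the normal band for this history, while 140 is flagged, which is the point at which an alert and a health score drop would be raised in the case study.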
19. 19
Oracle Service Performance Metrics captured in Near Real Time for All Hosts, Databases and Services across Oracle Private Cloud
Top SQL statements running in the database, to find out which statement is consuming the most time