HDI Capital Area Local Chapter Meeting, Local Chapter Updates, HDI Corporate Updates and presentation on Finding the "Root" in Root Cause Analysis - A Client Case
2. New! HDI Certification
• HDI announces a new certification: HDI Problem Management Professional
• The first offering will be at HDI 2014 in Orlando.
• Learn more at www.ThinkHDI.com/PM
3. New! Corporate Membership
• Offer your team the ability to benefit from an HDI membership!
• Special pricing for medium to large organizations with twenty-five or more support staff.
Contact your HDI Account Manager to learn more! 800.248.5667
4. HDI Desktop Support Practices & Salary Survey
• The HDI Desktop Support Practices & Salary Survey is now open!
• Contribute to this acclaimed research and gain insight into current desktop support processes, technologies, metrics, staffing, and salaries.
Take the survey | learn more at ThinkHDI.com/DS
5. New! Content to Share!
“Show Me the Value: Support’s Mandate” – Roy Atkinson
SupportWorld: “Moments of Truth: The Future of the Customer Experience” – Charles Araujo
Most Recent Blog: “Are You Keeping the Lights On, or Are You Driving Change?” – Rob Stroud
#HDIStatToday: 15% of technical support tickets are related to supporting mobile devices.
Research Brief: “Mobile Device Support and BYOD: Where Are We Now?”
Current Survey: 4th annual Desktop Support Practices & Salary Survey
Webcast: “Managing Change and Technology Migrations” – Peter Jurhs
Coming up! SupportWorld App debuts in January!
6. HDI Forum Roundtables
Take your networking power to the next level. Join HDI for a Forum Roundtable event!
• During the HDI Forum Roundtables, you will experience a taste of what happens during the HDI Forum meetings.
• The HDI Forum program is designed to encourage, enhance and strengthen the members and their organizations by providing a vision for the future based on today’s actual innovations in support.
Register Today! www.ThinkHDI.com/2014Roundtable
7. HDI 2014 Conference & Expo
• Register by January 24, 2014, and you could save up to $500!
Early Bird Discount: $200
Alumni Discount: $100
Member Discount (Gold and above): $200
Save $500
Register today: www.HDIConference.com
8. Renew Online!
• It’s fast, it’s simple, and you can do it in your pajamas!
• You can renew your membership online up to ninety days before its expiration date through your MyHDI account.
Visit: www.ThinkHDI.com/Renew
Want to renew early? Call the HDI Customer Care Center at 800-248-5667!
9. Not a Member? Join Today!
Become a Local Chapter member for just $75!
This individual local chapter membership is an opportunity to connect, network, and learn in your own backyard. Enjoy benefits like:
• Attend local chapter and vChapter meetings
• Digital subscription to SupportWorld magazine
• Apply for HDI awards
• Access to the HDI Job Board
• Regular e-newsletters and digests
• And much more!
Learn more at www.ThinkHDI.com/Join or by calling 800.248.5667
10. HDI is 25 Years Old!
• Established in 1989 by Ron Muns as the Help Desk Institute.
• 1996 Sold to Ziff Davis
• 1999 Bought back by Ron Muns under the company name Think Service, Inc. (TSI)
• 2000 Merged Help Desk Professional Association into HDI
• 2004 Rebranded to HDI
• 2006 Acquired STI Knowledge (d.b.a. HelpDesk 2000)
• 2008 Sold to UBM and became part of ThinkServices division
• 2010 UBM restructured and HDI joined UBM TechWeb
[Timeline graphic: 1989, 2000, 2002, 2006, 2010, 2012]
11. HDI Conferences Beat Expectations
• HDI Conference – April 1-4, 2014; Orlando, FL
• Fusion Conference – October 19-22, 2014; Washington, DC
HDI has signed with itSMF-USA to deliver FUSION through 2016.
16. HDI Capital Area Sponsors
• Diamond – The MIL Corporation
• Platinum Plus - LanDesk
• Platinum
– Beyond20
– EasyVista
• Gold
– Robert Half Technology (Global Sponsor)
– Bomgar
– IssueTrak
– Cherwell
• Silver
– Service Now
– Time Warner Cable
• Web/Event
– Itinvolve
– RemedyForce
– ReACT
– StrataCom
– TechnoLava
– Artisys
17. Call for Capital Area Board Nominations
• Nominations due by January 27
– VP Programs
– VP Membership
– VP Finance
– VP Special Programs & Vendor Liaison
18. Next Meeting
• Who Moved My Service Desk? February 19, 12:00 pm - 2:00 pm, Marc Fey, Cherwell Software
• We typically meet the 3rd Wednesday of the month and most meetings are free
• Visit www.hdicapitalarea.com to register
20. FINDING THE “ROOT” IN ROOT CAUSE ANALYSIS
– A CLIENT CASE
MATT FOURIE
THINKING DIMENSIONS
21. Some of our recent clients...
Barclays IT
Macquarie ITG
Unisys
Woolworths IT
SGX IT
SITA Global
BT Financial
McDonalds IT
Queensland Police IT
DBS IT
Lockheed Martin Space Systems
• Thinking Dimensions International - operating KEPNERandFOURIE RCA company initiatives for the last 25 years
• Specializes in RCA Methodology for IT Incidents, Problems and Projects
22. AGENDA
“Most incident investigators ask the wrong questions, so do not change your people but change the questions they are asking”
• Introduction
• Introduce Client Situation
• The Three Skill Sets
• The Common Process and Language
• Technical Cause Analysis
• Root Cause Components
• Client outcomes
• Questions & answers
23. Investigation Info
“It takes a company without a formal and
effective Root Cause Analysis culture up
to 23 days to repair service incidents”
Aberdeen Group – Boston 2010
25. Client Case - PM Improvement
International Investment Bank’s IT PM Division, 2009-2012
• Lack of Stakeholder commitment
• Poor management of information
• Working with poor quality information
• Poor problem investigation support
• Not really solving problems permanently
26. Actions taken…
• For P1 & 2 Incidents a PM was assigned immediately
• All Incident & Problem Management Staff trained in some
common process
• Embedded the tools and templates into existing process to
make handover seamless
• Used in-house facilitators to lead PM teams in a very strong
and decisive way
28. PAST | NOW | FUTURE
STANDARD
TECHNICAL CAUSE ANALYSIS
INCIDENT RESTORATION
ROOT CAUSE ANALYSIS
30. Common process
• Step 1: Identify Problem Situation
• Step 2: Gather Incident Information
• Step 3: Analyse Incident Information
• Step 4: Determine Conclusion
Everybody uses the same process for finding causes and solutions
The process determines which questions to ask at each step for each type of incident investigation approach
Designed for minimalistic information combined with a good focus to provide quick answers
32. Extreme Focus With “Specificity”
Object | Fault
Servers | Not communicating
Data | Not transferred; sent but not received by receiving servers
Data for Large Outlets | Not received
Sales turnover numbers for Large Outlets | Not received
Specificity Rules
• One object one fault
• Single-minded & simplistic
• Highly focused
• Must find the correct entry point
• Ask a question – expect an answer
33. Extreme Focus With “Specificity”
Object | Fault
Murex Chip | Produces latency
Transactions | Queuing up
TX’s | Taking longer than 100 milliseconds
Transactions | Takes longer to process
“Futures” transactions | Takes longer to process
Specificity Rules
• One object one fault
• Single-minded & simplistic
• Highly focused
• Must find the correct entry point
• Ask a question – expect an answer
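The narrowing from a broad statement to a specific one can be sketched in code. This is only an illustration: the `Statement` record and the `entry_point` helper are hypothetical names, and the assumption that the slide lists its statements from most general to most specific is a reading of the slide, not something it states.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Statement:
    """One incident statement: exactly one object and one fault."""
    obj: str
    fault: str

    def __str__(self) -> str:
        return f"{self.obj}: {self.fault}"


# Statements from the Murex example, assumed to run from general to specific.
ladder = [
    Statement("Murex Chip", "Produces latency"),
    Statement("Transactions", "Queuing up"),
    Statement("TX's", "Taking longer than 100 milliseconds"),
    Statement("Transactions", "Takes longer to process"),
    Statement('"Futures" transactions', "Takes longer to process"),
]


def entry_point(statements: list[Statement]) -> Statement:
    """Return the most specific statement, assumed here to be the last one."""
    if not statements:
        raise ValueError("need at least one statement")
    return statements[-1]


print(entry_point(ladder))
```

Keeping object and fault as separate fields makes the "one object one fault" rule visible in the data itself.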
35. Thinking more specifically…
Incident Statement: G-Force System Freezing
Technical Cause: G-Force SQL DB thread count exceeding
Root Cause: High volume; Too many users allowed access; G-Force Vendor program not implemented closing out an untested threads program
36. Creating Intelligence
DATA | INFORMATION | KNOWLEDGE
IS | BUT NOT | WHY NOT
[Example table, entries in extracted order: APAC users; Freezing; Different routing; SSL handshake; USA, UK; Volume?; Started Oct 1; Before; ADSL lines; After 4pm; New passwords; Internet Banking; Slow; Continuous; Intranet Banking]
Unexpected Outcomes
• “BUT NOT” clarifies the facts
• Creates a curious “contrast”
• Looking at answers at a “granular level”
• Stimulates deductive reasoning
38. “Minimalistic principle”..
• Only need to analyse the information that
would be relevant to the incident
• Worked questions within a customised
“factor analysis” framework
• Get a quick factual “snapshot” of the
characteristics of the incident and then
use SME experience and gut feel to
explain the snapshot
• Test SME inputs against logic of snapshot
“Too much information
can cause confusion.
The key is to get all the
relevant information only
and that is normally
substantially less than
gathering all the
Information.”
Innovation – the FreeZone
thinking experience.
by Kepner & Fourie
39. Snapshot info for causes
Columns: IS | BUT NOT | WHY NOT
OBJECT – What object and which other object(s) not?
FAULT – What fault and which other typical faults not?
USERS – Who has the problem and who does not?
WHERE – Where are these users and where could they have been but are not?
TIMING – When did it happen first time and when not?
PATTERN – What is the pattern of faults and what could it have been but is not?
CYCLE – In which cycle does the problem occur and in which cycle does it not occur?
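One way to hold such a snapshot is a small record per dimension. The sketch below is an assumption about how the worksheet could be modelled, not the CauseWise tool itself; `Row`, `new_snapshot` and `open_questions` are hypothetical names, and the sample values are taken from the Fireburst example later in the deck.

```python
from dataclasses import dataclass, field

DIMENSIONS = ("OBJECT", "FAULT", "USERS", "WHERE", "TIMING", "PATTERN", "CYCLE")


@dataclass
class Row:
    """One worksheet line: what IS, what is NOT, and ideas on WHY NOT."""
    is_: str = ""
    but_not: str = ""
    why_not: list[str] = field(default_factory=list)


def new_snapshot() -> dict[str, Row]:
    """Create an empty snapshot with one row per dimension."""
    return {dimension: Row() for dimension in DIMENSIONS}


def open_questions(snapshot: dict[str, Row]) -> list[str]:
    """List the dimensions where IS or BUT NOT is still unanswered."""
    return [d for d, row in snapshot.items() if not row.is_ or not row.but_not]


snapshot = new_snapshot()
snapshot["OBJECT"] = Row(
    "Fireburst V2.0 connection",
    "E-Express, Mango connections",
    ["F/B upgrade from V1 to V2", "Poor testing issue"],
)
snapshot["FAULT"] = Row(
    "dropping", "Freezing, slow", ["Time out settings", "configuration of drivers"]
)

print(open_questions(snapshot))
```

Listing the unanswered dimensions keeps the facilitator on the next question to ask rather than on collecting every available fact.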
40. CauseWise sample
DIMENSION | IS | BUT NOT | WHY NOT
Object
Fault
Loc of Object
Timing
Pattern
Life Cycle Phase of Work
Possible Causes & Testing
41. CauseWise sample
DIMENSION | IS | BUT NOT | WHY NOT
Object | Fireburst V2.0 connection
Fault
Loc of Object
Timing
Pattern
Life Cycle Phase of Work
Possible Causes & Testing
42. DIMENSION | IS | BUT NOT | WHY NOT
Object | Fireburst V2.0 connection | E-Express, Mango connections
Fault
Loc of Object
Timing
Pattern
Life Cycle Phase of Work
Possible Causes & Testing
43. DIMENSION | IS | BUT NOT | WHY NOT
Object | Fireburst V2.0 connection | E-Express, Mango connections | F/B upgrade from V1 to V2, Poor testing issue
Fault
Loc of Object
Timing
Pattern
Life Cycle Phase of Work
Possible Causes & Testing
44. DIMENSION | IS | BUT NOT | WHY NOT
Object | Fireburst V2.0 connection | E-Express, Mango connections | F/B upgrade from V1 to V2, Poor testing issue
Fault | dropping | Freezing, slow | Time out settings, configuration of drivers
Loc of Object
Timing
Pattern
Life Cycle Phase
Possible Causes & Testing
45. DIMENSION | IS | BUT NOT | WHY NOT
Object | Fireburst V2.0 connection | E-Express, Mango connections | F/B upgrade from V1 to V2, Poor testing issue
Fault | dropping | Freezing, slow | Time out settings, configuration of drivers
Loc of Object | ANZ, USA, UK | Asia | LAN, Proxy server issues, F/Wall rules
Timing
Pattern
Life Cycle Phase
Possible Causes & Testing
46. DIMENSION | IS | BUT NOT | WHY NOT
Object | Fireburst V2.0 connection | E-Express, Mango connections | F/B upgrade from V1 to V2, Poor testing issue
Fault | dropping | Freezing, slow | Time out settings, configuration of drivers
Loc of Object | ANZ, USA, UK | Asia | LAN, Proxy server issues, F/Wall rules
Timing | Monday, Sept 2nd with SOB | Any time earlier than Sept 2nd | Java upgrade, Netscape upgrade
Pattern
Life Cycle Phase
Possible Causes & Testing
47. DIMENSION | IS | BUT NOT | WHY NOT
Object | Fireburst V2.0 connection | E-Express, Mango connections | F/B upgrade from V1 to V2, Poor testing issue
Fault | dropping | Freezing, slow | Time out settings, configuration of drivers
Loc of Object | ANZ, USA, UK | Asia | LAN, Proxy server issues, F/Wall rules
Timing | Monday, Sept 2nd with SOB | Any time earlier than Sept 2nd | Java upgrade, Netscape upgrade
Pattern | Continuous | Sporadic, Periodic | Don’t know
Life Cycle Phase
Possible Causes & Testing
48. DIMENSION | IS | BUT NOT | WHY NOT
Object | Fireburst V2.0 connection | E-Express, Mango connections | F/B upgrade from V1 to V2, Poor testing issue
Fault | dropping | Freezing, slow | Time out settings, configuration of drivers
Loc of Object | ANZ, USA, UK | Asia | LAN, Proxy server issues, F/Wall rules
Timing | Monday, Sept 2nd with SOB | Any time earlier than Sept 2nd | Java upgrade, Netscape upgrade
Pattern | Continuous | Sporadic, Periodic | Don’t know
Life Cycle | When doing a transaction | “x” time into transaction | Operator error, Code error on a specific page
Possible Causes & Testing
50. DIMENSION | IS | BUT NOT | WHY NOT
Object | Fireburst V2.0 connection | E-Express, Mango connections | F/B upgrade from V1 to V2, Poor testing issue
Fault | dropping | Freezing, slow | Time out settings, configuration of drivers
Loc of Object | ANZ, USA, UK | Asia | LAN, Proxy server issues, F/Wall rules
Timing | Monday, Sept 2nd with SOB | Any time earlier than Sept 2nd | Java upgrade, Netscape upgrade
Pattern | Continuous | Sporadic, Periodic | Don’t know
Life Cycle | When doing a transaction | “x” time into transaction | Operator error, Code error on a specific page
Possible Causes & Testing
1. Proxy server tampered with during the Java upgrade on the LAN
51. DIMENSION | IS | BUT NOT | WHY NOT
Object | Fireburst V2.0 connection | E-Express, Mango connections | F/B upgrade from V1 to V2, Poor testing issue
Fault | dropping | Freezing, slow | Time out settings, configuration of drivers
Loc of Object | ANZ, USA, UK | Asia | LAN, Proxy server issues, F/Wall rules
Timing | Monday, Sept 2nd with SOB | Any time earlier than Sept 2nd | Java upgrade, Netscape upgrade
Pattern | Continuous | Sporadic, Periodic | Don’t know
Life Cycle | When doing a transaction | “x” time into transaction | Operator error, Code error on a specific page
Possible Causes & Testing
1. Proxy server tampered with during the Java upgrade on the LAN
2. Java upgrade caused driver incompatibility with Fireburst website V2.0
52. DIMENSION | IS | BUT NOT | WHY NOT
Object | Fireburst V2.0 connection | E-Express, Mango connections | F/B upgrade from V1 to V2, Poor testing issue
Fault | dropping | Freezing, slow | Time out settings, configuration of drivers
Loc of Object | ANZ, USA, UK | Asia | LAN, Proxy server issues, F/Wall rules
Timing | Monday, Sept 2nd with SOB | Any time earlier than Sept 2nd | Java upgrade, Netscape upgrade
Pattern | Continuous | Sporadic, Periodic | Don’t know
Life Cycle | When doing a transaction | “x” time into transaction | Operator error, Code error on a specific page
Possible Causes & Testing
1. Proxy server tampered with during the Java upgrade on the LAN
2. Java upgrade caused driver incompatibility with Fireburst website V2.0
3. Netscape upgrade caused driver incompatibility with Fireburst website V2.0
53. DIMENSION | IS | BUT NOT | WHY NOT
Object | Fireburst V2.0 connection | E-Express, Mango connections | F/B upgrade from V1 to V2, Poor testing issue
Fault | dropping | Freezing, slow | Time out settings, configuration of drivers
Loc of Object | ANZ, USA, UK | Asia | LAN, Proxy server issues, F/Wall rules
Timing | Monday, Sept 2nd with SOB | Any time earlier than Sept 2nd | Java upgrade, Netscape upgrade
Pattern | Continuous | Sporadic, Periodic | Don’t know
Life Cycle | When doing a transaction | “x” time into transaction | Operator error, Code error on a specific page
Possible Causes & Testing
1. Proxy server tampered with during the Java upgrade on the LAN
2. Java upgrade caused driver incompatibility with Fireburst website V2.0
3. Netscape upgrade caused driver incompatibility with Fireburst website V2.0
Test marks (in extracted order): X
54. DIMENSION | IS | BUT NOT | WHY NOT
Object | Fireburst V2.0 connection | E-Express, Mango connections | F/B upgrade from V1 to V2, Poor testing issue
Fault | dropping | Freezing, slow | Time out settings, configuration of drivers
Loc of Object | ANZ, USA, UK | Asia | LAN, Proxy server issues, F/Wall rules
Timing | Monday, Sept 2nd with SOB | Any time earlier than Sept 2nd | Java upgrade, Netscape upgrade
Pattern | Continuous | Sporadic, Periodic | Don’t know
Life Cycle | When doing a transaction | “x” time into transaction | Operator error, Code error on a specific page
Possible Causes & Testing
1. Proxy server tampered with during the Java upgrade on the LAN
2. Java upgrade caused driver incompatibility with Fireburst website V2.0
3. Netscape upgrade caused driver incompatibility with Fireburst website V2.0
Test marks (in extracted order): X, √, √, X
55. DIMENSION | IS | BUT NOT | WHY NOT
Object | Fireburst V2.0 connection | E-Express, Mango connections | F/B upgrade from V1 to V2, Poor testing issue
Fault | dropping | Freezing, slow | Time out settings, configuration of drivers
Loc of Object | ANZ, USA, UK | Asia | LAN, Proxy server issues, F/Wall rules
Timing | Monday, Sept 2nd with SOB | Any time earlier than Sept 2nd | Java upgrade, Netscape upgrade
Pattern | Continuous | Sporadic, Periodic | Don’t know
Life Cycle | When doing a transaction | “x” time into transaction | Operator error, Code error on a specific page
Possible Causes & Testing
1. Proxy server tampered with during the Java upgrade on the LAN
2. Java upgrade caused driver incompatibility with Fireburst website V2.0
3. Netscape upgrade caused driver incompatibility with Fireburst website V2.0
Test marks (in extracted order): X, √, √, √, √, X, A1, √, √, √, √
A1 - Only if the staff in Asia did not upgrade to Netscape
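Testing each possible cause against the snapshot can be sketched as a simple elimination. Everything below is illustrative: the marks are HYPOTHETICAL (the slide's actual mark positions are not reproduced here), `surviving_causes` is a made-up helper, and ranking survivors by fewest assumptions is an assumed reading of the method rather than something the deck states.

```python
from typing import Union

# True = the cause explains the fact, False = it contradicts the fact,
# a string = it explains the fact only under that stated assumption.
Mark = Union[bool, str]

# HYPOTHETICAL marks for illustration only.
tests: dict[str, dict[str, Mark]] = {
    "Proxy server tampered with during the Java upgrade": {
        "Object": True, "Fault": False, "Loc of Object": True, "Timing": True,
    },
    "Java upgrade caused driver incompatibility": {
        "Object": True, "Fault": True, "Loc of Object": False, "Timing": True,
    },
    "Netscape upgrade caused driver incompatibility": {
        "Object": True, "Fault": True,
        "Loc of Object": "Only if the staff in Asia did not upgrade to Netscape",
        "Timing": True,
    },
}


def surviving_causes(tests: dict[str, dict[str, Mark]]) -> list[tuple[str, list[str]]]:
    """Drop causes contradicted by any fact; rank the rest by fewest assumptions."""
    ranked = []
    for cause, marks in tests.items():
        if any(mark is False for mark in marks.values()):
            continue  # one contradicted fact is enough to eliminate a cause
        assumptions = [mark for mark in marks.values() if isinstance(mark, str)]
        ranked.append((len(assumptions), cause, assumptions))
    ranked.sort(key=lambda item: item[0])
    return [(cause, assumptions) for _, cause, assumptions in ranked]


for cause, assumptions in surviving_causes(tests):
    print(cause, "| verify:", assumptions or "nothing")
```

The assumptions that remain, such as A1, become the short list of facts to verify before accepting the cause.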
57. Stair stepping…
3. Netscape upgrade caused
driver incompatibility with
Fireburst website V2.0
WHY?
Installed the wrong driver
WHY?
Driver Specs not correct
WHY?
Don’t Know
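The stair-stepping chain can be sketched as repeated lookups that stop when nobody knows the answer. The mapping below uses the answers from the slide; the `stair_step` function and its `max_depth` guard are hypothetical additions for illustration.

```python
from typing import Optional

# Each statement maps to the answer given to "WHY?"; None means "Don't know".
answers: dict[str, Optional[str]] = {
    "Netscape upgrade caused driver incompatibility with Fireburst website V2.0":
        "Installed the wrong driver",
    "Installed the wrong driver": "Driver Specs not correct",
    "Driver Specs not correct": None,
}


def stair_step(start: str, answers: dict[str, Optional[str]], max_depth: int = 10) -> list[str]:
    """Follow the WHY? chain from a technical cause until no answer is known."""
    chain = [start]
    current = start
    for _ in range(max_depth):  # guard against circular answers
        next_answer = answers.get(current)
        if next_answer is None:
            break
        chain.append(next_answer)
        current = next_answer
    return chain


chain = stair_step(
    "Netscape upgrade caused driver incompatibility with Fireburst website V2.0",
    answers,
)
print(" -> WHY? -> ".join(chain))
```

The last element of the chain is the point where technical questioning runs out, which is where a root cause investigation would take over.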
58. Root Cause Analysis Components
COMPONENT | CAUSAL FACTORS
Decision Making | DM Process and Collaboration for inputs
Implementation issues | Resourcing Issues and Scope & Definition of project
Standard Operating Procedures | Applicability of SOP, Awareness of SOP and Documentation
Management | Management of Work and Management of Staff
Measurement | Key Performance Indicators (KPI’s) and Roles & Responsibilities
Support | Internal Support and External Vendor support
Communications | Clarity of communications and verbal instructions
Work Environment | Task Interference and consequences of doing the task
Skills | Complexity needed and applicability of person to task
Testing Practices | Testing Procedures and Testing Requirements
61. Snapshot info for Solutions
Four Question Drill
• What are the results you want to achieve with this solution?
• What are the existing problems you would like to remove with this solution?
• What are the potential risks you would like to avoid with this solution?
• What money and time do you have or do you need to preserve? What are the restrictions out of your control?
Key Requirements
1 Best data transfer rate
2 No loss of data
3 Improve system up-time
4 Improve trickle & purge
5 Reduce DR time
6 Capex < $2m
7 Implement < 3 mos
62. Snapshot info for Solutions
Questions for Actions
• What action can we take today and implement tomorrow that would meet the 1st requirement? (repeat)
• What action can we take today and implement tomorrow that would meet the 2nd requirement? (repeat)
• Looking at all the actions, which one(s) would satisfy the 1st requirement best?
• Which actions have nothing to do with the 1st requirement?
• Looking at all the actions, which one(s) would satisfy the 2nd requirement best?
• Repeat for each requirement…
Key Requirements
1 Best data transfer rate
2 No loss of data
3 Improve system up-time
4 Improve trickle & purge
5 Reduce DR time
6 Capex < $2m
7 Implement < 3 mos
63. Snapshot info for Solutions
Questions for Actions
• What action can we take today and implement tomorrow that would meet the 1st requirement? (repeat)
• What action can we take today and implement tomorrow that would meet the 2nd requirement? (repeat)
• Looking at all the actions, which one(s) would satisfy the 1st requirement best?
• Which actions have nothing to do with the 1st requirement?
• Looking at all the actions, which one(s) would satisfy the 2nd requirement best?
• Repeat for each requirement…
Key Requirements | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8
1 Best data transfer rate
2 No loss of data
3 Improve system up-time
4 Improve trickle & purge
5 Reduce DR time
6 Capex < $2m
7 Implement < 3 mos
64. Snapshot info for Solutions
Questions for Actions
• What action can we take today and implement tomorrow that would meet the 1st requirement? (repeat)
• What action can we take today and implement tomorrow that would meet the 2nd requirement? (repeat)
• Looking at all the actions, which one(s) would satisfy the 1st requirement best?
• Which actions have nothing to do with the 1st requirement?
• Looking at all the actions, which one(s) would satisfy the 2nd requirement best?
• Repeat for each requirement…
Key Requirements | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8
1 Best data transfer rate | 3 | 3
2 No loss of data
3 Improve system up-time
4 Improve trickle & purge
5 Reduce DR time
6 Capex < $2m
7 Implement < 3 mos
65. Snapshot info for Solutions
Questions for Actions
• What action can we take today and implement tomorrow that would meet the 1st requirement? (repeat)
• What action can we take today and implement tomorrow that would meet the 2nd requirement? (repeat)
• Looking at all the actions, which one(s) would satisfy the 1st requirement best?
• Which actions have nothing to do with the 1st requirement?
• Looking at all the actions, which one(s) would satisfy the 2nd requirement best?
• Repeat for each requirement…
Key Requirements | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8
1 Best data transfer rate | 3 | 3 | 0 | | 0 | 0 | 0 | 0
2 No loss of data
3 Improve system up-time
4 Improve trickle & purge
5 Reduce DR time
6 Capex < $2m
7 Implement < 3 mos
66. Snapshot info for Solutions
Questions for Actions
• What action can we take today and implement tomorrow that would meet the 1st requirement? (repeat)
• What action can we take today and implement tomorrow that would meet the 2nd requirement? (repeat)
• Looking at all the actions, which one(s) would satisfy the 1st requirement best?
• Which actions have nothing to do with the 1st requirement?
• Looking at all the actions, which one(s) would satisfy the 2nd requirement best?
• Repeat for each requirement…
Key Requirements | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8
1 Best data transfer rate | 3 | 3 | 0 | 1 | 0 | 0 | 0 | 0
2 No loss of data
3 Improve system up-time
4 Improve trickle & purge
5 Reduce DR time
6 Capex < $2m
7 Implement < 3 mos
67. Snapshot info for Solutions
Questions for Actions
• What action can we take today and implement tomorrow that would meet the 1st requirement? (repeat)
• What action can we take today and implement tomorrow that would meet the 2nd requirement? (repeat)
• Looking at all the actions, which one(s) would satisfy the 1st requirement best?
• Which actions have nothing to do with the 1st requirement?
• Looking at all the actions, which one(s) would satisfy the 2nd requirement best?
• Repeat for each requirement…
Key Requirements | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8
1 Best data transfer rate | 3 | 3 | 0 | 1 | 0 | 0 | 0 | 0
2 No loss of data | 0 | 3 | 3 | 0 | 0 | 1 | 2 | 2
3 Improve system up-time | 1 | 2 | 1 | 0 | 3 | 1 | 2 | 1
4 Improve trickle & purge | 2 | 2 | 3 | 1 | 0 | 0 | 0 | 0
5 Reduce DR time | 1 | 1 | 3 | 2 | 1 | 1 | 3 | 1
6 Capex < $2m | 3 | 1 | 2 | 2 | 1 | 0 | 3 | 1
7 Implement < 3 mos | 0 | 3 | 0 | 0 | 0 | 1 | 3 | 3
68. Snapshot info for Solutions
Develop the new solution
• Circle the best performing areas to give you a visual impact of best performing actions.
Key Requirements | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8
1 Best data transfer rate | 3 | 3 | 0 | 1 | 0 | 0 | 0 | 0
2 No loss of data | 0 | 3 | 3 | 0 | 0 | 1 | 2 | 2
3 Improve system up-time | 1 | 2 | 1 | 0 | 3 | 1 | 2 | 1
4 Improve trickle & purge | 2 | 2 | 3 | 1 | 0 | 0 | 0 | 0
5 Reduce DR time | 1 | 1 | 3 | 2 | 1 | 1 | 3 | 1
6 Capex < $2m | 3 | 1 | 2 | 2 | 1 | 0 | 3 | 1
7 Implement < 3 mos | 0 | 3 | 0 | 0 | 0 | 1 | 3 | 3
69. Snapshot info for Solutions
Develop the new solution
• Circle the best performing areas to give you a visual impact of best performing actions.
• ASK: Which combination of actions would give you at least one [3] for each requirement? Look for at least 3-4 actions to combine
• Lastly, which action(s) do we have to add to the mix to make the suggested solution even more effective?
Key Requirements | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8
1 Best data transfer rate | 3 | 3 | 0 | 1 | 0 | 0 | 0 | 0
2 No loss of data | 0 | 3 | 3 | 0 | 0 | 1 | 2 | 2
3 Improve system up-time | 1 | 2 | 1 | 0 | 3 | 1 | 2 | 1
4 Improve trickle & purge | 2 | 2 | 3 | 1 | 0 | 0 | 0 | 0
5 Reduce DR time | 1 | 1 | 3 | 2 | 1 | 1 | 3 | 1
6 Capex < $2m | 3 | 1 | 2 | 2 | 1 | 0 | 3 | 1
7 Implement < 3 mos | 0 | 3 | 0 | 0 | 0 | 1 | 3 | 3
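The ASK step (find a combination of actions that gives at least one [3] for each requirement) can be checked mechanically. The sketch below uses the scores from the slide and assumes the columns 1 to 8 are the candidate actions; `covering_combinations` is a hypothetical helper, and a brute-force search is used only because the matrix is small.

```python
from itertools import combinations

# One row per requirement, one column per action (1-8), scores as on the slide.
scores = {
    "Best data transfer rate": [3, 3, 0, 1, 0, 0, 0, 0],
    "No loss of data":         [0, 3, 3, 0, 0, 1, 2, 2],
    "Improve system up-time":  [1, 2, 1, 0, 3, 1, 2, 1],
    "Improve trickle & purge": [2, 2, 3, 1, 0, 0, 0, 0],
    "Reduce DR time":          [1, 1, 3, 2, 1, 1, 3, 1],
    "Capex < $2m":             [3, 1, 2, 2, 1, 0, 3, 1],
    "Implement < 3 mos":       [0, 3, 0, 0, 0, 1, 3, 3],
}


def covering_combinations(scores: dict[str, list[int]], top: int = 3) -> list[tuple[int, ...]]:
    """Return the smallest action sets that score `top` on every requirement."""
    n_actions = len(next(iter(scores.values())))
    for size in range(1, n_actions + 1):
        found = [
            combo
            for combo in combinations(range(1, n_actions + 1), size)
            if all(any(row[a - 1] == top for a in combo) for row in scores.values())
        ]
        if found:
            return found  # stop at the smallest size that works
    return []


print(covering_combinations(scores))
# -> [(1, 2, 3, 5), (1, 3, 5, 7), (1, 3, 5, 8), (2, 3, 5, 7)]
```

With these scores no set of three actions covers every requirement, which is consistent with the slide's advice to look for three to four actions; actions 3 and 5 appear in every result because they are the only ones scoring 3 on "Improve trickle & purge" and "Improve system up-time".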
70. Snapshot info for Solutions
The Solution!
• Which combination of actions makes the best sense and would provide the best chance of success?
• Check whether your suggested solution would satisfy all the requirements. Also check impact on implementation costs
• How would you suggest we implement these actions?
Key Requirements | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8
1 Best data transfer rate | 3 | 3 | 0 | 1 | 0 | 0 | 0 | 0
2 No loss of data | 0 | 3 | 3 | 0 | 0 | 1 | 2 | 2
3 Improve system up-time | 1 | 2 | 1 | 0 | 3 | 1 | 2 | 1
4 Improve trickle & purge | 2 | 2 | 3 | 1 | 0 | 0 | 0 | 0
5 Reduce DR time | 1 | 1 | 3 | 2 | 1 | 1 | 3 | 1
6 Capex < $2m | 3 | 1 | 2 | 2 | 1 | 0 | 3 | 1
7 Implement < 3 mos | 0 | 3 | 0 | 0 | 0 | 1 | 3 | 3
72. Application Performance results
Our client’s systems availability went from 76% at the end of 2009 to 88% at the end of 2010 and eventually to 95% at the end of 2012, a gain of 19 percentage points.
[Chart: Mean-time-to-repair, Improvement m-t-t-r and Availability for 2009, 2010 and 2012; vertical axis 0 to 100]
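The availability gain can be read two ways, and a short calculation keeps them apart. This is a worked illustration using only the slide's 76% and 95% figures; the variable names are arbitrary.

```python
availability_2009 = 76  # percent, end of 2009
availability_2012 = 95  # percent, end of 2012

# Absolute change, in percentage points.
gain_points = availability_2012 - availability_2009

# Relative change, as a percentage of the 2009 level.
gain_relative = gain_points / availability_2009 * 100

print(f"{gain_points} percentage points, {gain_relative:.0f}% relative improvement")
```

The slide's 19 is the absolute figure; measured against the 2009 baseline, the same change is a 25% relative improvement.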
73. Improvement in escalations
In our client’s case:
• P2 to P1 escalations dropped by 38%
• Recurring incidents dropped by 21%
• P3 to P2 escalations dropped by 19%
[Chart: Sev 3 to Sev 2, Sev 2 to Sev 1, Recurring problems, Vendor Interventions; vertical axis 0 to 90]
74. Lessons learned..
• Most of the recurring incidents and problems are caused by “out of date procedures” and lack of proper documentation
• TCA is a “mental orientation” which people have to get trained in – “does not come with experience”
• IT professionals need a “thinking approach” that could be applied in most situations
• Rules of Engagement to become a standing order
• Encourage use in all incident investigation meetings – ask for the paperwork/evidence
• Sponsors continuous TCA/RCA training
• Regular email communications to publish successes