Stuart Rance: Defining availability for an IT service
Stuart Rance
1.
Defining availability for an IT service
Stuart Rance / November 2012
Twitter: @StuartRance
Email: stuart.rance@hp.com
2.
Agenda
  Service Warranty
  Traditional view of Availability
  End-to-end services and SLAs
  Outage Frequency and Duration
  Number of users affected
  Critical business functions
  Poor performance
  Planned downtime
  Measurement periods
  How to measure availability
© Copyright 2012 Hewlett-Packard Development Company, L.P. The information contained herein is subject to change without notice.
3.
Service Warranty
4.
Service Value Comes From…
  Service Utility
    What does the service do?
    Functional requirements
    Features, inputs, outputs…
    “fit for purpose”
  Service Warranty
    How well does the service do it?
    Non-functional requirements
    Capacity, performance, availability, security, continuity…
    “fit for use”
5.
Service Warranty and Risks
[Chart: risks plotted by impact (low to high) against frequency (low to high). Risks in the order shown, starting at the high-impact end:]
  natural disaster (fire, flood, adverse weather)
  man-made disaster (terrorism, malicious damage)
  security breach (hacker)
  denial of service attack
  virus attack
  internal security/fraud
  insufficient capacity
  data corruption
  configuration issues
  software failure
  power/network failure
  hardware failure
  application error
  planned downtime
6.
Service Warranty and Risks
[Chart: the same risks plotted by impact against frequency, now grouped under warranty areas:]
  Continuity: natural disaster (fire, flood, adverse weather); man-made disaster (terrorism, malicious damage)
  Security: security breach (hacker); denial of service attack; virus attack; internal security/fraud
  Capacity: insufficient capacity
  Availability: data corruption; configuration issues; software failure; power/network failure; hardware failure; application error; planned downtime
7.
Traditional View of Availability
8.
Traditional View of Availability
  Percentage Availability   Annual Downtime
  99%                       87.6 hours (3½ days)
  99.5%                     43.8 hours
  99.9%                     8.8 hours
  99.95%                    4.4 hours
  99.99%                    53 minutes
  99.999%                   5.3 minutes
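The downtime figures in this table can be reproduced with a few lines of arithmetic. The sketch below assumes a 24x7 agreed service time over a 365-day year (8,760 hours); the function name is illustrative and not from the slides.

```python
HOURS_PER_YEAR = 365 * 24  # 8,760 hours, assuming a 24x7 agreed service time


def annual_downtime_hours(availability_percent: float) -> float:
    """Return the annual downtime (in hours) implied by an availability target."""
    return HOURS_PER_YEAR * (1 - availability_percent / 100)


for target in (99, 99.5, 99.9, 99.95, 99.99, 99.999):
    hours = annual_downtime_hours(target)
    if hours >= 1:
        print(f"{target}%: {hours:.1f} hours per year")
    else:
        # Below one hour, minutes are easier to read.
        print(f"{target}%: {hours * 60:.1f} minutes per year")
```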
9.
The Traditional Calculation
  Availability (%) = (AST – DT) / AST × 100
  AST = Agreed Service Time
  DT = Downtime
10.
What’s Wrong with Tradition?
  What if some locations are OK and others aren’t?
  What if some users are OK and others aren’t?
  What if some operations work and others don’t?
  What if the service is so slow that it is unusable?
  What if there are frequent 5 second outages?
  What are we actually measuring and reporting?
11.
End-to-end Services and SLAs
12.
Where to Measure Availability?
[Diagram: components of the service]
  Database
  Network
  Server
  Desktop
13.
As Seen by the Customer / User…
14.
Service Level Agreements
  An SLA documents what has been agreed
    From the perspective of the users and customers
  Contents should include
    Availability definitions
    Targets
    Measurement and reporting
    Penalties
  Every goal in an SLA must be SMART
    Specific, Measurable, Achievable, Relevant, Time-based
15.
Outage Frequency and Duration
  MTBF = Mean Time Between Failures
  MTBSI = Mean Time Between System Incidents
  MTRS = Mean Time to Restore Service
[Diagram: timeline alternating between Up and Down, labelled with TBSI, TBF and TRS]
16.
Outage Frequency and Duration
Which of these is better?
[Diagram: two Up/Down timelines]
  MTBF = 19 days, MTTR = 1 day, Availability = 95%
  MTBF = 22.8 hrs, MTTR = 1.2 hrs, Availability = 95%
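Both scenarios report the same percentage, which a short calculation makes visible. The sketch below assumes the common relationship availability = MTBF / (MTBF + MTTR); the slide does not state the formula, but its numbers are consistent with it. The function names and the outages-per-year figure are illustrative additions.

```python
HOURS_PER_YEAR = 365 * 24


def availability(mtbf_hours: float, mttr_hours: float) -> float:
    """Fraction of time the service is up, assuming MTBF / (MTBF + MTTR)."""
    return mtbf_hours / (mtbf_hours + mttr_hours)


def outages_per_year(mtbf_hours: float, mttr_hours: float) -> float:
    """Average number of outages in a year of continuous service."""
    return HOURS_PER_YEAR / (mtbf_hours + mttr_hours)


scenarios = {
    "long, rare outages": (19 * 24, 1 * 24),  # MTBF 19 days, MTTR 1 day
    "short, frequent outages": (22.8, 1.2),   # MTBF 22.8 hrs, MTTR 1.2 hrs
}

for name, (mtbf, mttr) in scenarios.items():
    print(
        f"{name}: availability {availability(mtbf, mttr):.1%}, "
        f"about {outages_per_year(mtbf, mttr):.0f} outages per year"
    )
```

The percentage is identical, yet one service fails roughly 18 times a year for a day at a time and the other fails about once a day for over an hour, which is why frequency and duration need to be agreed separately.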
17.
Failover Events
  How long does a failover take?
    Between cluster members?
    When a RAID disk fails?
    When a network link fails?
  Does failover have a business impact?
    Do transactions have to be restarted?
    What is the longest “short” outage that can be ignored?
  What if the cluster continuously fails over?
    What is the maximum frequency of these types of event?
18.
Outage Frequency and Duration Summary
  Agree availability in terms of
    Frequency of incidents
    Duration of incidents
  Agree failover events which won’t be counted
    Frequency
    Duration
    Impact
19.
An Agreement with the Business
  Outage duration and frequency must be agreed
    In terms that the business understands
    With metrics that support the business mission
  What might such an agreement look like?
20.
Example Agreement
  Outage Duration            Maximum Frequency
  Up to 2 minutes            1 event in any hour
                             3 events in any day
                             5 events in any week
  2 minutes to 30 minutes    1 event in any month
                             2 events in any quarter
  30 minutes to 4 hours      1 event in any year
Maximum Annual Downtime = 4 hours + (8 * 30 mins) = 8 hours
Availability = (8760 – 8) / 8760 = 99.9%
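The worst-case arithmetic on this slide can be written out as a small calculation. The sketch below follows the slide in counting only the two longer outage bands (eight events of up to 30 minutes, from 2 per quarter, plus one event of up to 4 hours) and leaving out the events of up to 2 minutes; the class and field names are hypothetical.

```python
from dataclasses import dataclass

HOURS_PER_YEAR = 8760


@dataclass
class OutageBand:
    label: str
    max_duration_hours: float
    max_events_per_year: int


# Only the bands used in the slide's calculation are listed here.
bands = [
    OutageBand("2 minutes to 30 minutes", 0.5, 2 * 4),  # 2 events per quarter
    OutageBand("30 minutes to 4 hours", 4.0, 1),        # 1 event per year
]

# Worst case: every permitted event happens and lasts the maximum duration.
max_annual_downtime = sum(
    band.max_duration_hours * band.max_events_per_year for band in bands
)
availability = (HOURS_PER_YEAR - max_annual_downtime) / HOURS_PER_YEAR

print(f"Maximum annual downtime: {max_annual_downtime:.0f} hours")
print(f"Availability: {availability:.1%}")
```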
21.
Number of Users Affected
  Most failures do not cause complete loss of service
  Typical scenario
    Some users have no service at all
    Other users completely unaffected
  Extreme cases
    Only one user is affected
    Only one user is able to work!
  Should these count as downtime or not?
22.
User Outage Minutes
  Potential User Minutes = Number of users * Agreed service time
  User Outage Minutes = Number of affected users * Downtime
23.
Potential User Minutes
Not every minute is equal
  Day and time             Potential no. of users   Weekly PotentialUserMinutes
  Mon – Fri 00:00-07:00    500                      5 x 7 x 60 x 500 = 1,050,000
  Mon – Fri 07:00-09:00    2,500                    5 x 2 x 60 x 2500 = 1,500,000
  Mon – Fri 09:00-18:00    5,000                    5 x 9 x 60 x 5000 = 13,500,000
  Mon – Fri 18:00-21:00    1,000                    5 x 3 x 60 x 1000 = 900,000
  Mon – Fri 21:00-00:00    500                      5 x 3 x 60 x 500 = 450,000
  Sat – Sun                500                      2 x 24 x 60 x 500 = 1,440,000
  WEEKLY TOTAL                                      18,840,000
24.
User Outage Minutes Example
  Lost email service to 500 users for 2 hours on a Monday morning at 10:00
    UserOutageMinutes = 500 * 2 * 60 = 60,000
  Using data from previous slide
    PotentialUserMinutes for the week = 18,840,000
  Availability = (18,840,000 – 60,000) / 18,840,000 = 99.68%
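The user-minute calculation can be sketched as follows, using the weekly time bands from the previous slide and the email outage from this one. The function and variable names are illustrative; only the numbers come from the slides.

```python
# Each entry: (days per week, hours per day, potential number of users)
WEEKLY_BANDS = [
    (5, 7, 500),    # Mon-Fri 00:00-07:00
    (5, 2, 2500),   # Mon-Fri 07:00-09:00
    (5, 9, 5000),   # Mon-Fri 09:00-18:00
    (5, 3, 1000),   # Mon-Fri 18:00-21:00
    (5, 3, 500),    # Mon-Fri 21:00-00:00
    (2, 24, 500),   # Sat-Sun
]


def potential_user_minutes(bands) -> int:
    """Sum of days x hours x 60 x users over all time bands."""
    return sum(days * hours * 60 * users for days, hours, users in bands)


def user_outage_minutes(affected_users: int, downtime_minutes: int) -> int:
    """Number of affected users multiplied by the downtime in minutes."""
    return affected_users * downtime_minutes


def availability_percent(potential: int, outage: int) -> float:
    return (potential - outage) / potential * 100


potential = potential_user_minutes(WEEKLY_BANDS)
outage = user_outage_minutes(affected_users=500, downtime_minutes=2 * 60)

print(f"PotentialUserMinutes: {potential:,}")
print(f"UserOutageMinutes: {outage:,}")
print(f"Availability: {availability_percent(potential, outage):.2f}%")
```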
25.
What if there aren’t users?
  Transaction based system
  Manufacturing system
  etc.
26.
Critical Business Functions
  Some failures only affect part of a service
    ATMs can dispense money but not print statements
    Can browse old emails but can’t send or receive
    Reservation system can see bookings but not make new ones
  It is up to the business to define the relative importance of each type of transaction
  You can use transaction weightings to modify availability figures
27.
Example Transaction Weightings
  IT function that is not available               % Service Impact
  Sending email                                   100%
  Receiving email                                 100%
  Using shared distribution list to send email    10%
  Updating shared distribution lists              5%
  Accessing shared calendars                      30%
  Updating shared calendars                       10%
Why don’t these add up to 100%?
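One possible way to apply these weightings is to scale each outage by the service impact of the function that was unavailable. The slides do not give a formula, so the rule below (weighted downtime = outage duration x service impact) is an assumption, as are the names used.

```python
# Service impact percentages taken from the table above.
SERVICE_IMPACT_PERCENT = {
    "sending email": 100,
    "receiving email": 100,
    "using shared distribution list to send email": 10,
    "updating shared distribution lists": 5,
    "accessing shared calendars": 30,
    "updating shared calendars": 10,
}


def weighted_downtime_minutes(function: str, outage_minutes: float) -> float:
    """Assumed rule: scale the outage by the function's service impact."""
    return outage_minutes * SERVICE_IMPACT_PERCENT[function] / 100


# A 2 hour loss of shared calendar access counts as 36 minutes of downtime,
# while a 2 hour loss of email sending counts in full.
print(weighted_downtime_minutes("accessing shared calendars", 120))
print(weighted_downtime_minutes("sending email", 120))
```

Because each weighting describes the impact of losing one function on its own, the percentages are independent of each other and need not sum to 100%.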
28.
What About Poor Performance?
  Most SLAs have performance targets
  What if performance is SO SLOW that service can’t be used?
    Some SLAs count this as downtime
    Others count it separately, with its own penalties
    The important thing is to discuss, agree, and document
  IT can only agree performance if customer agrees maximum workload
    It is the job of the business to forecast the work, not IT
29.
Example Performance Agreement
  IT function                Required response time (when service is available)
  Login                      99% within 5 seconds
                             99.9% within 15 seconds
  Seat availability check    95% within 10 seconds
                             99% within 30 seconds
  Seat booking               99% within 40 seconds
                             100% within 60 seconds
  Check in                   95% within 20 seconds
                             100% within 60 seconds
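Targets of this form can be checked mechanically against measured response times. The sketch below is one way to do it, assuming "within N seconds" means less than or equal to N and that a list of response times (in seconds) is available for the period; the data structure and function name are hypothetical.

```python
# function -> list of (required fraction of requests, time limit in seconds)
TARGETS = {
    "login": [(0.99, 5), (0.999, 15)],
    "seat availability check": [(0.95, 10), (0.99, 30)],
    "seat booking": [(0.99, 40), (1.0, 60)],
    "check in": [(0.95, 20), (1.0, 60)],
}


def meets_targets(function: str, response_times_seconds) -> bool:
    """Return True if every target for the function is met by the samples."""
    samples = list(response_times_seconds)
    if not samples:
        raise ValueError("no response times recorded")
    for required_fraction, limit_seconds in TARGETS[function]:
        within_limit = sum(1 for t in samples if t <= limit_seconds)
        if within_limit / len(samples) < required_fraction:
            return False
    return True


# 95 fast check-ins and 5 slower ones that still finish within 60 seconds.
print(meets_targets("check in", [10] * 95 + [30] * 5))
# A single check-in taking 61 seconds breaks the 100% within 60 seconds target.
print(meets_targets("check in", [10] * 99 + [61]))
```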
30.
Planned Downtime
  What effect does a planned outage have on availability?
    AST = Agreed Service Time
    If planned outage is in a service window then it isn’t downtime
  Some SLAs specify when maintenance will happen
  Some SLAs allow additional planned downtime with sufficient notice
31.
Measuring Availability
32.
Measurement Period
  Remember that Availability is defined as
    Availability (%) = (AST – DT) / AST × 100
    AST = Agreed Service Time
    DT = Downtime
  What time period should we use for the agreed service time?
33.
Measurement Period
Availability after a single 8 hour incident
  Weekly
  Monthly
  Quarterly
  Annual
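The effect of the measurement period can be illustrated by applying the same 8 hour incident to each period. This is a sketch under stated assumptions: a 24x7 agreed service time, a 30-day month, a 91-day quarter and a 365-day year. The percentages it prints are derived from those assumptions and are not figures taken from the slide.

```python
# Agreed service time in hours for each measurement period (24x7 assumed).
PERIOD_HOURS = {
    "Weekly": 7 * 24,
    "Monthly": 30 * 24,     # assumes a 30-day month
    "Quarterly": 91 * 24,   # assumes a 91-day quarter
    "Annual": 365 * 24,
}

INCIDENT_HOURS = 8


def availability_percent(ast_hours: float, downtime_hours: float) -> float:
    """Availability = (AST - DT) / AST, expressed as a percentage."""
    return (ast_hours - downtime_hours) / ast_hours * 100


for period, ast_hours in PERIOD_HOURS.items():
    print(f"{period}: {availability_percent(ast_hours, INCIDENT_HOURS):.2f}%")
```

The same incident looks much worse over a short period than over a long one, so a target is only meaningful once the measurement period has been agreed.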
34.
Measuring Availability
  You have a good definition of Availability
    It is Specific about what will be delivered
    It is Achievable
    It is Relevant to the service you deliver
    It is defined over a clear Time period
  So what have we forgotten?
    A definition is of no use at all if you can’t Measure it
35.
How can you Measure Availability?
  Service Desk Records
    Fairly easy to implement, inexpensive
    Can lead to disputes about accuracy of data
  Instrument all components and calculate
    Difficult to implement, expensive
    May fail to detect complex or subtle failures
  Use dummy transactions / clients to simulate
    Actually measures end-to-end availability
    May miss complex or subtle failures
  Instrument applications to report end-to-end availability
    Actually measures end-to-end availability
    Must be included in the early stages of application design
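The dummy-transaction approach can be sketched as a loop that repeatedly runs a synthetic transaction and records how often it succeeds. Everything below is hypothetical: `probe` stands for whatever end-to-end check suits the service (for example a scripted login), and the fake probe replays canned results so the sketch runs without a real service.

```python
import time
from typing import Callable


def measure_availability(
    probe: Callable[[], bool], attempts: int, interval_seconds: float = 0.0
) -> float:
    """Run the dummy transaction `attempts` times; return % that succeeded."""
    if attempts <= 0:
        raise ValueError("attempts must be positive")
    successes = 0
    for attempt in range(attempts):
        try:
            if probe():
                successes += 1
        except Exception:
            pass  # an exception from the probe counts as a failed transaction
        if interval_seconds and attempt < attempts - 1:
            time.sleep(interval_seconds)
    return successes / attempts * 100


# Stand-in for a real end-to-end check: replays a fixed list of results.
canned_results = iter([True, True, False, True])


def fake_probe() -> bool:
    return next(canned_results)


print(f"Measured availability: {measure_availability(fake_probe, 4):.1f}%")
```

A probe only exercises the path it simulates, which is why this approach can miss complex or subtle failures that real users would notice.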
36.
Summary
  “How many 9s” is not good enough
  Must account for
    End-to-end service availability
    Number and duration of outages
    Number of users or transactions affected by incidents
    Criticality of business functions affected by incidents
    Performance of critical functions
    Planned downtime
    Agreed measurement period
    Agreed measurement process
  Everything must be documented in an SLA
    Using SMART metrics
37.
Thank you
Twitter: @StuartRance
Email: stuart.rance@hp.com