http://img11.imageshack.us/img11/2017/skatingdownarollercoastw.jpg
Running a Cloud: How the
Cloud Impacts Service
Management and IT Operations!
!
Mr. White has fifteen years of experience designing and managing
the deployment of systems monitoring and Event Management
software. Prior to joining IBM, Mr. White held various positions
including the leader of the Monitoring and Event Management
organization of a Fortune 100 company and developing solutions as
a consultant for a wide variety of organizations, including the
Mexican Secretaría de Hacienda y Crédito Público, Telmex, Wal-
Mart of Mexico, JP Morgan Chase, Nationwide Insurance and the
US Navy Facilities and Engineering Command.!
!
Andrew White!
Cloud and Smarter Infrastructure Solution Specialist!
IBM Corporation!
http://weheartit.com/entry/12433848
Follow Us: #ITSMSummit!
GROUND RULES FOR THIS SESSION…!
1.  If you can’t tell if I am trying to be funny…
GO AHEAD AND LAUGH!
2.  Feel free to text, tweet, yammer, or whatever. Use!
3.  If you have a question, no need to wait until the
end. Just interrupt me. Seriously… I don’t mind.!
My name is Andrew White
I have a lot of experience leading Systems and Event Management teams
I am here today to share some of what I have learned about Cloud Operations
More importantly, I am here today to talk about how the cloud affects…
QUESTION:!
What value does your IT organization
create for your business?!
If you can’t answer this question, how
can you be sure you are doing the
right things and doing them well…!
HINT: “We provide infrastructure or applications the
business uses” is not a value statement!
We are all here for one reason…!
How does IT preserve the value it creates?!
• 100% Uptime*!
• Scalability*!
• Performance*!
• Agility*!
• Good UX*!
!
*To the best of our ability!
How well would THEY say you are doing?!
CURRENT MARKET CONDITIONS!
§  The velocity of change and the volume of data are increasing
§  Virtualization introduces complexity and increased
consumption of resources!
§  Shared services are forced to oversubscribe finite resources!
§  Expertise is limited to functional silos and there is no
understanding of how the system functions end-to-end!
§  Supporting a cloud requires the ability to manage a large-
scale dynamic infrastructure!
§  Agile development and Continuous Delivery are in conflict
with ITIL processes!
We need to recognize when we
have problems to solve!
To solve problems quickly, we look for solutions that
we can use to define best practices and develop
processes to insert a measure of control.
THE TRADITIONAL APPROACH!
§  Solutions are driven by accepted conventions!
§  Best practices are coveted and are usually adopted
without understanding how and why they were developed!
§  There must always be a right answer!
§  No logical analysis is required!
§  People are frequently seen as the “root cause”!
§  The outcomes are enforced using “re-dos” and punitive
actions (or the looming threat of these things)!
THE PROBLEM WITH THIS APPROACH!
http://leanhomebuilding.files.wordpress.com/2010/12/standard2.jpg
§  We receive feedback from our business partners that system performance and
availability have been unacceptable for many of our critical business
applications!
§  Our productivity is impacted and we fail to meet delivery timelines
§  IT is not able to measure its impact on the business or the end user experience!
§  There is a lack of clear communication during a problem!
§  People are “hoarding” data and reports!
§  IT lacks the information needed to prioritize performance issues and
opportunities based on business need!
§  We take a really long time to figure out what is wrong!
§  The same old problems keep coming back!
§  We never really get to the “true root cause”!
HOW DO WE KNOW WE NEED TO CHANGE!
Our typical approach towards service
improvement is a bit like attempting to
put the toothpaste back in the tube!
“Some problems are so complex that you have to be highly
intelligent and well informed just to be undecided about them.”
- Laurence J. Peter
CONTROL IS AN ILLUSION!
Organizations don’t fail because they take the wrong
path, they fail because they can’t imagine a better
path than the one they are on.
-- Marty Neumeier
What is the next step in the evolution?!
Is it the infrastructure or the application?!
The perennial problem….!
DRIVING THE RIGHT KIND OF ACTION!
Application
    End User Experience
        Gainesville: Transaction 1, Transaction 2, Transaction N
        San Antonio: Transaction 1, Transaction 2, Transaction N
        Des Moines: Transaction 1, Transaction 2, Transaction N
        Columbus: Transaction 1, Transaction 2, Transaction N
Infrastructure
    Network: KPI 1, KPI 2, KPI N
    Mainframe: KPI 1, KPI 2, KPI N
    Storage: KPI 1, KPI 2, KPI N
    Linux: KPI 1, KPI 2, KPI N
    Middleware: KPI 1, KPI 2, KPI N
    Database: KPI 1, KPI 2, KPI N
Who ya gonna call?
Is it the infrastructure or the application?!
The perennial problem….!
CLOUD PAIN POINTS!
§  It takes too long to diagnose problems in the
application and infrastructure!
§  Existing management tools are outdated and don’t
work at scale!
§  Critical information is missed causing outages and
poor user experiences!
§  Most problems are managed reactively!
Does any of this sound familiar?!
DRIVING THE RIGHT KIND OF ACTION!
Application
    End User Experience
        Gainesville: Transaction 1, Transaction 2, Transaction N
        San Antonio: Transaction 1, Transaction 2, Transaction N
        Des Moines: Transaction 1, Transaction 2, Transaction N
        Columbus: Transaction 1, Transaction 2, Transaction N
Infrastructure
    Network: KPI 1, KPI 2, KPI N
    Mainframe: KPI 1, KPI 2, KPI N
    Storage: KPI 1, KPI 2, KPI N
    Linux: KPI 1, KPI 2, KPI N
    Middleware: KPI 1, KPI 2, KPI N
    Database: KPI 1, KPI 2, KPI N
The Cloud!
REQUIREMENTS FOR UNITY OF EFFORT!
1. Command
and Control!
2. Shared
Experience!
3. Situational
Awareness!
•  Command and control (No Leadership)!
•  The team lacks a clear direction!
•  Lots of activity, lack of progress!
•  Shared Experience (Poor Relationships)!
•  Us vs. Them mentality!
•  Unhealthy competition!
•  Situational Awareness (Poor Communication)!
•  Focused on cooperation, not collaboration!
•  Blame culture!
•  Infrequent or non-existent communication!
Symptoms of Missing Elements!
TWO TYPES OF DECISION MAKING!
§  Programmed Decisions!
§  Routine!
§  Repetitive!
§  Well-Structured!
§  Predetermined Decision
Rules!
§  Non-Programmed Decisions!
§  Unique!
§  Presence of Risk!
§  Presence of Uncertainty!
§  Black Swans!
BOYD’S OODA “LOOP”!
Observation: Unfolding Circumstances, Outside Information, Unfolding Interaction With Environment
Feed Forward
Orientation: Cultural Norms, Cognitive Abilities, Knowledge Life Cycle, Prior Wisdom, New Information
Feed Forward
Decision (Hypothesis)
Feed Forward
Action (Test)
Feedback (two paths)
Implicit Guidance & Control
•  Note how observation shapes orientation, shapes decision, shapes action, and in turn is shaped by the
feedback and other phenomena coming into our sensing or observing window.!
•  Also note how the entire “loop” (not just orientation) is an ongoing many-sided implicit cross-referencing
process of projection, empathy, correlation, and rejection.!
!
From “The Essence of Winning and Losing,” John R. Boyd, January 1996.!
Observe! Orient! Decide! Act!
Down Time: Detection Time, Response Time, Repair Time, Recovery Time
Milestones: Outage, Detection, Diagnosis, Repair, Recover, Restore
OODA stages: Observe, Orient, Decide, Act
INCIDENT LIFE CYCLE!
ANATOMY OF AN OUTAGE!
Corporate!
LANs & VPNs!
Load Balancer!
Firewall!
Web!
Servers!
Message!
Queue!
zOS!
CICS!
WAS!
Database!
WAS!
Database!
zOS!
MQ!
DB2!
IM01109089: P0 - Affecting Multiple apps
1.  5:45-ish pm: CICS ABENDs start flooding the console, but the volume is not high enough to ticket
2.  6:00-ish pm: MQ flows are interrupted and are alerting in Flow Diagnostics
3.  6:04pm: Synthetic transactions fail, and at 6:14 the Ops Center confirms the issue and creates a P0 Incident
4.  6:54pm: Support teams investigate the interrupted flows and determine it is a “back-end” problem
5.  10:29pm: Support teams investigate MQ, ultimately rule it out, and decide to reset CICS to resolve the issue
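The timestamps on this slide can be laid against the Observe, Orient, Decide stages to see where the evening went. What follows is only a rough sketch: the calendar date is a placeholder, the "-ish" times are treated as exact, and the mapping of each timestamp to a stage is an interpretation rather than something the slide states.

```python
from datetime import datetime

# Clock times are from the slide; the calendar date is a placeholder (assumption).
def pm(hour, minute):
    return datetime(2013, 1, 1, hour + 12, minute)

timeline = {
    "first_symptom": pm(5, 45),    # CICS ABENDs begin ("5:45-ish")
    "incident_opened": pm(6, 14),  # Ops Center creates the P0
    "scoped": pm(6, 54),           # teams call it a "back-end" problem
    "fix_chosen": pm(10, 29),      # decision to reset CICS
}

def minutes_between(start, end):
    """Whole minutes elapsed between two named points on the timeline."""
    return int((timeline[end] - timeline[start]).total_seconds() // 60)

detection = minutes_between("first_symptom", "incident_opened")
orientation = minutes_between("incident_opened", "scoped")
decision = minutes_between("scoped", "fix_chosen")

print(f"observe: {detection} min, orient: {orientation} min, decide: {decision} min")
```

With these times the sketch reports 29, 40, and 215 minutes. Under this reading, most of the outage was spent deciding what to do, not detecting that something was wrong.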
http://www.ithakabound.com/wp-content/uploads/2010/02/DC-Snow-men-pushing-car.jpg
Why did this happen?!
Four Sources of Bad Decisions:!
!
1. Failure to frame the problem correctly!
2. Poor use of evidence!
3. Faulty decision making process!
4. No feedback for improvement!
WHERE THE BREAKDOWN OCCURS!
Observe! Orient! Decide! Act!
Situational Awareness!
Perception of
Elements in
Current Situation!
!
Level 1!
Comprehension
of Current
Situation!
!
Level 2!
Projection of
Future Status!
!
!
Level 3!
Decision!
Performance
of Actions!
Current State
Feedback!
• Goals & Objectives!
• Preconceptions!
• Expectations!
• Abilities!
• Experience!
• Training!
Long Term
Memory!
Automaticity!
Cognitive Processes!
• System Capability!
• Interface Design!
• Stress & Workload!
• Complexity!
• Automation!
Adapted from Endsley, M.R. (1995b). Toward a theory of situation awareness
in dynamic systems. Human Factors 37(1), 32–64.!
Systemic Influences!
Individual Influences!
SOMETIMES WE MISS WHAT IS GOING ON!
Say… what’s a
mountain goat doing all
the way up here in a
cloud bank?!
NORMATIVE DECISION MAKING MODEL!
§  Limited Information Collection!
§  7 +/- 2!
§  Tendency to acquire manageable rather than optimal amounts
of information!
§  Difficulty identifying all possible options!
§  Judgmental Heuristics!
§  Judgmental heuristics - rules of thumb or shortcuts that people
use to reduce information processing demands!
§  Availability heuristic - tendency to base decisions on
information readily available in memory!
§  Representativeness heuristic - tendency to assess the
likelihood of an event occurring based on impressions about
similar occurrences!
§  Satisficing!
§  Choosing a solution that meets a minimum standard of
acceptance!
1. Adapted from Endsley, M.R. (1995b). Toward a theory of situation awareness in dynamic systems.
Human Factors 37(1), 32–64.!
!
Our systems are capable of producing a huge
amount of data, both on the status of their own
components and on the status of the
environment. The problem with today’s systems
is not a lack of information, but finding what is
needed when it is needed.!
Why does any of this matter?!
REQUIREMENTS FOR UNITY OF EFFORT!
1. Command
and Control!
2. Shared
Experience!
3. Situational
Awareness!
•  Command and control (No Leadership)!
•  The team lacks a clear direction!
•  Lots of activity, lack of progress!
•  Shared Experience (Poor Relationships)!
•  Us vs. Them mentality!
•  Unhealthy competition!
•  Situational Awareness (Poor Communication)!
•  Focused on cooperation, not collaboration!
•  Blame culture!
•  Infrequent or non-existent communication!
Symptoms of Missing Elements!
In the cloud, much of this will be federated or done by software!
CLOUD IS ASSISTED DECISION MAKING
§  Programmed Decision Making!
§  Collect evidence!
§  Identify the problem!
§  Select a solution!
§  Implement and evaluate the outcome!
§  Non-Programmed Decision Making!
§  Narrow evidence down to the ideal level!
§  Apply heuristics to limit the impact of cognitive bias!
§  Present options to a human for a decision!
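The split between the two decision types can be sketched as a small dispatcher. This is an illustration only: the `RUNBOOK` entries, the `relevance` field, and the cutoff of seven items (a nod to the 7 +/- 2 limit mentioned earlier) are all assumptions, not part of any product.

```python
# Hypothetical predetermined decision rules (programmed decisions).
RUNBOOK = {
    "disk_full": "purge_temp_files",
    "service_down": "restart_service",
}

def decide(problem, evidence, max_items=7):
    """Return ('automated', action) or ('human', shortlist of evidence)."""
    if problem in RUNBOOK:
        # Programmed: routine and well structured, so apply the rule.
        return ("automated", RUNBOOK[problem])
    # Non-programmed: narrow the evidence, then hand the choice to a person.
    ranked = sorted(evidence, key=lambda e: e["relevance"], reverse=True)
    return ("human", [e["summary"] for e in ranked[:max_items]])
```

A known problem such as `decide("disk_full", [])` comes back with an automated action, while an unknown one comes back with at most `max_items` evidence summaries for a human to weigh.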
DECISIONS BEING AUTOMATED IN THE CLOUD!
Packing! •  Compressing workloads to the fewest number of physical
servers!
•  Maximizing cost efficiencies!
Striping! •  Spreading workloads across as many physical servers as
possible!
•  Ensuring higher performance levels and reducing risk due to
component failure!
Load-
Awareness!
•  Allocating new workloads to the servers with the lowest load!
•  Maximizing the performance of the workloads!
HA-
Awareness!
•  Ensuring workloads are distributed across pods!
•  Matching availability levels with service requirements and
cost targets!
Energy
Awareness!
•  Placing workloads according to energy costs!
•  Ending workloads to reduce energy consumption or
rescheduling them for off-peak hours!
Affinity-
Awareness!
•  Placing workloads close to critical resource dependencies!
•  Collocating compatible workloads to maximize available
resources!
Platform
Awareness!
•  Allocate workloads to best platform!
•  Migrating workloads to least expensive platform still capable
of delivering required service levels!
Topology
Awareness!
•  Allocating resources within a service group near each other!
•  Isolate single-points-of-failure!
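To make the first three policies concrete, here is a minimal sketch of how a scheduler might choose a server under Packing, Striping, and Load-Awareness. The `Server` class, the capacity units, and the tie-breaking rules are assumptions for illustration; real cloud schedulers weigh many more factors.

```python
from dataclasses import dataclass, field

@dataclass
class Server:
    name: str
    capacity: int                      # abstract capacity units (assumed)
    workloads: list = field(default_factory=list)

    @property
    def load(self):
        return sum(size for _, size in self.workloads)

    def fits(self, size):
        return self.load + size <= self.capacity

def place(servers, workload, size, policy):
    """Assign one workload; return the chosen server or None if nothing fits."""
    candidates = [s for s in servers if s.fits(size)]
    if not candidates:
        return None
    if policy == "packing":
        # Busiest server that still fits: fewest physical servers in use.
        chosen = max(candidates, key=lambda s: s.load)
    elif policy == "striping":
        # Server hosting the fewest workloads: widest possible spread.
        chosen = min(candidates, key=lambda s: len(s.workloads))
    elif policy == "load-aware":
        # Server with the lowest current load.
        chosen = min(candidates, key=lambda s: s.load)
    else:
        raise ValueError(f"unknown policy: {policy}")
    chosen.workloads.append((workload, size))
    return chosen

def used_servers(policy, sizes=(2, 2, 2), n_servers=3, capacity=8):
    """Place a few workloads on fresh servers and count the servers in use."""
    servers = [Server(f"s{i}", capacity) for i in range(n_servers)]
    for i, size in enumerate(sizes):
        place(servers, f"w{i}", size, policy)
    return sum(1 for s in servers if s.workloads)
```

With three small workloads and three empty servers, packing ends up using one server while striping and load-aware placement use all three. That is the cost versus risk trade-off the slide describes.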
CLOUD OPERATION REQUIREMENT!
!
The perception of and reaction to a set of changing
events in terms of what can be done instead of merely
the recollection of a stimulus. [1]
Operating a cloud means enabling
good decision making!
1. Adapted from Endsley, M.R. (1995b). Toward a theory of situation
awareness in dynamic systems. Human Factors 37(1), 32–64.!
When decisions are not made based
on information, it’s called gambling.!
SOME THINGS NEVER CHANGE!
Corporate!
LANs & VPNs!
ISP!
Connection!
DNS & Internet!
Services!
Content Mgmt!
System!
Social Network!
Widgets!
Site Tracking!
& Analytics!
Banner Ads & !
Revenue Generators!
Multimedia &!
CDN Content!
Home Wireless!
& Broadband!
Mobile Broadband!
Is It My Cloud Provider?!
•  Configuration errors!
•  Application design issues!
•  Code defects!
•  Insufficient infrastructure!
•  Oversubscription Issues!
•  Poor routing optimization!
•  Low cache hit rate!
Is It a Service Provider Problem?!
•  Non-optimized mobile content!
•  Bad performance under load!
•  Blocking content delivery!
•  Incorrect geo-targeted content!
Is it an ISP Problem?!
•  Peering problems!
•  ISP Outages
Is it My Code or a Browser Problem?
•  Missing content!
•  Poorly performing JavaScript!
•  Inconsistent CSS rendering!
•  Browser/device incompatibility!
•  Page size too big!
•  Conflicting HTML tag support!
•  Too many objects!
•  Content not optimized for device!
The Cloud!
OUR UNDERSTANDING OF YOUR GOALS!
§ Gaining visibility into and control of
an increasingly complex operating
environment in order to prevent
frequent and prolonged outages!
§ Evolving from fault monitoring to a
holistic approach to managing
application performance!
§ Increased focus on cloud makes
problem isolation and resolution
more complex.!
PROACTIVE OPERATIONS!
§ Optimizing the performance of
business processes to boost
productivity!
§ Providing cost transparency to
track, analyze, and manage
resources and control the costs
associated with highly-virtualized
and cloud environments!
§ Improving software asset
management to prevent over-
spending and under-licensing!
!
CONTROL COST!
!
§ Leveraging automation to facilitate
rapid growth and reduce the cost of
service delivery!
§ Maintaining OS and application
patch levels across all images
(active or dormant) to protect the
enterprise and enable compliance!
§ Automating application releases to
optimize service delivery and align
the Development and Operations
teams thereby increasing
innovation, reducing costs, and
accelerating time to value!
ELIMINATE HUMAN FACTORS!
Migrating to the cloud is disruptive to an IT organization. In our experience, many of
our clients use this as an opportunity to re-evaluate the way they operate their environments
and the tools they use to deliver a quality service.
We have identified three key goals driving the adoption of the cloud:
OK.!
So now what?!
Starting the journey…!
WHAT THIS MEANS TO US…!
There are a few inescapable facts we face:!
1.  The business needs reliable systems to store the promises it
makes to its customers
2.  Our systems mirror the complexity of the
businesses they support!
3.  Our environments must be massive to scale to
handle the workload!
4.  There is too much activity for a single person to be
totally situationally aware!
5.  If the users can’t use it, it doesn’t work!
Monitoring & Capacity! Infrastructure as Code! Orchestration!
Backup & Recovery! Continuous Delivery! Storage Virtualization!
Cost Management! HA / DR!
Patch Mgmt! Dynamic Scheduling!
Bare Metal Provisioning!
Network Management!
Transaction Tracing!
App Provisioning! Performance Analytics!
App Perf Mgmnt! App Diagnostics! Service Visualization!
Monitoring & Capacity! App Perf Mgmt! Event Management!
Infrastructure!
Optimization!
Application !
Analytics!
Analytics Enabled !
Datacenter!
Virtualization !
Optimization!
DevOps!
Cloud Enabled !
Datacenter!
Cloud Optimized!
Analytics Empowered!
The building blocks on your Journey towards an agile, flexible and optimized environment!
ROADMAP TO MATURE CLOUD OPERATIONS!
REMEMBER THE OPS USE CASE!
•  Security!
•  Backups!
•  High Availability!
•  Upgradability!
•  Deployment Process!
•  Scaling and Elasticity!
•  Anticipated Performance Under Load!
•  Known Defects!
NEW OPERATIONAL REQUIREMENTS!
§  Keep the data moving!
§  Query on streams!
§  Handle stream imperfections!
§  Integrate stored and streaming data!
§  Guarantee data safety and availability!
§  Partition and scale applications automatically!
§  Process and respond instantaneously!
§  Drive Interoperability!
CLEANING UP THE LANDSCAPE!
Adapted from: Akella, Janaki. “IT Architecture: Cutting costs and complexity.” McKinsey Quarterly 13 Nov 2009
https://www.mckinseyquarterly.com/IT_architecture_Cutting_costs_and_complexity_2391!
Silo!
Monolithic
Framework!
Niche!
Launch Pad!
Information Bus!
CREATING A DIRECTED WORKFLOW!
Directed !
Non Directed!
Observe! Orient! Decide!
Launchpad!
Executive Dashboard!
Business Area!
Dashboards!
Application PAC!
Dashboards!
Command Center!
Dashboards!
Technology Owner!
Dashboard!
Application Owner!
Dashboard!
Problem
Isolation!
Workspace!
Problem
Diagnostics!
Workspace!
System Detail!
View!
Component
Detail!
View!
A TYPICAL ITIL CHANGE PROCESS!
Objectives:!
- What changes are coming?
- Why is the change required?
- Has the existing configuration been reviewed?
- What is the risk and impact: low, medium, or high?
- What is plan B?
Palette of library assets enables easy workflow composition through drag and drop
Access to rich libraries (toolkits) of reusable automation assets that speed automation creation
Rich set of action types, flow control, and data handling primitives that simplify creation of complex automations
Easy workflow action editing for managing data mapping, error recovery options, implementation details, etc.
Graphical editor for composing and connecting workflows
Rich tooling functions to edit, version, debug, and optimize workflows
AUTOMATING ITIL PROCESSES!
FINDING METRICS THAT MATTER!
§  Will the metric be used in a report? If so, which one? How is it used in the
report?!
§  Will the metric be used in a dashboard? If so, which one? How will it be
used?!
§  What action(s) will be taken if an alert is generated? Who are the actors?
Will a ticket be generated? If so, what severity?!
§  How often is this event likely to occur? What is the impact if the event
occurs? What is the likelihood it can be detected by monitoring?!
§  Will the metric help identify the source of a problem? Is it a coincident /
symptomatic indicator?!
§  Is the metric always associated with a single problem? Could this metric
become a false indicator?!
§  What is the impact if this goes undetected?!
§  What is the lifespan for this metric? What is the potential for changes that
may reduce the efficacy of the metric?!
Evaluating the Effectiveness of a Metric!
PICKING BETTER MONITORS!
Current Approach
1.  Itemize the existing monitors
2.  Brainstorm potential gaps to fill
3.  Deploy new monitors
Proposed Approach
1.  Identify the potential risks
2.  Itemize the existing monitors
3.  Determine which gaps exist
4.  Fill the monitoring gaps
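The proposed approach starts from risks rather than from the monitors already in place, which makes the gap a simple comparison. A minimal sketch, with made-up risks and monitor types used purely for illustration:

```python
# Hypothetical risks mapped to the monitor type that would detect each one.
risks = {
    "web server down": "process availability",
    "disk fills up": "filesystem usage",
    "login transaction slow": "end-to-end transaction",
    "failover does not engage": "fail-over success",
}

# Hypothetical inventory of monitor types already deployed.
existing_monitors = {"process availability", "filesystem usage", "cpu utilization"}

def monitoring_gaps(risks, monitors):
    """Risks whose required monitor type is not deployed."""
    return sorted(risk for risk, needed in risks.items() if needed not in monitors)

def unjustified_monitors(risks, monitors):
    """Deployed monitor types that no identified risk calls for."""
    return sorted(monitors - set(risks.values()))
```

Here `monitoring_gaps(risks, existing_monitors)` lists the two risks nobody is watching, and `unjustified_monitors(risks, existing_monitors)` flags the monitor that exists without a risk behind it.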
WHAT GOOD MONITORING LOOKS LIKE!
Corporate!
LANs & VPNs!
Load Balancer!
Load Balancer!
Firewall!
Switch!
Web Server Farm!
Database!
Data Power!
Mainframe!
Middleware!
Load Balancer!
1.  System Availability!
2.  Operating System Performance!
3.  Hardware Monitoring!
4.  Service/Daemon and Process Availability!
5.  Error Logs!
6.  Application Resource KPIs!
7.  End-to-End Transactions!
8.  Point of Failure Transactions!
9.  Fail-Over Success!
10. “Activity Monitors” and “Reverse Hockey Stick”!
Elements of Good Monitoring!
http://info.streamdatacenters.com/Portals/165393/Gallery/Album/6624/Richardson%20Aerial-01.png
This is no longer the way we
should think about monitoring!
Monitoring Happens Here!
Cloud Monitoring Happens Here!
WHAT DO YOU WANT TO ACCOMPLISH?!
Your monitoring should help you answer:!
•  How will we know if the users are getting the experience they
are expecting?!
•  How much capacity do we need during normal and peak times
to ensure user expectations are met?!
•  How quickly can the provider we select ramp up to meet our
needs if we find that the service is underperforming?!
•  How fast do we need to be able to access additional capacity
once it is ready for us?!
Here comes the elevator pitch…
THE IBM SOLUTION!
IBM SmartCloud Suite offers essential management capabilities for
applications in complex cloud and hybrid environments.
•  At-a-glance status determination
via network topology graphs!
•  Proactively identify and respond to
compliance issues!
•  Monitor the performance of the
environment and the tenants living
inside of it!
•  Understand the current capacity
needs and forecast future needs!
•  Understand the costs associated
with providing the service and
enable “showback” and
“chargeback” reporting to the
application owners
SINGLE POINT OF
MANAGEMENT!
!
•  Minimize service and system
outages!
•  Identify recurring incidents and
implement action to remediate
problems before they cause
impacts!
•  Assist troubleshooting by
suppressing “noise” events and
providing root cause determination!
MAXIMIZE SERVICE
AVAILABILITY!
!
•  Reduce the need for manual
action or intervention!
•  Automate for repeatability and
elimination of human error!
•  Develop standardized practices
for complex business processes!
•  Enable the development of APIs
to allow for self-service
management by the consumers!
IMPROVED OPERATIONAL
EFFICIENCY!
Understand the end-user experience
Follow changing workloads
See steps across the cloud
Mobile devices & smart endpoints
Private, public & hybrid clouds
Highly virtualized applications, storage & networks
Discovery: Visibility into application resources
End User Experience: Transaction performance monitoring to ensure SLA compliance
Transaction Tracking: Rapid problem isolation through transaction path analysis
Diagnostics: Domain-specific operations tools for diagnosis and repair
Predictive Analytics: Proactive approach to reduce outages & improve performance
Shared data & common services
VISIBILITY, CONTROL AND AUTOMATION TO INTELLIGENTLY MANAGE
CRITICAL APPLICATIONS IN CLOUD AND HYBRID ENVIRONMENTS.!
APPLICATION PERFORMANCE MANAGEMENT!
COMPOSITE APPLICATIONS!
Site Content!
Search!
Session!
Information!
User Login!
& Identity Mgmt!
Content Mgmt!
System!
Social Network!
Widgets!
Site Tracking!
& Analytics!
Banner Ads & !
Revenue Generators!
Multimedia &!
CDN Content!
GAINING PERSPECTIVE REQUIRES BALANCE!
Packet Capture!
Synthetic Transactions!
Client Monitoring!
Client Monitoring!
Synthetic Transactions!
Server Probe!
1.  Client to the Server!
2.  Server to the Client!
3.  “3rd Party” Vantage Point!
4.  Synthetic Transactions!
Four Perspectives of User Experience!
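Of the four perspectives, synthetic transactions are the easiest to sketch in code: run a scripted user action on a schedule and compare the result to what the user expects. The function below is a hypothetical illustration; the `transaction` callable, the SLA threshold, and the status names are assumptions, and a real probe would drive an actual login or search.

```python
import time

def run_synthetic(transaction, sla_seconds, clock=time.perf_counter):
    """Run one scripted transaction and classify the result."""
    start = clock()
    try:
        transaction()
    except Exception as exc:
        # Any failure of the scripted step counts as a failed transaction.
        return {"status": "failed", "error": str(exc)}
    elapsed = clock() - start
    status = "ok" if elapsed <= sla_seconds else "slow"
    return {"status": status, "elapsed": elapsed}
```

The `clock` parameter exists only so the timing can be replaced in a test. In normal use you would call `run_synthetic(check_login, sla_seconds=2.0)`, where `check_login` is whatever scripted step you care about.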
Predict: Predictive Outage Avoidance
Ensure availability of applications and services
• Use learning tools to augment custom best practices
• Leverage statistical methods to maximize predictive warning
• Improve problem detection across IT silos
Resolve: Faster Problem Resolution
Find & correct problems faster with tools that determine actions required to resolve issues
• Identify problems quicker with insight to large unstructured repositories
• Isolate problems quicker by bringing relevant unstructured data into problem investigations
• Repair problems quicker with the right details quickly to hand.
Perform: Optimized Performance
Track, Optimize, and Predict capacity and performance needs over time
• Track capacity and performance of applications and services in classic and cloud environments
• Optimize resource deployment with what-if and best fit planning tools
• Escalate capacity and performance problems before they cause critical failures
Know: Improved Insight
Enhance visibility into systems resource relationships while increasing customer satisfaction
• Determine what resources are interdependent to assess impact of failures
• Gain insight into what is important to your customer
• Decrease customer churn and acquisition costs while increasing customer retention and satisfaction
Automated Analytics helps lower IT Administration Costs:
• Performance and Capacity planning tools monitor appropriately and escalate, reducing time
consuming report browsing
• Learning tools reduce customization and best practices investment on initial deployment
• Log Analysis helps speed problem resolution to be able to do more with less
BUSINESS VALUE OF ADOPTING ANALYTICS!
That is great but we need more…
In addition to handling monitoring and performance alerts, event
management helps drive improved availability.
Our Formula:!
1.  Continually collect, categorize, and analyze all events from as many
sources as possible!
2.  Correlate events and analyze them using previous outages as
patterns to identify situations worth investigating!
3.  Notify a support team so the situation can be mitigated before
becoming an outage!
4.  Automate responses that have well established situational
fingerprints and proven resolution steps!
THE EVENT MANAGEMENT FOCUS!
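Steps 2 through 4 of the formula can be sketched as a match against "situational fingerprints" built from earlier outages. Everything named here (the fingerprint entries, the event type strings, the action names) is hypothetical and only meant to show the shape of the idea.

```python
# Hypothetical fingerprints: event types that appeared together in past outages.
FINGERPRINTS = [
    {"name": "cics_degradation",
     "events": {"cics_abend", "mq_flow_interrupted"},
     "automated_response": None},           # no proven fix yet: notify people
    {"name": "log_volume_full",
     "events": {"disk_usage_high", "log_write_error"},
     "automated_response": "rotate_logs"},  # proven resolution step
]

def correlate(events):
    """Return (fingerprint name, action) for every fingerprint fully matched."""
    seen = {e["type"] for e in events}
    results = []
    for fp in FINGERPRINTS:
        if fp["events"] <= seen:  # all of the fingerprint's event types were seen
            action = fp["automated_response"] or "notify_support_team"
            results.append((fp["name"], action))
    return results
```

Feeding it a CICS ABEND event plus an interrupted MQ flow event yields a notification for the support team, while a fingerprint with a proven resolution step returns the automated response instead.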
ONE INTEGRATED ENVIRONMENT!
Distributed, Database, Mainframe, Network, Middleware, Storage
Event Pool!
Operational!
Data Warehouse!
Predictive!
Enrichment & Correlation!
Service Desk, Paging
CMDB!
Knowledge!
Asset Mgmt!
Event Catalog!
Event API!
Business Telemetry!
3rd Party Providers!
Presentation Framework!
Presentation!
Framework!
Asset Management
& Topology
Database!
Aggregation and
Analysis!
Security
Management!
Availability
Management!
Configuration
Management!
Change
Management!
Performance
Management!
Enterprise Data
Sources!
Business
Telemetry
Information!
Configuration Discrepancies!
Enrichment Data!
Business Activity Data!
Historical Data!
“Enriched” Events!
Change Activity!
Topology Snapshots!
Trend-Related Faults
Discovered Problems
Status Indications!
Incidents!
Audit Information and Suspicious Activity!
Enrichment Data! Business Activity Data!
Automated
Discovery!
CONCEPTUALIZING SITUATIONAL AWARENESS!
Situational
Awareness
Engine!
Adapted from http://www.slideshare.net/TimBassCEP/getting-started-in-cep-how-to-build-an-event-processing-application-presentation-717795
Real-Time
Event Streams!
Detected and
Predicted Situations!
Patterns from
Historical Data!
Causal Relationship
from Past RCAs!
CONCEPTUAL MODEL OF COMPLEX EVENT PROCESSING!
Adapted from http://www.slideshare.net/aparnachaudhary/esper-cep-engine!
Event Pipeline!
Event Queries!
Time Window!
Data Events!
Control Event!
Other Events!
Event Filter!
Scenarios!
A!
B!
C!
Feedback Loop!
Event Intelligence!
Action Events!
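The pieces of this model (an event filter, a time window, a query, and an action event) fit together roughly as in the sketch below. It is not how Esper or any other CEP engine is implemented; the class name, the threshold rule, and the assumption that events arrive in timestamp order are all simplifications.

```python
from collections import deque

class WindowedCounter:
    """Count filtered events inside a sliding time window."""

    def __init__(self, window_seconds, threshold, event_filter):
        self.window = window_seconds
        self.threshold = threshold
        self.event_filter = event_filter
        self.times = deque()

    def push(self, timestamp, event):
        """Feed one event; return an action event dict or None."""
        if not self.event_filter(event):
            return None
        self.times.append(timestamp)
        # Drop events that have aged out of the window.
        while self.times and self.times[0] <= timestamp - self.window:
            self.times.popleft()
        if len(self.times) >= self.threshold:
            return {"action": "raise_situation", "count": len(self.times)}
        return None
```

A detector such as `WindowedCounter(60, 3, lambda e: e == "abend")` stays quiet for isolated ABENDs and emits an action event once three arrive within a minute. This is the kind of rule that would have caught the flood of ABENDs that individually were "not high enough to ticket."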
ITERATIVE DEVELOPMENT!
As you recognize opportunities to
capture knowledge, use it to improve
your Event Management System. !
IT culture looks to technology for solutions. Leverage
your monitoring and testing tools to help practice failure
scenarios. Track potential points of failure by
creating monitors, and report the rate of occurrence to the
developers at the start of each new iteration.
PLAYING TO OUR STRENGTHS!
LET’S KEEP THE CONVERSATION GOING…!
Andrew.P.White@Gmail.com!
ReverendDrew!
SystemsManagementZen.Wordpress.com!
systemsmanagementzen.wordpress.com/feed/!
@SystemsMgmtZen!
ReverendDrew!
APWhite@us.ibm.com!
614-306-3434!
Cloud Impacts on IT Operations

  • 1. http://img11.imageshack.us/img11/2017/skatingdownarollercoastw.jpg! Running a Cloud: How the Cloud Impacts Service Management and IT Operations!
  • 2. Mr. White has fifteen years of experience designing and managing the deployment of systems monitoring and Event Management software. Prior to joining IBM, Mr. White held various positions including the leader of the Monitoring and Event Management organization of a Fortune 100 company and developing solutions as a consultant for a wide variety of organizations, including the Mexican Secretaría de Hacienda y Crédito Público, Telmex, Wal-Mart of Mexico, JP Morgan Chase, Nationwide Insurance and the US Navy Facilities and Engineering Command. Andrew White, Cloud and Smarter Infrastructure Solution Specialist, IBM Corporation
  • 4. Follow Us: #ITSMSummit! GROUND RULES FOR THIS SESSION…! 1.  If you can’t tell if I am trying to be funny…! !GO AHEAD AND LAUGH!! 2.  Feel free to text, tweet, yammer, or whatever. Use! 3.  If you have a question, no need to wait until the end. Just interrupt me. Seriously… I don’t mind.!
  • 5. I have a lot of experience leading ! Systems and Event Management teams ! My name is Andrew White!
  • 6. Cloud Operations! I am here today to share some of what I have learned about!
  • 7. More importantly, I am here today to talk about how the cloud affects…!
  • 8. QUESTION:! What value does your IT organization create for your business?!
  • 9. If you can’t answer this question, how can you be sure you are doing the right things and doing them well…!
  • 10. HINT: “We provide infrastructure or applications the business uses” is not a value statement!
  • 11. Follow Us: #ITSMSummit! We are all here for one reason…!
  • 12. How does IT preserve the value it creates?! • 100% Uptime*! • Scalability*! • Performance*! • Agility*! • Good UX*! ! *To the best of our ability!
  • 13. How well would THEY say you are doing?!
  • 14. CURRENT MARKET CONDITIONS § The velocity of change and the volume of data are increasing § Virtualization introduces complexity and increased consumption of resources § Shared services are forced to oversubscribe finite resources § Expertise is limited to functional silos and there is no understanding of how the system functions end-to-end § Supporting a cloud requires the ability to manage a large-scale dynamic infrastructure § Agile development and Continuous Delivery are in conflict with ITIL processes
  • 15. We need to recognize when we have problems to solve!
  • 16. THE TRADITIONAL APPROACH To solve problems quickly, we look for solutions that we can use to define best practices and develop processes to insert a measure of control.
  • 17. THE PROBLEM WITH THIS APPROACH § Solutions are driven by accepted conventions § Best practices are coveted and are usually adopted without understanding how and why they were developed § There must always be a right answer § No logical analysis is required § People are frequently seen as the “root cause” § The outcomes are enforced using “re-dos” and punitive actions (or the looming threat of these things)
  • 19. HOW DO WE KNOW WE NEED TO CHANGE § We receive feedback from our business partners that system performance and availability have been unacceptable for many of our critical business applications § Our productivity is impacted and we fail to meet delivery timelines § IT is not able to measure its impact on the business or the end user experience § There is a lack of clear communication during a problem § People are “hoarding” data and reports § IT lacks the information needed to prioritize performance issues and opportunities based on business need § We take a really long time to figure out what is wrong § The same old problems keep coming back § We never really get to the “true root cause”
  • 20. CONTROL IS AN ILLUSION Our typical approach towards service improvement is a bit like attempting to put the toothpaste back in the tube. “Some problems are so complex that you have to be highly intelligent and well informed just to be undecided about them.” - Laurence J. Peter
  • 21. Organizations don’t fail because they take the wrong path, they fail because they can’t imagine a better path than the one they are on. -- Marty Neumeier
  • 22. What is the next step in the evolution?!
  • 23. Is it the infrastructure or the application?! The perennial problem….!
  • 24. DRIVING THE RIGHT KIND OF ACTION Application: End User Experience for Gainesville, San Antonio, Des Moines, and Columbus (each with Transaction 1, Transaction 2, Transaction N). Infrastructure: Network, Mainframe, Storage, Linux, Middleware, and Database (each with KPI 1, KPI 2, KPI N).
  • 30. Is it the infrastructure or the application?! The perennial problem….!
  • 31. Follow Us: #ITSMSummit! CLOUD PAIN POINTS! §  It takes too long to diagnose problems in the application and infrastructure! §  Existing management tools are outdated and don’t work at scale! §  Critical information is missed causing outages and poor user experiences! §  Most problems are managed reactively! Does any of this sound familiar?!
  • 32. Follow Us: #ITSMSummit! DRIVING THE RIGHT KIND OF ACTION! Application! End User Experience! Gainesville! Transaction 1! Transaction 2! Transaction N! San Antonio! Transaction 1! Transaction 2! Transaction N! Des Moines! Transaction 1! Transaction 2! Transaction N! Columbus! Transaction 1! Transaction 2! Transaction N! Infrastructure! Network! KPI 1! KPI 2! KPI N! Mainframe! KPI 1! KPI 2! KPI N! Storage! KPI 1! KPI 2! KPI N! Linux! KPI 1! KPI 2! KPI N! Middleware! KPI 1! KPI 2! KPI N! Database! KPI 1! KPI 2! KPI N! The Cloud!
  • 33. Follow Us: #ITSMSummit! REQUIREMENTS FOR UNITY OF EFFORT! 1. Command and Control! 2. Shared Experience! 3. Situational Awareness! •  Command and control (No Leadership)! •  The team lacks a clear direction! •  Lots of activity, lack of progress! •  Shared Experience (Poor Relationships)! •  Us vs. Them mentality! •  Unhealthy competition! •  Situational Awareness (Poor Communication)! •  Focused on cooperation, not collaboration! •  Blame culture! •  Infrequent or non-existent communication! Symptoms of Missing Elements!
  • 34. Follow Us: #ITSMSummit! TWO TYPES OF DECISION MAKING! §  Programmed Decisions! §  Routine! §  Repetitive! §  Well-Structured! §  Predetermined Decision Rules! §  Non-Programmed Decisions! §  Unique! §  Presence of Risk! §  Presence of Uncertainty! §  Black Swans!
  • 35. Follow Us: #ITSMSummit! BOYD’S OODA “LOOP”! Observation! Outside Information! Implicit Guidance & Control! Unfolding Interaction With Environment! Feedback! Feedback! Unfolding Circumstances! Cultural! Norms! Cognitive! Abilities! Knowledge ! Life Cycle! Prior! Wisdom! New ! Information! Feed Forward! Decision! (Hypothesis)! Feed Forward! Action (Test)! Feed Forward! •  Note how observation shapes orientation, shapes decision, shapes action, and in turn is shaped by the feedback and other phenomena coming into our sensing or observing window.! •  Also note how the entire “loop” (not just orientation) is an ongoing many-sided implicit cross-referencing process of projection, empathy, correlation, and rejection.! ! From “The Essence of Winning and Losing,” John R. Boyd, January 1996.! Observe! Orient! Decide! Act!
  • 36. Follow Us: #ITSMSummit! Down Time! Detection Time! Response Time! Repair Time! Recovery Time! Outage! Detection! Diagnosis! Repair! Recover! Restore! Observe! Orient! Decide! Act! INCIDENT LIFE CYCLE!
  • 37. Follow Us: #ITSMSummit! ANATOMY OF AN OUTAGE! Corporate! LANs & VPNs! Load Balancer! Firewall! Web! Servers! Message! Queue! zOS! CICS! WAS! Database! WAS! Database! zOS! MQ! DB2! IM01109089: P0 - Affecting Multiple apps! 1! 5:45-ish pm: CICS ABENDs start flooding the console, but not high enough to trigger a ticket! 2! 6:00-ish pm: MQ flows are interrupted and are alerting in Flow Diagnostics! 3! 6:04pm: Synthetic transactions fail, and at 6:14 the Ops Center confirms the issue and creates a P0 Incident! 4! 6:54pm: Support teams investigate the interrupted flows and determine it is a “back-end” problem! 5! 10:29pm: Support teams investigate MQ, ultimately rule it out, and decide to reset CICS to resolve the issue!
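The outage timeline above can be turned into phase durations, which is one rough way to see where the time went. This is only a sketch: the calendar date is a placeholder (the slide gives clock times only, and the first is approximate), and the event labels are paraphrases chosen for illustration.

```python
from datetime import datetime

# Placeholder date (assumption): the slide lists only clock times.
DAY = "2014-01-01"


def at(clock: str) -> datetime:
    """Build a datetime on the placeholder day from an HH:MM clock time."""
    return datetime.strptime(f"{DAY} {clock}", "%Y-%m-%d %H:%M")


# Clock times taken from the slide; labels are illustrative paraphrases.
timeline = [
    ("first symptoms (CICS ABENDs)", at("17:45")),
    ("synthetic transactions fail", at("18:04")),
    ("P0 incident created", at("18:14")),
    ("isolated to back end", at("18:54")),
    ("decision to reset CICS", at("22:29")),
]


def phase_durations(events):
    """Return (from_label, to_label, minutes) for each consecutive pair."""
    result = []
    for (a_label, a_time), (b_label, b_time) in zip(events, events[1:]):
        minutes = int((b_time - a_time).total_seconds() // 60)
        result.append((a_label, b_label, minutes))
    return result


durations = phase_durations(timeline)
total_minutes = sum(minutes for _, _, minutes in durations)

for start, end, minutes in durations:
    print(f"{start} -> {end}: {minutes} min")
print(f"total: {total_minutes} min")
```

Under these assumptions, the longest gap by far sits between isolating the problem to the back end and deciding what to do about it, which is the Orient and Decide part of the loop rather than detection.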
  • 39. Four Sources of Bad Decisions:! ! 1. Failure to frame the problem correctly! 2. Poor use of evidence! 3. Faulty decision making process! 4. No feedback for improvement!
  • 40. Follow Us: #ITSMSummit! WHERE THE BREAKDOWN OCCURS! Observe! Orient! Decide! Act! Situational Awareness! Perception of Elements in Current Situation! ! Level 1! Comprehension of Current Situation! ! Level 2! Projection of Future Status! ! ! Level 3! Decision! Performance of Actions! CurrentState! Feedback! • Goals & Objectives! • Preconceptions! • Expectations! • Abilities! • Experience! • Training! Long Term Memory! Automaticity! Cognitive Processes! • System Capability! • Interface Design! • Stress & Workload! • Complexity! • Automation! Adapted from Endsley, M.R. (1995b). Toward a theory of situation awareness in dynamic systems. Human Factors 37(1), 32–64.! Systemic Influences! Individual Influences!
  • 41. Follow Us: #ITSMSummit! SOMETIMES WE MISS WHAT IS GOING ON! Say… what’s a mountain goat doing all the way up here in a cloud bank?!
  • 42. Follow Us: #ITSMSummit! NORMATIVE DECISION MAKING MODEL! §  Limited Information Collection! §  7 +/- 2! §  Tendency to acquire manageable rather than optimal amounts of information! §  Difficulty identifying all possible options! §  Judgmental Heuristics! §  Judgmental heuristics - rules of thumb or shortcuts that people use to reduce information processing demands! §  Availability heuristic - tendency to base decisions on information readily available in memory! §  Representativeness heuristic - tendency to assess the likelihood of an event occurring based on impressions about similar occurrences! §  Satisficing! §  Choosing a solution that meets a minimum standard of acceptance!
  • 43. 1. Adapted from Endsley, M.R. (1995b). Toward a theory of situation awareness in dynamic systems. Human Factors 37(1), 32–64.! ! Our systems are capable of producing a huge amount of data, both on the status of their own components and on the status of the environment. The problem with today’s systems is not a lack of information, but finding what is needed when it is needed.!
  • 45. Why does any of this matter?!
  • 46. Follow Us: #ITSMSummit! REQUIREMENTS FOR UNITY OF EFFORT! 1. Command and Control! 2. Shared Experience! 3. Situational Awareness! •  Command and control (No Leadership)! •  The team lacks a clear direction! •  Lots of activity, lack of progress! •  Shared Experience (Poor Relationships)! •  Us vs. Them mentality! •  Unhealthy competition! •  Situational Awareness (Poor Communication)! •  Focused on cooperation, not collaboration! •  Blame culture! •  Infrequent or non-existent communication! Symptoms of Missing Elements! In the cloud, much of this will be federated or done by software!
  • 47. Follow Us: #ITSMSummit! CLOUD IS ASSISTED DECISION MAKING §  Programmed Decision Making! §  Collect evidence! §  Identify the problem! §  Select a solution! §  Implement and evaluate the outcome! §  Non-Programmed Decision Making! §  Narrow evidence down to the ideal level! §  Apply heuristics to limit the impact of cognitive bias! §  Present options to a human for a decision!
  • 48. Follow Us: #ITSMSummit! DECISIONS BEING AUTOMATED IN THE CLOUD! Packing! •  Compressing workloads to the fewest number of physical servers! •  Maximizing cost efficiencies! Striping! •  Spreading workloads across as many physical servers as possible! •  Ensuring higher performance levels and reducing risk due to component failure! Load- Awareness! •  Allocating new workloads to the servers with the lowest load! •  Maximizing the performance of the workloads! HA- Awareness! •  Ensuring workloads are distributed across pods! •  Matching availability levels with service requirements and cost targets! Energy Awareness! •  Placing workloads according to energy costs! •  Ending workloads to reduce energy consumption or rescheduling them for off-peak hours! Affinity- Awareness! •  Placing workloads close to critical resource dependencies! •  Collocating compatible workloads to maximize available resources! Platform Awareness! •  Allocate workloads to best platform! •  Migrating workloads to least expensive platform still capable of delivering required service levels! Topology Awareness! •  Allocating resources within a service group near each other! •  Isolate single-points-of-failure!
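A few of the placement decisions listed above can be sketched as simple selection rules. This is a toy model, not how any particular cloud scheduler works: the `Server` class, the `place` function, the strategy names, and the single "size" dimension are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Server:
    name: str
    capacity: int
    workloads: list = field(default_factory=list)  # (workload_name, size) pairs

    @property
    def used(self) -> int:
        return sum(size for _, size in self.workloads)

    @property
    def free(self) -> int:
        return self.capacity - self.used


def place(servers, workload, size, strategy):
    """Pick a server for the workload according to the named strategy."""
    candidates = [s for s in servers if s.free >= size]
    if not candidates:
        raise RuntimeError(f"no server can fit {workload!r}")
    if strategy == "packing":
        # Fewest physical servers: choose the fullest server that still fits.
        target = min(candidates, key=lambda s: s.free)
    elif strategy == "striping":
        # Spread out: choose the server hosting the fewest workloads.
        target = min(candidates, key=lambda s: len(s.workloads))
    elif strategy == "load_aware":
        # Lowest load: choose the server with the lowest utilization.
        target = min(candidates, key=lambda s: s.used / s.capacity)
    else:
        raise ValueError(f"unknown strategy {strategy!r}")
    target.workloads.append((workload, size))
    return target.name


def demo(strategy):
    """Place four equal workloads on three empty servers."""
    servers = [Server("A", 10), Server("B", 10), Server("C", 10)]
    return [place(servers, f"w{i}", 3, strategy) for i in range(1, 5)]


packing_result = demo("packing")
striping_result = demo("striping")
print("packing: ", packing_result)
print("striping:", striping_result)
```

Ties are broken by list order here; a real placement engine would weigh several of the listed concerns (HA, energy, affinity, topology) at once rather than one rule at a time.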
  • 49. Follow Us: #ITSMSummit! CLOUD OPERATION REQUIREMENT! The perception of and reaction to a set of changing events in terms of what can be done instead of merely the recollection of a stimulus.1! Operating a cloud means enabling good decision making! 1. Adapted from Endsley, M.R. (1995b). Toward a theory of situation awareness in dynamic systems. Human Factors 37(1), 32–64.!
  • 50. Follow Us: #ITSMSummit! When decisions are not made based on information, it’s called gambling.!
  • 51. Follow Us: #ITSMSummit! SOME THINGS NEVER CHANGE! Corporate! LANs & VPNs! ISP! Connection! DNS & Internet! Services! Content Mgmt! System! Social Network! Widgets! Site Tracking! & Analytics! Banner Ads & ! Revenue Generators! Multimedia &! CDN Content! Home Wireless! & Broadband! Mobile Broadband! Is It My Cloud Provider?! •  Configuration errors! •  Application design issues! •  Code defects! •  Insufficient infrastructure! •  Oversubscription Issues! •  Poor routing optimization! •  Low cache hit rate! Is It a Service Provider Problem?! •  Non-optimized mobile content! •  Bad performance under load! •  Blocking content delivery! •  Incorrect geo-targeted content! Is it an ISP Problem?! •  Peering problems! •  ISP Outages! Is it My Code or a Browser Problem?! •  Missing content! •  Poorly performing JavaScript! •  Inconsistent CSS rendering! •  Browser/device incompatibility! •  Page size too big! •  Conflicting HTML tag support! •  Too many objects! •  Content not optimized for device! The Cloud!
  • 52. Follow Us: #ITSMSummit! OUR UNDERSTANDING OF YOUR GOALS! § Gaining visibility into and control of an increasingly complex operating environment in order to prevent frequent and prolonged outages! § Evolving from fault monitoring to a holistic approach to managing application performance! § Increased focus on cloud makes problem isolation and resolution more complex.! PROACTIVE OPERATIONS! § Optimizing the performance of business processes to boost productivity! § Providing cost transparency to track, analyze, and manage resources and control the costs associated with highly-virtualized and cloud environments! § Improving software asset management to prevent over-spending and under-licensing! CONTROL COST! § Leveraging automation to facilitate rapid growth and reduce the cost of service delivery! § Maintaining OS and application patch levels across all images (active or dormant) to protect the enterprise and enable compliance! § Automating application releases to optimize service delivery and align the Development and Operations teams thereby increasing innovation, reducing costs, and accelerating time to value! ELIMINATE HUMAN FACTORS! Migrating to the cloud is disruptive to an IT organization. In our experience, many of our clients use this as an opportunity to re-evaluate the way they operate their environments and the tools they leverage to deliver a quality service.! We have identified three key goals driving the adoption of the cloud:!
  • 55. Follow Us: #ITSMSummit! WHAT THIS MEANS TO US…! There are a few inescapable facts we face:! 1.  We need reliable systems to store the promises we make to our customers! 2.  Our systems mirror the complexity of the businesses they support! 3.  Our environments must be massive to scale to handle the workload! 4.  There is too much activity for a single person to be totally situationally aware! 5.  If the users can’t use it, it doesn’t work!
  • 56. Follow Us: #ITSMSummit! Monitoring & Capacity! Infrastructure as Code! Orchestration! Backup & Recovery! Continuous Delivery! Storage Virtualization! Cost Management! HA / DR! Patch Mgmt! Dynamic Scheduling! Bare Metal Provisioning! Network Management! Transaction Tracing! App Provisioning! Performance Analytics! App Perf Mgmnt! App Diagnostics! Service Visualization! Monitoring & Capacity! App Perf Mgmt! Event Management! Infrastructure! Optimization! Application ! Analytics! Analytics Enabled ! Datacenter! Virtualization ! Optimization! DevOps! Cloud Enabled ! Datacenter! Cloud Optimized! Analytics Empowered! The building blocks on your Journey towards an agile, flexible and optimized environment! ROADMAP TO MATURE CLOUD OPERATIONS!
  • 57. Follow Us: #ITSMSummit! REMEMBER THE OPS USE CASE! •  Security! •  Backups! •  High Availability! •  Upgradability! •  Deployment Process! •  Scaling and Elasticity! •  Anticipated Performance Under Load! •  Known Defects!
  • 58. Follow Us: #ITSMSummit! NEW OPERATIONAL REQUIREMENTS! §  Keep the data moving! §  Query on streams! §  Handle stream imperfections! §  Integrate stored and streaming data! §  Guarantee data safety and availability! §  Partition and scale applications automatically! §  Process and respond instantaneously! §  Drive Interoperability!
  • 59. Follow Us: #ITSMSummit! CLEANING UP THE LANDSCAPE! Adapted from: Akella, Janaki. “IT Architecture: Cutting costs and complexity.” McKinsey Quarterly 13 Nov 2009 https://www.mckinseyquarterly.com/IT_architecture_Cutting_costs_and_complexity_2391! Silo! Monolithic Framework! Niche! Launch Pad! Information Bus!
  • 60. Follow Us: #ITSMSummit! CREATING A DIRECTED WORKFLOW! Directed ! Non Directed! Observe! Orient! Decide! Launchpad! Executive Dashboard! Business Area! Dashboards! Application PAC! Dashboards! Command Center! Dashboards! Technology Owner! Dashboard! Application Owner! Dashboard! Problem Isolation! Workspace! Problem Diagnostics! Workspace! System Detail! View! Component Detail! View!
  • 61. Follow Us: #ITSMSummit! A TYPICAL ITIL CHANGE PROCESS! Objectives:! - What Changes are coming?
 - Why is the change required?
 - Has the existing configuration been reviewed?
 - What is the risk & impact, low, medium, high?
 - What is the plan B?!
  • 62. Follow Us: #ITSMSummit! Palette of library assets enables easy workflow composition through drag and drop! Access to rich libraries (toolkits) of reusable automation assets that speed automation creation! Rich set of action types, flow control, and data handling primitives that simplify creation of complex automations! Easy workflow action editing for managing data mapping, error recovery options, implementation details, etc.! Graphical editor for composing and connecting workflows! Rich tooling functions to edit, version, debug, and optimize workflows! AUTOMATING ITIL PROCESSES!
  • 63. Follow Us: #ITSMSummit! FINDING METRICS THAT MATTER! §  Will the metric be used in a report? If so, which one? How is it used in the report?! §  Will the metric be used in a dashboard? If so, which one? How will it be used?! §  What action(s) will be taken if an alert is generated? Who are the actors? Will a ticket be generated? If so, what severity?! §  How often is this event likely to occur? What is the impact if the event occurs? What is the likelihood it can be detected by monitoring?! §  Will the metric help identify the source of a problem? Is it a coincident / symptomatic indicator?! §  Is the metric always associated with a single problem? Could this metric become a false indicator?! §  What is the impact if this goes undetected?! §  What is the lifespan for this metric? What is the potential for changes that may reduce the efficacy of the metric?! Evaluating the Effectiveness of a Metric!
  • 64. Follow Us: #ITSMSummit! PICKING BETTER MONITORS! Itemize the existing monitors! Brainstorm potential gaps to fill! Deploy new monitors! Identify the potential risks! Itemize the existing monitors! Determine which gaps exist! Fill the monitoring gaps! Current Approach! Proposed Approach!
  • 65. Follow Us: #ITSMSummit! WHAT GOOD MONITORING LOOKS LIKE! Corporate! LANs & VPNs! Load Balancer! Load Balancer! Firewall! Switch! Web Server Farm! Database! Data Power! Mainframe! Middleware! Load Balancer! 1.  System Availability! 2.  Operating System Performance! 3.  Hardware Monitoring! 4.  Service/Daemon and Process Availability! 5.  Error Logs! 6.  Application Resource KPIs! 7.  End-to-End Transactions! 8.  Point of Failure Transactions! 9.  Fail-Over Success! 10. “Activity Monitors” and “Reverse Hockey Stick”! Elements of Good Monitoring!
  • 66. http://info.streamdatacenters.com/Portals/165393/Gallery/Album/6624/Richardson%20Aerial-01.png! This is no longer the way we should think about monitoring! Monitoring Happens Here!
  • 68. Follow Us: #ITSMSummit! WHAT DO YOU WANT TO ACCOMPLISH?! Your monitoring should help you answer:! •  How will we know if the users are getting the experience they are expecting?! •  How much capacity do we need during normal and peak times to ensure user expectations are met?! •  How quickly can the provider we select ramp up to meet our needs if we find that the service is underperforming?! •  How fast do we need to be able to access additional capacity once it is ready for us?!
  • 69. Follow Us: #ITSMSummit! Here comes the elevator pitch…
  • 70. Follow Us: #ITSMSummit! THE IBM SOLUTION! IBM SmartCloud Suite offers essential management capabilities for applications in complex cloud and hybrid environments.! •  At-a-glance status determination via network topology graphs! •  Proactively identify and respond to compliance issues! •  Monitor the performance of the environment and the tenants living inside of it! •  Understand the current capacity needs and forecast future needs! •  Understand the costs associated with providing the service and enable “showback” and “chargeback” reporting to the application owners! SINGLE POINT OF MANAGEMENT! •  Minimize service and system outages! •  Identify recurring incidents and implement action to remediate problems before they cause impacts! •  Assist troubleshooting by suppressing “noise” events and providing root cause determination! MAXIMIZE SERVICE AVAILABILITY! •  Reduce the need for manual action or intervention! •  Automate for repeatability and elimination of human error! •  Develop standardized practices for complex business processes! •  Enable the development of APIs to allow for self-service management by the consumers! IMPROVED OPERATIONAL EFFICIENCY!
  • 71. Follow Us: #ITSMSummit! Understand the end-user experience! Follow changing workloads! Mobile devices & smart endpoints! Private, public & hybrid clouds! Highly virtualized applications, storage & networks! Discovery! Visibility into application resources! End User Experience! Transaction performance monitoring to ensure SLA compliance! Transaction Tracking! Rapid problem isolation through transaction path analysis! Diagnostics! Domain-specific operations tools for diagnosis and repair! Predictive Analytics! Proactive approach to reduce outages & improve performance! Shared data & common services! See steps across the cloud! VISIBILITY, CONTROL AND AUTOMATION TO INTELLIGENTLY MANAGE CRITICAL APPLICATIONS IN CLOUD AND HYBRID ENVIRONMENTS.! APPLICATION PERFORMANCE MANAGEMENT!
  • 72. Follow Us: #ITSMSummit! COMPOSITE APPLICATIONS! Site Content! Search! Session! Information! User Login! & Identity Mgmt! Content Mgmt! System! Social Network! Widgets! Site Tracking! & Analytics! Banner Ads & ! Revenue Generators! Multimedia &! CDN Content!
  • 73. Follow Us: #ITSMSummit! GAINING PERSPECTIVE REQUIRES BALANCE! Packet Capture! Synthetic Transactions! Client Monitoring! Client Monitoring! Synthetic Transactions! Server Probe! 1.  Client to the Server! 2.  Server to the Client! 3.  “3rd Party” Vantage Point! 4.  Synthetic Transactions! Four Perspectives of User Experience!
  • 74. Follow Us: #ITSMSummit! Predictive Outage Avoidance! Ensure availability of applications and services! • Use learning tools to augment custom best practices • Leverage statistical methods to maximize predictive warning • Improve problem detection across IT silos! Predict! Faster Problem Resolution! Find & correct problems faster with tools that determine actions required to resolve issues! • Identify problems quicker with insight to large unstructured repositories • Isolate problems quicker by bringing relevant unstructured data into problem investigations • Repair problems quicker with the right details quickly to hand! Resolve! Optimized Performance! Track, Optimize, and Predict capacity and performance needs over time! • Track capacity and performance of applications and services in classic and cloud environments • Optimize resource deployment with what-if and best fit planning tools • Escalate capacity and performance problems before they cause critical failures! Perform! Improved Insight! Enhance visibility into systems resource relationships while increasing customer satisfaction! • Determine what resources are interdependent to assess impact of failures • Gain insight into what is important to your customer • Decrease customer churn and acquisition costs while increasing customer retention and satisfaction! Know! Automated Analytics helps lower IT Administration Costs: • Performance and Capacity planning tools monitor appropriately and escalate, reducing time consuming report browsing • Learning tools reduce customization and best practices investment on initial deployment • Log Analysis helps speed problem resolution to be able to do more with less! BUSINESS VALUE OF ADOPTING ANALYTICS!
  • 75. Follow Us: #ITSMSummit! That is great but we need more…
  • 76. Follow Us: #ITSMSummit! In addition to handling monitoring and performance alerts, it helps drive improved availability.! Our Formula:! 1.  Continually collect, categorize, and analyze all events from as many sources as possible! 2.  Correlate events and analyze them using previous outages as patterns to identify situations worth investigating! 3.  Notify a support team so the situation can be mitigated before becoming an outage! 4.  Automate responses that have well established situational fingerprints and proven resolution steps! THE EVENT MANAGEMENT FOCUS!
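Steps 2 through 4 of the formula above can be sketched as matching collected events against fingerprints learned from earlier outages. This is a minimal sketch under stated assumptions: the `Fingerprint` class, the `triage` function, the event type names, and the runbook name are all hypothetical and do not come from any product.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Fingerprint:
    """A pattern of event types seen together in a previous outage."""
    situation: str
    required: frozenset
    runbook: Optional[str] = None  # set only when resolution steps are proven


def triage(event_types, fingerprints):
    """Correlate observed event types with known fingerprints.

    Returns (situation, action) pairs: automate when a proven runbook
    exists, otherwise notify a support team.
    """
    seen = set(event_types)
    decisions = []
    for fp in fingerprints:
        if fp.required <= seen:  # every required event type was observed
            if fp.runbook:
                action = f"automate:{fp.runbook}"
            else:
                action = "notify:support-team"
            decisions.append((fp.situation, action))
    return decisions


# Hypothetical fingerprints, loosely modeled on the outage described earlier.
KNOWN = [
    Fingerprint("back-end degradation",
                frozenset({"CICS_ABEND", "MQ_FLOW_INTERRUPTED"})),
    Fingerprint("log volume full",
                frozenset({"FS_USAGE_HIGH"}), runbook="rotate-logs"),
]

observed = ["CICS_ABEND", "CICS_ABEND", "MQ_FLOW_INTERRUPTED"]
decisions = triage(observed, KNOWN)
print(decisions)
```

The point of the split is the one the slide makes: only situations with well established fingerprints and proven resolution steps are automated, and everything else goes to a human.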
  • 77. Follow Us: #ITSMSummit! ONE INTEGRATED ENVIRONMENT! Distributed! Database!Mainframe! Network! Middleware! Storage! Event Pool! Operational! Data Warehouse! Predictive! Enrichment & Correlation! Service Desk!Paging! CMDB! Knowledge! Asset Mgmt! Event Catalog! Event API! Business Telemetry! 3rd Party Providers! Presentation Framework!
  • 78. Follow Us: #ITSMSummit! Presentation! Framework! Asset Management & Topology Database! Aggregation and Analysis! Security Management! Availability Management! Configuration Management! Change Management! Performance Management! Enterprise Data Sources! Business Telemetry Information! Configuration Discrepancies! Enrichment Data! Business Activity Data! Historical Data! “Enriched” Events! Change Activity! Topology Snapshots! Trend-RelatedFaults! DiscoveredProblems! Status Indications! Incidents! Audit Information and Suspicious Activity! Enrichment Data! Business Activity Data! Automated Discovery!
  • 79. Follow Us: #ITSMSummit! CONCEPTUALIZING SITUATIONAL AWARENESS! Situational Awareness Engine! Adapted from http://www.slideshare.net/TimBassCEP/getting-started-in-cep- how-to-build-an-event-processing-application-presentation-717795! Real-Time Event Streams! Detected and Predicted Situations! Patterns from Historical Data! Causal Relationship from Past RCAs!
  • 80. Follow Us: #ITSMSummit! CONCEPTUAL MODEL OF COMPLEX EVENT PROCESSING! Adapted from http://www.slideshare.net/aparnachaudhary/esper-cep-engine! Event Pipeline! Event Queries! Time Window! Data Events! Control Event! Other Events! Event Filter! Scenarios! A! B! C! Feedback Loop! Event Intelligence! Action Events!
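The event filter, time window, and scenario pieces of the model above can be sketched with a sliding window over one event type. This is an illustrative toy, not the Esper API or any CEP engine: the `RateScenario` class, the event names, and the thresholds are assumptions, and timestamps are assumed to arrive in non-decreasing order.

```python
from collections import deque


class RateScenario:
    """Emit an action event when enough matching events fall in the window."""

    def __init__(self, event_type, window_seconds, threshold):
        self.event_type = event_type
        self.window_seconds = window_seconds
        self.threshold = threshold
        self._times = deque()  # timestamps of matching events in the window

    def feed(self, timestamp, event_type):
        """Process one event; return an action event dict or None."""
        if event_type != self.event_type:  # event filter
            return None
        self._times.append(timestamp)
        cutoff = timestamp - self.window_seconds
        # Time window: drop events that are no longer inside (cutoff, timestamp].
        while self._times and self._times[0] <= cutoff:
            self._times.popleft()
        if len(self._times) >= self.threshold:  # scenario matched
            return {
                "action": "raise_incident",
                "event_type": self.event_type,
                "count": len(self._times),
            }
        return None


scenario = RateScenario("ABEND", window_seconds=60, threshold=3)
stream = [(0, "ABEND"), (10, "INFO"), (30, "ABEND"), (59, "ABEND"), (200, "ABEND")]
results = [scenario.feed(ts, kind) for ts, kind in stream]
for (ts, kind), result in zip(stream, results):
    print(ts, kind, result)
```

A rule like this is one way to catch the "flooding the console but not high enough to ticket" case from the outage anatomy: no single event crosses a severity threshold, but the rate inside the window does.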
  • 81. Follow Us: #ITSMSummit! ITERATIVE DEVELOPMENT! As you recognize opportunities to capture knowledge, use it to improve your Event Management System. !
  • 82. Follow Us: #ITSMSummit! The IT culture is drawn to technology for solutions. Leverage your monitoring and testing tools to help practice failure scenarios. Track potential points of failure by creating monitors for them, and report the rate of occurrence to the developers at the start of each new iteration.! PLAYING TO OUR STRENGTHS!
  • 83. Follow Us: #ITSMSummit! LET’S KEEP THE CONVERSATION GOING…! Andrew.P.White@Gmail.com! ReverendDrew! SystemsManagementZen.Wordpress.com! systemsmanagementzen.wordpress.com/feed/! @SystemsMgmtZen! ReverendDrew! APWhite@us.ibm.com! 614-306-3434!