www.differ.cz                 Story of support and maintenance according ITIL v3, part I.




Story of support and maintenance
according ITIL v3
Part I. Operational activities




                                                                                 Jaroslav Procházka
                                                                                       www.differ.cz
                                                                                         version 1.0
                                                                                        August 2011




© 2011 Jaroslav Procházka, www.differ.cz                                                           Page 1




Story motivation
We are nowadays driven by strong rationality (logical, scientific, verifiable facts matter) and forget the
irrational aspects and emotions in human decision making. If humans are rational, why the hell do they buy
Apple products? ;) The same holds for stories and their power. Stories have been part of our cultures
for many thousands of years and are among the best ways to transfer knowledge; see sociological,
psychological or cognitive studies, e.g. Campbell: The Hero with a Thousand Faces or Turner: The Literary
Mind. All the old epics, the Bible or the story of Buddha are stories that attract us; we like to hear
the same variations of the hero’s journey again and again. And it is stories that can differentiate us, our
service, product or company from the many other vendors providing the same thing. Stories matter.

Another application of stories in business is in knowledge management and sharing. Be honest:
how often do you use your logical, structured, fact-based knowledge base? How easy is it to remember the
content, steps and outcomes of such a record in the longer term? Now compare it with a colleague’s story
dramatically describing the same situation (you might hear it in the kitchen, during lunch, in the pub).
Which one is easier to remember and follow? Big and respected companies like Xerox [1], 3M or NASA use
stories as an approach to storing and sharing knowledge inside the company. Storytelling is also part of
modern leadership.

Another motivating factor that triggered this e-book is how hard process frameworks like IBM RUP® or
ITIL® are to understand. Such misunderstanding causes problems with support, operations and maintenance
of IT infrastructure, leading to poor quality, lost revenue, and dissatisfied teams and customers.

The goal of this e-book is to spread the service-driven (ITSM) philosophy and service thinking using stories.
The story focuses on principles and concepts described by ITIL v3. We start with an end user affected by an
issue and also solve the hidden root cause (in ITIL terms, Incident and Problem Management). Proactive
investigation of root causes is a weak point of many teams and companies. We’ll emphasize the key ideas of
this approach, no matter whether you call it Problem Management, Kaizen, TQM or CMMI. The story also
covers Configuration Management, which takes care of IT infrastructure items, Change Management, which
processes change requests, and Release and Deploy Management, which builds the change.

The second part of this story, which should follow soon, focuses on the tactical and strategic activities of
ITSM, namely service thinking, connection to the business and its scenarios, predictions, proactive thinking,
contracting and service measurement (the so-called SLA, Service Level Agreement). This is the core of
ITSM/ITIL thinking.




[1] See also http://choo.fis.utoronto.ca/mgt/KM.xeroxCase.html and
http://www.kmworld.com/Articles/Editorial/Feature/Best-Practices-Eureka!-Xerox-discovers-way-to-grow-community-knowledge.-.-And-customer-satisfaction-9140.aspx



Note:
This e-book does not replace ITIL training or certification, nor will it enable you to set up the right
environment on its own. Its purpose is to raise awareness of ITIL version 3: what it is, how it can
help with solving your issues, and how it differs from version 2. This material can give busy people
an insight into ITIL, typically sales people, customer representatives, customers, architects,
senior managers and other key people.

If you have any comments or ideas on how to improve this e-book, or you would like your domain
covered in the story as well (only application management as an incident contributor is covered),
please send them to me via email (jarek@differ.cz). I also hope that this short story inspires you to write
your team/unit knowledge base in the form of short stories! Stories are more memorable and writing them
is fun, so they bring better value to their creators and consumers ;)







A short introduction
ITSM means IT Service Management, so the story mostly introduces the concept of IT service thinking
and the operational activities connected to it. The best known and most widely used ITSM
framework nowadays is ITIL (IT Infrastructure Library), a library guiding us in IT infrastructure
management covering software, hardware, networking, people etc. ITIL brings a process approach to ITSM,
and its key benefit is the definition of common terminology, which is very important for communication
between IT and business and among the different vendors in the chain. At the time of writing (July 2011),
ITIL exists in version 3, but a refresh is being prepared for release and will be called ITIL 2011. The key
difference between versions 2 and 3 is the newly introduced service lifecycle (see picture), starting with the
idea and strategy (Service Strategy phase) and ending with daily use and support (Service Operation). The
story in this e-book covers concepts and processes of version 3, specifically the Service Transition and
Service Operation parts.




Daily operations and support deal with necessary activities such as monitoring, data backups,
implementation of law amendments (e.g. in ERP applications) or reflecting changes in the assembly process
of production and assembly lines. Recording, assessing, implementing, testing and integrating a change or
new functionality is one part of those necessary activities. But specific actions also need to be performed
when an application/service incident affects end users and thus the value we provide to the customer.
Depending on the number of users affected or the importance of the application, the cost of an incident
can be really huge:
        e.g. 30 minutes of a stopped assembly line can mean 20 cars not assembled and delivered = 20
        cars x 15,000 EUR per car = 300,000 EUR lost in just 30 minutes!

A specific domain that needs our attention and automation is physical changes in IT infrastructure:
hardware or network. If not secured and automated properly, such changes can cause severe incidents with
huge financial impact. The following box shows a more detailed example calculation of incident cost:







      Simple incident cost calculation:

      Employee cost ........... 100 EUR / h
      Headcount ............... 200 in total
      Incident length ......... 3 h

      30 of these people cannot work for 3 hours because of a system incident. The cost of the
      impact can be calculated simply: 30 employees x 3 h x 100 EUR/h = 9,000 EUR of costs in one
      day that generate no value for the customer! If you multiply this number by the total
      number of incidents per year, you can get a pretty high number that could cover
      e.g. the yearly IT budget or the cost of a totally new system or assembly line.
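
The arithmetic in the box, and in the assembly line example before it, can be written as a small helper. This is only an illustrative sketch: the function and variable names are hypothetical, and the figures are the ones used in the text.

```python
def downtime_cost(people_affected: int, hours: float, hourly_cost_eur: float) -> float:
    """Cost of people who cannot work during an incident."""
    return people_affected * hours * hourly_cost_eur


def lost_production_cost(units_not_produced: int, unit_price_eur: float) -> float:
    """Value of products not assembled and delivered during an outage."""
    return units_not_produced * unit_price_eur


# Figures taken from the two examples in the text.
office_incident = downtime_cost(people_affected=30, hours=3, hourly_cost_eur=100)
assembly_line_stop = lost_production_cost(units_not_produced=20, unit_price_eur=15_000)

print(f"Office incident:    {office_incident:,.0f} EUR")
print(f"Assembly line stop: {assembly_line_stop:,.0f} EUR")
```

Multiplying the per-incident figure by the expected number of incidents per year gives the yearly estimate mentioned in the box.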


For this reason, we also need early identification and detection of incidents with a high level of automation.
The yang to this yin is built-in proactive root cause identification and resolution (so-called Problem
Management). The necessary backend functionality supporting efficient monitoring and problem management
is a source of knowledge about the infrastructure: hardware and software configurations, software versions,
licenses, people’s locations, access right policies etc. Advanced teams use a (semi)automated knowledge
base that stores and proposes solutions to already solved issues, incidents, problems or complicated
changes with many dependencies.


        Starring:
        Mary ......  business user affected by a system incident,
        Pete ......  application programmer,
        John ......  system administrator,
        Adam ......  Service Desk support specialist,

        and other stars…







       Typical scenario of daily operations
Until recently, Mary kept paper records of incoming orders. Although her company had implemented
an information system for the assembly line and the economic agenda, order processing was not part
of that project. Paper records are not very efficient and cause problems when an order needs to be
found quickly. Archiving is also a bit problematic: orders fade and become harder and
harder to read. Mary is happy because order processing was recently automated by a software program
and integrated with the assembly line information system. Rework, searching and archiving issues
are reduced to almost zero now, and Mary can enjoy her work.

It’s Monday morning. Mary uses an application called WarehouseAndOrders v1.1 to process orders
for the assembly line, but after one hour of work the software client crashes and she is not able to
start it again. So she calls her friend Pete, an application programmer, to help her solve this
incident [2]. Pete is employed by the IT company delivering and operating this application and has
known Mary since their university days. They are still friends and meet regularly. Pete is happy to
hear from Mary again, so they have a little chat and, by the way, Mary also mentions the incident.
Pete makes a note but forgets it immediately because of the heavy workload caused by an upcoming
release. Mary awaits a resolution from Pete and performs some unimportant tasks so as not to be
bored. She reminds him at Monday’s lunch, which they usually have with the whole old
university group. So as not to be embarrassed, Pete asks about the symptoms Mary observed in the
morning (any error message, system behavior etc.). But after a few of Mary’s comments Pete
immediately talks about something else. It has been a few hours since the incident happened, so
Mary doesn’t remember anything significant, and Pete is annoyed by that. It is no surprise that Pete
continues with his development tasks after lunch and forgets about Mary and her issue. Mary still
cannot use the application and process incoming orders.




[2] From here on we differentiate between the key terms incident and problem, because they have totally
different meanings in ITIL terminology:
Incident is an event causing availability or quality issues of an IT service or its part, as perceived by the
end user. It could concern response time, number of processed transactions, volume, inaccessibility of the
service etc. An incident can usually be resolved by a so-called workaround (typically a server or application
restart), but this solution does not remove the hidden root cause! It only allows the service to operate again
at the agreed quality.
Problem is the hidden root cause of one or more incidents, which may or may not be evident already. A
problem can be solved only by a structural solution, e.g. a change in IT infrastructure or a bug fix in the
application source code.



It is Monday afternoon and Mary calls Pete again to hear about the progress. Pete gets
angry because Mary keeps interrupting him. He needs to finish build testing for the upcoming
deployment. Pete wants to get rid of Mary, so he stops testing and starts investigating the incident.
Mary is still waiting and performing unimportant tasks, and incoming orders are not processed.
Mary realizes that this day will not bring a solution and goes home early. Pete stays until 8
pm, busy identifying the infrastructure: what are the parts of this nasty program? Which
servers are used to operate it? What middleware, databases and other connectors does it use?

Tuesday morning is an important deadline for Pete: he needs to finish the new release package for
deployment. This is why he comes in early in the morning even though he finished late
the day before. Mary arrives later this morning, to be sure that the incident is already solved and
her time is not wasted. Pete focuses on finishing build testing and packaging. When ready, he
continues with the incident investigation. Finally he finds out which servers are used to operate the
WarehouseAndOrders application, and both are Linux servers! Thank the IT God, Pete is a Linux fan
and a skilled Linux programmer, so he wants to start the investigation, but a missing account
immediately stops him. Pete is proactive and calls John (the system admin) to get any account to get
in. John, as a good friend, shares the root account, on the assumption that Pete will create a personal
one and upload some new movies and mp3s there. Why the hell would he otherwise ask for
access to this server?




Pete skips lunch today because of his heavy workload. Tuesday afternoon brings the following steps.
Pete logs in to the Linux server as root and searches for the WarehouseAndOrders program directory and
the underlying applications and database servers. He plans to investigate the logs to learn more about
the situation, but when starting MC (Midnight Commander) he accidentally notices that the server
hard drive is full. Because Pete is busy but wants to help Mary at the same time, he does not bother
creating his own account and setting the rights but just deletes some temp and log files as root. He
calls John to restart the Oracle DB and the Apache Tomcat web server; both were down and are
used by the WarehouseAndOrders application. In fact, Pete does not want to waste time looking
for the admin interface. What is more, it’s John’s responsibility anyway. John is confused by this
request (Pete does not usually work for the customer using those servers), but he does what is
asked without any notice to end users. After the restart, John tells Pete to check whether everything
works as expected.






Pete can now call Mary to say that WarehouseAndOrders v1.1 is running again. Mary is
very grateful, thanks Pete and starts to process the orders waiting in the queue. Pete forgets the whole
story and continues with his assignment. The build needs to be tested and packaged for tomorrow’s
deployment. Pete stays in the office until 8 pm again to finish all the required steps.

Wednesday morning looks like an ordinary day, with Mary processing the orders in the queue. After 2
hours the same incident occurs again, which makes Mary angry. She calls Pete to ask whether he knows
anything about the issue; maybe he is improving the application, she assumes. But nobody answers
the office phone. The reason is obvious to our reader, but not to Mary: Pete is traveling to the customer’s
premises to install the new release, because it cannot be done remotely. Mary is not doomed to
wait, since she calls Pete’s mobile phone and explains the situation. Pete contacts John and
quickly briefs him on the issue and its context. John finally gets the point of why Pete asked
for a Linux account and a server restart. The reason was not mp3s or new movies but an incident!
Thanks to this, John knows some context, the servers used and the symptoms of the issue. Pete continues
with the installation of the customer release, finally without any disturbances. John starts investigating
the IT environment and notices the full server disk. He backs up selected log files to a different
server for further analysis, deletes the originals and restarts the Oracle DB and only the failed
instances of the Apache Tomcat web server. He tries the WarehouseAndOrders application and sees
everything working, but he does not contact Mary until he is sure the incident will not occur
again.

John, as system admin, is surprised by the full server disk. There cannot be that many movies and mp3s
stored on the server, he thinks aloud… He postpones lunch and starts investigating the incident’s
deeper root cause. How can the server disk be full? He writes a workaround script that regularly backs
up selected log and temp files to a different server and removes the original files afterwards.
John wants to download the log files to his computer for further investigation and
analysis and, only because of the long download time, notices a very big Oracle DB log!
How the heck can today’s Oracle log be almost 3 GB? He opens the log in the original server
folder and after a few minutes of investigation notices programmer’s error reports in it. After this
finding he updates the workaround script to cover this log as well. Then the script is quickly tested with
the expected result, so nothing hinders its deployment. Only after this does John call Mary to tell her she
can use the application again. John still wonders what error can cause such a huge Oracle log and whether
it is the only contributor to the full disk. He searches Internet forums to see whether somebody has already
tackled a similar issue, but finds nothing. He reports the defect to Oracle Corporation and waits for a
reply. Finally he can go for his Wednesday lunch.






Scenario conclusion: Although some actions described in this scenario may seem striking and funny, many
IT and non-IT organizations work this way. And if you discuss the topic with them and point out some
anti-patterns, they are not aware of anything weird and are surprised by your statements about
efficiency and potential risks. Moreover, this story is our personal experience from previous assignments.
Let’s sum up the story:
         - Mary could not process orders for almost 2 days. This could affect the company’s cash flow and
           reputation or even generate losses, but nobody cared.
         - Pete was frequently disturbed, switched context and was overloaded.
         - John, as system administrator, started to investigate the hidden root cause of the incident
           (doing his job) only 2 days after the incident was first discovered.
         - Due to the disturbances and Pete’s tiredness, the build could contain unnoticed defects.
         - Pete accessed restricted production servers as root and deleted files there as root.
         - The same incident occurred again within a short time and affected the end user.
         - The hidden root cause generating the incident is still not uncovered and resolved.







       ITIL v3 scenario
Let’s go through the same story following ITSM principles. This is how it looked after 3 months of
implementation effort. The same stars perform this story, but the approach to incident resolution is
different. We focus on Service Transition and Service Operation activities again.

The situation with IT systems is the same as described in the first scenario. We start the story on
Monday morning again, when Mary enters the office and starts to use the Orders&Warehousing IT
service [3], no longer the WarehouseAndOrders application. She does not need to care about the
different parts of the service, start a program client or renew licenses. She just uses her browser
and a link to run the Orders&Warehousing IT service. The IT service works as expected; no warning
symptoms occur. Standard monitoring and event reporting [4] is set up and running at the same
time. Business users, Mary among them, do not even know about this monitoring. IT specialists
together with Service Desk specialists set the thresholds for specific components, servers and
their events. These events can trigger deeper investigation by a specialist or can automatically
report an incident. This morning the monitoring system started to report several “lack of free disk
space” events on the Orders&Warehousing IT service server [5]. Service Desk specialists started to
investigate those events, but meanwhile the Orders&Warehousing IT service froze and stopped
responding.
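
A threshold check of this kind can be sketched in a few lines. The following is a minimal illustration, not the monitoring tool from the story: the threshold values, the Event record and the function name are assumptions, and only shutil.disk_usage is a real standard-library call.

```python
import shutil
from dataclasses import dataclass
from typing import Optional

# Hypothetical thresholds; in the story they are agreed by IT and Service Desk specialists.
WARNING_FREE_RATIO = 0.20   # below 20 % free: raise a warning event
INCIDENT_FREE_RATIO = 0.05  # below 5 % free: report an incident automatically


@dataclass
class Event:
    source: str
    severity: str  # "warning" or "incident"
    message: str


def check_free_disk_space(source: str, total_bytes: int, free_bytes: int) -> Optional[Event]:
    """Return an event if free space is under a threshold, otherwise None."""
    free_ratio = free_bytes / total_bytes
    message = f"lack of free disk space ({free_ratio:.0%} free)"
    if free_ratio < INCIDENT_FREE_RATIO:
        return Event(source, "incident", message)
    if free_ratio < WARNING_FREE_RATIO:
        return Event(source, "warning", message)
    return None


# Check the disk of the machine this sketch runs on.
usage = shutil.disk_usage("/")
print(check_free_disk_space("local server", usage.total, usage.free))
```

A warning event would prompt a specialist to take a look, while an incident event would open an incident record without waiting for a user to report it.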

Mary reports the incident using the Service Desk (SD) tool. The SD, reachable through the tool or by
phone, is the single point of contact (SPOC) for communication with the IT service vendor. The
reported incident record contains the incident description (observed symptoms), the priority for the
user (e.g. only one user of the service affected, a department or team affected, or the whole
company affected) and the name of the service chosen from the list of provided services. This action
automatically notifies the relevant service (and/or customer) Incident
Manager. The Incident Manager does the first check of the incident record and assigns the expected
category (e.g. hardware, network, application, premises, licenses) and a priority that reflects not only
the end user’s perception but also other services and the business impact. The resulting priority in this
case is high although only Mary uses the service, because the service supports processing of incoming
orders and its unavailability can stop the assembly line and affect the company’s business and cash flow.
Adam is assigned to this incident because he is marked as free in the Service Desk dashboard; he is
automatically notified about it, and so is Mary. All these steps take just a few minutes,
approximately the time it takes to read this page.
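
The incident record and the prioritization step can be sketched as a simple data structure. The field names, the scope values and the priority rule below are assumptions made for illustration; ITIL does not prescribe them, and a real Service Desk tool will have its own model.

```python
from dataclasses import dataclass
from enum import IntEnum


class Scope(IntEnum):
    """How many users the reporter sees as affected."""
    SINGLE_USER = 1
    TEAM = 2
    COMPANY = 3


@dataclass
class Incident:
    reporter: str
    service: str
    description: str
    user_scope: Scope
    category: str = "unassigned"
    priority: str = "unassigned"


# Hypothetical list of services whose outage directly threatens the business.
BUSINESS_CRITICAL_SERVICES = {"Orders&Warehousing"}


def assess(incident: Incident, category: str) -> Incident:
    """First check by the Incident Manager: set category and resulting priority."""
    incident.category = category
    if incident.service in BUSINESS_CRITICAL_SERVICES or incident.user_scope is Scope.COMPANY:
        incident.priority = "high"
    elif incident.user_scope is Scope.TEAM:
        incident.priority = "medium"
    else:
        incident.priority = "low"
    return incident


marys_incident = assess(
    Incident("Mary", "Orders&Warehousing", "Service froze and does not respond", Scope.SINGLE_USER),
    category="application",
)
print(marys_incident.priority)  # high: a single user, but a business-critical service
```

The point of the rule is the one made in the story: the resulting priority depends on business impact, not only on how many users report the incident.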




[3] An IT service is a means of delivering value to the customer using IT resources. The customer gets the
specific outcomes needed to run the business without owning and managing the costs and risks connected to
IT. The customer does not care about software, hardware, networks, licenses, premises, people, upgrades
and patches or monitoring. The customer just buys the IT service as a commodity, and an external or
internal vendor takes care of operations, support and maintenance.
[4] Typically log changes, state monitoring or user events are processed to trigger incidents.
[5] These activities are performed as part of Event Management and are tightly connected with monitoring
and monitoring systems.





                               Incident reporting example using Outlook








                              Incident reporting example using Jira tool

Adam reads the notification and immediately starts investigating the incident. The first steps
are the following checks:
       - Checking the Knowledge Base (KB), which contains solutions to existing problems and incidents.
         If well structured, readable and user friendly, the KB can ease and speed up incident
         resolution as well as knowledge sharing within the team.
       - Checking the Configuration Management System (CMS), which contains the description, version,
         location and relationships of IT infrastructure components (end user stations, servers,
         accessories). Such a system can significantly help with incident localization (which server
         or station is used by this service and what are its configuration and versions) and root cause
         identification.
       - Checking the records and events of the automated monitoring tool (Event Management
         records). These functions are often performed by a specific team or department called
         the Control Desk.

These tools allow quicker incident resolution and also mean that Service Desk specialists need less
technical skill (the needed information is stored in the tool and does not have to be dug out in a
complicated way). Thanks to the CMS and the IT service catalogue, Adam knows which components are
used to operate the Orders&Warehousing IT service (see the following table and figure).



IT service name       Users         Responsibilities                          Configuration Items (CI)
Orders&Warehousing    Mary,         Users:                                    WarehouseAndOrders v1.1
                      Management    - Reporting incidents using the           Tomcat 6
                                      Service Desk (tool or phone)            Oracle 9i
                                    - Participating in regular monthly        Red Hat Enterprise Linux 5
                                      SLA reviews                             HW Server Prague
                                    - …                                       Net Switch S1
                                                                              Net Switch S2
                                                                              Intranet
Internet              All users     See internal rules for using the          Internet Service Provider
                                    Internet (link to intranet document)      Firewall Zone v3.2

       Example IT service catalogue records. The Configuration Items column is visible only to the IT vendor.




   CMS part: visual information about the IT service infrastructure (basically the Configuration Items
                                  column of the table above, visualized)
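
The lookup Adam performs can be pictured as a mapping from an IT service to its Configuration Items. The sketch below hard-codes the two example catalogue records from the table; a real CMS is a database with versions, locations and relationships, so the structure and function names here are assumptions.

```python
# Example catalogue records copied from the table; the dictionary layout is hypothetical.
SERVICE_CATALOGUE = {
    "Orders&Warehousing": [
        "WarehouseAndOrders v1.1", "Tomcat 6", "Oracle 9i", "Red Hat Enterprise Linux 5",
        "HW Server Prague", "Net Switch S1", "Net Switch S2", "Intranet",
    ],
    "Internet": ["Internet Service Provider", "Firewall Zone v3.2"],
}


def configuration_items(service: str) -> list[str]:
    """Which components are used to operate this service?"""
    return SERVICE_CATALOGUE.get(service, [])


def services_using(ci: str) -> list[str]:
    """Reverse lookup: which services are affected if this component fails?"""
    return [service for service, items in SERVICE_CATALOGUE.items() if ci in items]


print(configuration_items("Orders&Warehousing"))
print(services_using("Oracle 9i"))
```

The forward lookup narrows down where to search for the cause of an incident; the reverse lookup helps to judge the impact of a failing or changed component.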

Events in the IT infrastructure show insufficient (no) free disk space on the core operational server
of the IT service. Adam backs up temp and log files and starts to investigate only the Oracle database
and Tomcat web server logs, because he knows from the IT service catalogue that these are
used by the service. Thanks to the monitoring tools Adam also knows that only this service is down.
He notices an oversized Oracle DB log consuming several GB of disk space. He backs up and deletes
the Oracle log, restarts the service and tries its functionality. At the same time he also creates
an automatic script that backs up and deletes the original Oracle DB log file at a regular interval (a
so-called workaround). He verifies and installs the script, restarts the Oracle DB and the relevant
Tomcat instance, and checks the monitoring tools, the IT service functionality and the backed-up file.
Everything works, so Adam creates a problem record in the SD tool and assigns it, together with a link to
the Oracle log file, to the Oracle group that solves Oracle-related problems. The problem ticket is raised to
solve the deeper root cause. Adam only used an interim workaround for the incident that allows running
the IT service again. But why is the Oracle log so huge? What causes this? How can it be fixed? These
questions are still unanswered. As the final step, Adam updates the incident record (work log and
solution) and closes it. Mary is notified about the solved incident via e-mail, so she knows she can
start to use the service again. Mary needs to try the service and, if the solution is OK, accept
the incident solution (or acceptance can happen automatically after some period of time, so as not to
annoy the end user). Mary accepts the solution because everything works well.
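
A workaround of the kind Adam installs could look roughly like the sketch below. The file names are hypothetical, and the script copies the log to a backup directory and then empties the original rather than deleting it; that is an assumption chosen here because a running process usually keeps its log file open. In practice such a function would be run at a regular interval by a scheduler such as cron.

```python
import shutil
import tempfile
import time
from pathlib import Path


def rotate_log(log_file: Path, backup_dir: Path) -> Path:
    """Copy log_file into backup_dir under a timestamped name, then empty the original."""
    backup_dir.mkdir(parents=True, exist_ok=True)
    stamp = time.strftime("%Y%m%d-%H%M%S")
    backup_path = backup_dir / f"{log_file.name}.{stamp}"
    shutil.copy2(log_file, backup_path)  # raises if the copy fails, so the original stays intact
    log_file.write_text("")              # truncate instead of delete (assumption, see above)
    return backup_path


# Demonstration in a temporary directory with a made-up log file.
with tempfile.TemporaryDirectory() as tmp:
    log = Path(tmp) / "database.log"
    log.write_text("programmer's error report\n" * 1000)
    backup = rotate_log(log, Path(tmp) / "backup")
    print(backup.name, backup.stat().st_size, log.stat().st_size)
```

As in the story, this only keeps the disk from filling up. It does not explain why the log grows so quickly, which is why a problem record is still needed.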

It’s still Monday, but already after lunch. Adam creates a Knowledge Base record describing
this incident and its symptoms and attaches the workaround (script). This KB record is linked
to the original incident record and to the created problem record too. The goal of the KB record is to
speed up the resolution of similar incidents in the future.

We used the Service Desk function, or tool, and the Event and Incident Management processes to register
and process the incident record. Only the incident was solved in the story; the root cause is still unclear.
The reader may have noticed how appropriate tools and monitoring can make the incident management process
much more efficient and quick. Thanks to them, the incident is processed in several minutes and resolved in
tens of minutes. Mary can continue with her work and there is no significant impact on the company’s
business (at least not 3 days as in the previous story). But our job is not done yet. We need to uncover and
solve the problem (the ITIL term for an unknown root cause) causing the incident. Let’s continue with the
story, then, to uncover the hidden problem using the Problem Management process and implement the change
using Change and Release and Deploy Management. The whole lifecycle and the process relations are depicted
in the following figure:




       Relations of ITSM Service Operation and Service Transition processes introduced in our story

Since Adam created a problem ticket in the Service Desk assigned to the Oracle database group, a Problem
Management team is formed on demand. This team consists of skilled and experienced
administrators and database programmers who are involved only in more complicated issues
(Level 2 and 3 in the Service Desk hierarchy model). The reason is the labor cost of those professionals.
Rachel, an Oracle specialist, is notified as Problem Manager and starts to investigate the problem
record as well as the incident record with the workaround, the Knowledge Base description and, mainly,
the linked Oracle log file. Thanks to her knowledge of a “standard” Oracle log, she quickly uncovers
programmer’s error reports in this log. She’s surprised how this could happen,
because she has never experienced it before. Rachel logs in to the Oracle defect reporting tool
(the maintenance fee grants access to this database) and searches for the issue, but finds nothing.
She is allowed to create a defect in the Oracle defect tool, so she does: she describes the log issue and
attaches a log snapshot to demonstrate it. After several days Rachel receives a reply from Oracle
informing her about a new patch released by Oracle to fix this defect. Rachel creates a request for
change (RfC) to implement this patch in the operational environment. The RfC includes the description,
reason, importance and impact of the new patch.

Now we have reached the moment when the root cause has been identified and a solution exists. Before
releasing to the production environment we need to approve the request (there could be upcoming conflicting
or dependent changes), test it (there can be other contributors to this root cause) and finally deploy it.
The Change, Release and Deploy Management processes and roles are responsible for these steps. Change
assessment, testing and deployment could look like the activities in the following part of the story.

The change request is assessed and approved by Change Manager Mike because no conflict or
dependency with upcoming changes was found, implementation costs are very low and we save
backup disk space once the workaround is removed. The uncovered root cause and the proposed
solution are structural: the patch solves the issue at low cost and allows the workaround to be removed.
The Oracle patch is first installed and tested in the testing environment (a mirror copy of the production
environment) and is ready for production deployment only after all tests have finished and no other
symptoms are observed. It seems that the Release and Deployment team can now finally distribute and
deploy the patch to the production environment. But before they proceed with this step they need to
prepare a strategy called a rollback plan. The Orders&Warehousing IT service is so important that the IT
vendor cannot afford another incident in a row (it would definitely affect the SLA6). The rollback plan
gives the team a strategy to use if the patch deployment fails. If that happens, we have to be able to
restore the previous working version and configuration. A necessary input for the rollback plan is again
the CMS, which contains information about the current versions of software and hardware systems and
their configurations, and provides information about the authorized storage of source, configuration and
executable files.
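
The rollback idea can be sketched in a few lines: read the currently recorded version from the CMS before deploying, and reinstall it if the deployment or the verification fails. This is only a sketch under stated assumptions. The in-memory `cms` dictionary, the item name, the version strings and the `fake_install` function are hypothetical stand-ins for a real CMS and real deployment tooling:

```python
from typing import Callable, Dict


def deploy_with_rollback(
    cms: Dict[str, str],
    item: str,
    new_version: str,
    install: Callable[[str, str], None],
    verify: Callable[[str], bool],
) -> bool:
    """Install new_version of item; reinstall the CMS-recorded version on failure."""
    previous = cms[item]                 # rollback target read from the CMS
    try:
        install(item, new_version)
        if not verify(item):
            raise RuntimeError("verification tests failed")
    except Exception:
        install(item, previous)          # rollback to the last known working version
        return False
    return True


# Hypothetical usage with stand-in functions instead of real deployment tooling.
installed: Dict[str, str] = {}


def fake_install(item: str, version: str) -> None:
    installed[item] = version


cms = {"oracle-db": "11.2.0.1"}          # made-up item name and version
ok = deploy_with_rollback(
    cms, "oracle-db", "11.2.0.2", fake_install, lambda item: False
)
print(ok, installed["oracle-db"])
```

The function deliberately leaves the CMS untouched. Updating the recorded version after a successful deployment is treated as a separate step, as in the story.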

Now we are finally ready to deploy the patch to the production environment (really done-done).
Deployment is done during the agreed so-called maintenance window. The IT vendor can make changes
and stop services for maintenance purposes only during this time. In this case it is from 2.00 am to
3.00 am. When the team has deployed the patch and run the production verification tests, they remove
the existing workaround (the backup and delete script) together with Rachel. After a check, Mike closes
this RfC as successfully implemented. Rachel now updates the problem solution (Oracle patch) and
closes the problem as successfully solved as well. She still needs to update the Knowledge Base
record to have all information synchronized. After that she’s done.
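
A deployment script could guard itself with a check against the agreed maintenance window. The sketch below uses the 2.00 am to 3.00 am window from the story. Treating the end of the window as exclusive, ignoring time zones and assuming the window does not cross midnight are all simplifying assumptions:

```python
from datetime import datetime, time

WINDOW_START = time(2, 0)   # 2.00 am, from the story
WINDOW_END = time(3, 0)     # 3.00 am, from the story


def in_maintenance_window(moment: datetime) -> bool:
    """True if moment falls inside the window (start inclusive, end exclusive)."""
    return WINDOW_START <= moment.time() < WINDOW_END


# The date is arbitrary; only the time of day matters here.
print(in_maintenance_window(datetime(2011, 8, 1, 2, 30)))
print(in_maintenance_window(datetime(2011, 8, 1, 3, 0)))
```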

But this is not the end of the story yet. Now there is a discrepancy between the real production
environment (Oracle database patch – micro version change) and the information about it in the CMS.
We need to update this information in the CMS and the IT service catalogue to keep these tools useful.



6
 SLA (Service Level Agreement) defines the agreed quality parameters and the conditions under which the
service is provided. It is usually an appendix to the contract, because it is not a formal contract itself.

The update can be done manually7 or using an automated tool8, depending on the vendor’s automation
maturity.
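
Whichever way the update is done, the discrepancy itself can be found by comparing what the CMS records with what is actually installed. A minimal sketch, assuming both sides can be exported as simple name-to-version mappings; the item names, versions and placeholder strings are made up for illustration:

```python
from typing import Dict, Tuple


def find_discrepancies(
    cms: Dict[str, str], discovered: Dict[str, str]
) -> Dict[str, Tuple[str, str]]:
    """Map each differing item to a (recorded, discovered) version pair."""
    diffs: Dict[str, Tuple[str, str]] = {}
    for item in sorted(set(cms) | set(discovered)):
        recorded = cms.get(item, "<not in CMS>")
        actual = discovered.get(item, "<not found>")
        if recorded != actual:
            diffs[item] = (recorded, actual)
    return diffs


# Hypothetical data: the patch changed the database micro version only.
cms_records = {"oracle-db": "11.2.0.1", "app-server": "7.0"}
production = {"oracle-db": "11.2.0.2", "app-server": "7.0"}
print(find_discrepancies(cms_records, production))
```

An empty result would mean the CMS matches production and no update is needed.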




                                        Simple Rollback plan example


If we compare the first and second scenarios, we can see a big difference. Using more formal ITSM/ITIL
procedures supported by automated tools allowed us to process all necessary activities more efficiently and
without needless emotions. We also solved the deeper root cause with a structural solution, not just an
interim one that makes the IT infrastructure and its support and maintenance more complex. But do not take
these statements as a rule or the only truth. There is a hidden trap in shifting our way of working from an
informal, ad hoc one to a process-oriented one. The trap is omitting or suppressing the human aspect
and becoming just a ticket-driven machine, so commonly seen in big corporations.

Anyway, we can summarize the scenario as follows:

        • The incident was closed much earlier than in the first scenario.
        • The people responsible for solving the incident did the job; no other IT roles, e.g. programmers,
          were disturbed.
        • The people involved in ITSM activities knew what to do and how to do it (this was also boosted
          by proper process automation).

7
 We recommend a simple checklist as part of the change record (or work log) that enforces or reminds of the manual update.
8
 The update can happen without any manual action (the monitoring system detects the change in the infrastructure and
updates the information) or semi-automatically (a manual trigger for an automatic IT infrastructure audit).

        • Investigation of the incident root cause and design of a structural solution (not just accepting
          the workaround) started on the very first day, with the aim of preventing recurring incidents.
        • Proper automated tools (have you noticed, no Excel was mentioned ;)) sped up diagnosis,
          information gathering and the incident resolution process. Every step is recorded in the Service
          Desk tool, so it is easy to track or report all steps and actions performed.
        • The updated, user-friendly knowledge base (KB) could help with solving similar incidents and problems.




The whole story following the ITSM/ITIL v3 processes is depicted in the following picture:




  Flow starting with an identified incident (also reported events) and ending with the implementation of a
              structural solution to the identified problem (a workaround is not a final destination)



Story conclusion
As a result of this significant incident affecting the Orders&Warehousing IT service, an extra SLA review
meeting between IT and business was held. The scope of this meeting was not only to review the thresholds
and actual values of the service quality attributes but also the possible financial losses caused by this
significant incident. It triggered additional actions on the IT vendor’s side that should lead to a better
understanding of the business, improved capacity and load predictions, and proactive steps uncovering
potential problems (in ITIL terminology). But these steps are already a trailer for the upcoming second
part of this ITIL v3 story ;)




Change history

Version          Date          Author                  Change history

V1.0             August 2011   Jarek Procházka         First English version created




Differ!                                                                    www.differ.cz
Improve your IT development, support, maintenance and operation
using Agile and Lean practices

   Articles and experience
      Agile and Lean IT development, support and maintenance
      Human aspect in IT
      Agile and Lean management

   Practical templates and checklists

   Books review

   Free e-books
      ITIL in practice
      Experience from projects

   Services
      Creative workshop
      Lean workshop
      Consultations




ITIL v3 story

  • 1. www.differ.cz Story of support and maintenance according ITIL v3, part I. Story of support and maintenance according ITIL v3 Part I. Operational activities Jaroslav Procházka www.differ.cz version 1.0 August 2011 © 2011 Jaroslav Procházka, www.differ.cz Page 1
  • 2. Story motivation: We are nowadays driven by strong rationality (logical, rational, scientific, verifiable facts matter) and forget the irrational aspects and emotions in human decision making. If humans are rational, why the hell do they buy Apple products? ;) The same holds for stories and their power. Stories have been part of our cultures for many thousands of years and are the best way to transfer knowledge; see sociological, psychological or cognitive studies, e.g. Campbell: The Hero with a Thousand Faces or Turner: The Literary Mind. You know, all the old epics, the Bible or the story of Buddha are stories that attract us; we like to hear the same variations of the hero’s journey again and again. And it is stories that can differentiate us, our service, product or company from the many other vendors providing the same. Stories matter. Another application of stories in business is in the knowledge management and sharing domain. Be honest, how often do you use your logical-structured-fact-based Knowledge Base? How easy is it to remember the content, steps and outcomes of such a record in the longer term? And now compare it with the story of a colleague dramatically describing the same situation (you could hear it in the kitchen, during lunch, in the pub). Which one is easier to remember and follow? Big and respected companies like XEROX [1], 3M or NASA use stories as the approach to store and share knowledge inside the company. Storytelling is also part of modern leadership. The next motivation factor that triggered me to write this e-book is how hard process frameworks like IBM RUP® or ITIL® are to understand. Such misunderstanding causes problems with support, operations and maintenance of IT infrastructure, leading to weak quality and revenue and to dissatisfied teams and customers. The goal of this e-book is to spread service-driven (ITSM) philosophy and service thinking using stories.
The story is focused on principles and concepts described by ITIL v3. We start with an end user affected by some issue and also solve the hidden root cause (in ITIL terms, Incident and Problem Management). Proactive investigation of root causes is a weak point of many teams and companies. We’ll emphasize the key ideas of this approach, no matter whether you call it Problem Management, Kaizen, TQM or CMMI. Part of the story is also Configuration Management, which takes care of IT infrastructure items, Change Management, which processes change requests, and Release and Deploy Management, which builds the change. The second part of this story, which will follow soon, is focused on tactical and strategic activities of ITSM, namely service thinking, connection to business and its scenarios, predictions, proactive thinking, contracting and service measurement (the so-called SLA). This is the core of ITSM/ITIL thinking. [1] See more at http://choo.fis.utoronto.ca/mgt/KM.xeroxCase.html and http://www.kmworld.com/Articles/Editorial/Feature/Best-Practices-Eureka!-Xerox-discovers-way-to-grow-community-knowledge.-.-And-customer-satisfaction-9140.aspx
  • 3. Note: This e-book does not replace ITIL training or certification, nor will it alone enable you to set up the right environment. Its goal is to raise awareness of ITIL version 3: what it is, how it can help with solving my issues and what the differences from version 2 are. This material could give insight into ITIL to busy people, typically sales people, customer representatives, customers, architects, higher managers and other key people. If you have any comments, improvement proposals or ideas how to improve this e-book, or you would like your domain to be covered in the story as well (only application management as an incident contributor is covered), please send them to me via email (jarek@differ.cz). I also hope that this short story inspires you to write your team/unit knowledge base in the form of short stories! They are more memorable and writing them is fun, so they bring better value to their creators and consumers ;)
  • 4. A short introduction: ITSM means IT Service Management, so the story mostly covers an introduction to the concept of IT service thinking and the operational activities connected to this concept. The best known and most used ITSM framework nowadays is called ITIL (IT Infrastructure Library), the library guiding us in IT infrastructure management covering software, hardware, networking, people etc. ITIL brings a process approach to ITSM, and its key benefit is the definition of a common terminology, which is very important for communication between IT and business and among different vendors in the chain. Nowadays (July 2011), ITIL exists in version 3, but a new refresh is being prepared for release and will be called ITIL 2011. The key difference between versions 2 and 3 is the newly introduced lifecycle of the service (see picture), starting with its idea and strategy (Service Strategy phase) and ending with daily use and support (Service Operation). The story described in this e-book covers concepts and processes of version 3, specifically the Service Transition and Service Operation parts. Daily operations and support deal with necessary activities such as monitoring, data backups, implementation of law amendments (e.g. in ERP applications) or reflecting changes in the assembly process of production and assembly lines. Recording, assessing, implementing, testing and integrating a change or new functionality is one part of those necessary activities. But specific actions also need to be performed in case of an application/service incident that affects end users and thus the value we provide to the customer. Depending on the number of users affected or the importance of the application, the cost of an incident can be really huge: e.g. 30 minutes of a stopped assembly line can mean 20 cars not assembled and delivered = 20 cars x 15,000 EUR per car = 300,000 EUR of losses in just 30 minutes!
A specific domain that needs our attention and automation is physical changes in IT infrastructure: hardware or network. If not secured and automated properly, they can cause severe incidents with huge financial impact. The following box shows a more detailed example calculation of the cost of incident impact:
  • 5. Simple incident cost calculation: employee cost 100 EUR/h; headcount 200 in total; incident length 3 h. 30 people cannot work for 3 hours because of a system incident. The cost of the impact can be simply calculated: 30 employees x 3 h x 100 EUR/h = 9,000 EUR of costs in one day that generate no value for the customer! If you multiply this number by the total number of incidents per year, you can get a pretty high number that could cover e.g. the yearly IT budget or the cost of a totally new system or assembly line. Because of this, we also need early identification and uncovering of incidents with a high level of automation. The Yang to this Yin is built-in proactive root cause identification and solution (so-called Problem Management). The necessary backend functionality supporting efficient monitoring and problem management provides knowledge about the infrastructure: hardware and software configurations, software versions, licenses, people’s locations, access rights policies etc. Advanced teams use a (semi)automated knowledge base that stores and proposes already solved issues, incidents, problems or complicated changes with many dependencies. Starring: Mary, a business user affected by a system incident; Pete, an application programmer; John, a system administrator; Adam, a Service Desk support specialist; and other stars…
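The simple incident cost calculation from the box on slide 5 can be sketched in a few lines of Python. The function name and the yearly incident count are illustrative assumptions; only the figures 30 people, 3 hours and 100 EUR/h come from the text.

```python
def incident_cost(affected_people, hours, hourly_cost_eur):
    """Cost of lost working time: people x hours x hourly cost."""
    return affected_people * hours * hourly_cost_eur


# Figures from the box: 30 people idle for 3 hours at 100 EUR/h.
single_incident = incident_cost(30, 3, 100)
print(single_incident)  # 9000

# Hypothetical assumption: 25 comparable incidents per year.
incidents_per_year = 25
yearly_cost = single_incident * incidents_per_year
print(yearly_cost)
```

Even such a rough estimate is usually enough to start the discussion about what monitoring and automation are worth.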
  • 6. Typical scenario of daily operations: Until now, Mary kept paper records of incoming orders. Although her company had implemented an information system for the assembly line and the economic agenda, order processing was not part of the project. Paper records are not very efficient and cause problems if some order needs to be found quickly. Archiving is also a bit problematic: orders fade and become harder and harder to read. Mary is happy, because order processing was recently automated by a software program and integrated into the assembly line information system. Rework, searching and archiving issues are now reduced to almost zero and Mary can enjoy her work. It’s Monday morning. Mary uses an application called WarehouseAndOrders v1.1 to process orders for the assembly line, but after one hour of work the software client crashes and she is not able to run it again. So she calls her friend Pete, an application programmer, to help her solve this incident [2]. Pete is employed by the IT company delivering and operating this application and knows Mary from university times. Actually, they are still friends and meet regularly. Pete is happy to hear from Mary again, so they have a little chat and, by the way, Mary also mentions the incident. Pete makes a note, but forgets it immediately because of the heavy load caused by an upcoming release. Mary awaits a resolution from Pete and performs some unimportant tasks so as not to be bored. She reminds him at Monday’s lunch, which they usually have with the whole old university group. To save face, Pete asks about the symptoms Mary observed in the morning (any error message, behavior of the system etc.). But after a few of Mary’s comments Pete immediately talks about something else. You know, it has been a few hours since the incident happened, so Mary doesn’t remember anything significant and Pete is annoyed by it.
It is no surprise that Pete continues with development tasks after lunch and forgets about Mary and her issue. Mary still cannot use the application and process incoming orders. [2] We’ll start to differentiate between the key terms incident and problem, because they have totally different meanings in ITIL terminology. An incident is an event causing availability or quality problems of an IT service, or a part of it, as perceived by the end user. It could concern response time, number of processed transactions, volume, inaccessibility of the service etc. An incident can usually be solved by a so-called workaround (typically a server or application restart), but this solution does not remove the hidden root cause! It only allows the service to operate again at the agreed quality. A problem is the hidden root cause of one or more incidents, which may or may not be evident already. A problem can be solved only by a structural solution, e.g. a change in IT infrastructure or a bugfix in the source code of a software application.
  • 7. It is Monday afternoon and Mary calls Pete again to hear about the progress. Pete gets angry because Mary keeps interrupting him. He needs to finish build testing for the upcoming deployment. Pete wants to get rid of Mary, so he stops testing and starts investigating the incident. Mary is still waiting and performing unimportant tasks, and incoming orders are not processed. Mary realizes that this day will not bring a solution and goes home early. Pete stays until 8 pm, busy identifying the infrastructure: what are the parts of this nasty program? Which servers are used to operate it? What middleware, databases and other connectors does it use? Tuesday morning is an important deadline for Pete: he needs to finish the new release package for deployment. This is why he comes in earlier in the morning even though he finished late the day before. Mary arrives later this morning, to be sure that the incident is already solved and her time is not wasted. Pete focuses on finishing build testing and packaging. When ready, he continues with the incident investigation. Finally he finds out which servers are used to operate the WarehouseAndOrders application, and both are Linux servers! Thank the IT God, Pete is a Linux fan and a skilled Linux programmer, so he wants to start the investigation, but a missing account immediately stops him. Pete is proactive and calls John (the system admin) to get any account to get in. John, as a good friend, shares the root account on the assumption that Pete will create a personal one and upload some new movies and mp3s there. Why the hell would he otherwise ask for access to this server? Pete skips lunch today because of his heavy load. Tuesday afternoon brings the following steps. Pete logs in to the Linux server as root and searches for the WarehouseAndOrders program directory and other underlying applications and database servers.
He plans to investigate the logs to learn more about the situation, but when starting MC (Midnight Commander) he accidentally notices that the server hard drive is full. Because Pete is busy but wants to help Mary at the same time, he does not bother with creating his own account and setting the rights, but just deletes some temp and log files as root. He calls John to restart the Oracle DB and the Apache Tomcat web server; both were down and are used by the WarehouseAndOrders application. In fact, Pete does not want to waste time looking for the admin interface. What’s more, it’s John’s responsibility anyway. John is confused by this request (Pete does not usually work for the customer using those servers), but he does what is asked without any notice to end users. After the restart, John informs Pete so that he can check that everything works as expected.
  • 8. Pete can now call Mary to tell her that the WarehouseAndOrders application v1.1 is running again. Mary is very grateful, thanks Pete and starts to process the orders waiting in the queue. Pete forgets the whole story and continues with his assignment. The build needs to be tested and packaged for tomorrow’s deployment. Pete stays in the office until 8 pm again to finish all required steps. Wednesday morning looks like an ordinary day, with Mary processing the orders in the queue. After 2 hours the same incident occurs again and it makes Mary angry. She calls Pete to ask if he knows anything about the issue; maybe he’s improving the application, she assumes. But nobody answers the office phone. The reason is obvious to our reader, but not to Mary. Pete is travelling to the customer’s premises to install the new release, because it cannot be done remotely. Mary is not doomed to wait, though: she calls Pete’s mobile phone and explains the situation. Pete contacts John and quickly briefs him on the issue and its context. John finally gets the point of why Pete asked for the Linux account and the server restart. The reason was not mp3s or new movies but an incident! Thanks to this, John now knows some context, the servers used and the symptoms of the issue. Pete continues with the installation of the customer release, finally without any disturbances. John starts investigating the IT environment and notices the full server disk. He backs up chosen log files to a different server for further analysis, deletes the original ones and restarts the Oracle DB and only the failed instances of the Apache Tomcat web server. He tries the WarehouseAndOrders application and sees everything working, but he does not contact Mary until he is sure the incident will not occur again. John, as the system admin, is surprised by the full server disk. There cannot be so many movies and mp3s stored on the server, he thinks aloud… He postpones lunch and starts investigating the deeper root cause of the incident.
How can the server disk be full? He writes a workaround script that will regularly back up chosen log and temp files to a different server and remove the original files afterwards. John wants to download the log files to his computer for further investigation and analysis, and accidentally notices a very big Oracle DB log (only because of the long download time)! How the heck can today’s Oracle log have almost 3 GB? He opens the log in the original server folder and after a few minutes of investigation notices programmer’s error reports. After this finding he adds this log to the workaround script as well. Then the script is quickly tested with the expected result, so nothing hinders its deployment. Only after this action does John call Mary to tell her she can use the application again. John still wonders what error can cause such a huge Oracle log and whether it is the only contributor to the full disk. He searches Internet forums to see if somebody has already tackled a similar issue, but finds nothing. He reports this defect to Oracle Corporation and waits for a reply. Finally he can go for Wednesday’s lunch.
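John's workaround script is not shown in the e-book. A minimal Python sketch of the idea (copy chosen log and temp files to a backup location, then remove the originals) could look as follows. The function name, the file patterns and the use of a locally reachable backup directory (for example a mounted share of the other server) are assumptions made for illustration.

```python
import shutil
from pathlib import Path


def archive_and_remove(log_dir, backup_dir, patterns=("*.log", "*.tmp")):
    """Copy matching files to backup_dir and delete the originals.

    An original is removed only after its copy exists with the same size.
    Returns the list of backup paths that were created.
    """
    log_dir, backup_dir = Path(log_dir), Path(backup_dir)
    backup_dir.mkdir(parents=True, exist_ok=True)
    archived = []
    for pattern in patterns:
        for src in sorted(log_dir.glob(pattern)):
            if not src.is_file():
                continue
            dst = backup_dir / src.name
            shutil.copy2(src, dst)  # copy content and timestamps
            if dst.stat().st_size == src.stat().st_size:
                src.unlink()  # free the disk space on the service server
                archived.append(dst)
    return archived
```

Run at a regular interval (e.g. from cron), such a script keeps the disk from filling up, but it remains a workaround: it does nothing about whatever writes the oversized log.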
  • 9. Scenario conclusion: Although some actions described in this scenario may seem striking and funny, many IT and non-IT organizations follow this setup. And if you discuss the topic with them and point out some anti-patterns, they are not aware of anything weird and are surprised by your statements about efficiency and potential risks. Moreover, this story is our personal experience from previous assignments. Let’s conclude the story: Mary could not process orders for almost 2 days; it could affect the company’s cash flow and name or even generate losses, but nobody cared. Pete was frequently disturbed, switched context and was overloaded. John, as system administrator, started to investigate the hidden root cause of the incident (doing his job) only 2 days after the incident was first discovered. Due to the disturbances and Pete’s tiredness, the build could contain unnoticed defects. Pete accessed restricted production servers as root and deleted files there as root. The same incident occurred again within a short time and affected the end user. The hidden root cause generating the incident is still not uncovered and resolved.
  • 10. ITIL v3 scenario: Let’s discuss the same story following ITSM principles. This is how it looked after 3 months of implementation effort. The same stars perform this story, but the approach to incident resolution is different. We focus on Service Transition and Service Operation activities again. The situation with the IT systems is the same as described in the first scenario. We start the story on Monday morning again, when Mary enters the office and starts to use the Orders&Warehousing IT service [3], not the WarehouseAndOrders application anymore. She does not need to care about the different parts of the service, start a program client or prolong licenses. She just uses her browser and a link to run the Orders&Warehousing IT service. The IT service works as expected; no warning symptoms occur. Standard monitoring and event reporting [4] is set up and working at the same time. Business users, Mary being one of them, do not even know about this monitoring. IT specialists together with Service Desk specialists set the thresholds for specific components, servers and their events. These events can trigger a deeper investigation by a specialist or can automatically report an incident. This morning the monitoring system started to report several “lack of free disk space” events on the server of the Orders&Warehousing IT service [5]. Service Desk specialists started to investigate those events, but meanwhile the Orders&Warehousing IT service froze and stopped responding. Mary reports an incident using the Service Desk (SD) tool. The SD, together with the phone, is the single point of contact (SPOC) to be used for communication with the IT service vendor. Such a reported incident record contains the incident description (observed symptoms), the priority for the user (e.g. only one user of the service is affected, a department or team is affected, or the whole company is affected) and the name of the service chosen from the list of provided services.
This action causes automatic notification of the incident to the Incident Manager of the relevant service (and/or customer). The Incident Manager does the first check of the incident record and assigns the expected category (e.g. hardware, network, application, premises, licenses) and a priority that reflects the end user’s perception but also other services and the business impact. The resulting priority in this case is high although only Mary uses the service: the service supports processing of incoming orders, and its unavailability can stop the assembly line and affect the company’s business and cash flow. Adam is assigned to this incident because he is marked as free in the Service Desk dashboard; he is automatically notified about it, and so is Mary. All these steps happen in just a few minutes, approximately the time it takes to read this page. [3] An IT service is a means of delivering value to the customer using IT resources. The customer gets the specific outcomes needed to run the business without owning and managing the costs and risks connected to IT. The customer does not care about software, hardware, networks, licenses, premises, people, upgrades and patches, or monitoring. The customer just buys the IT service as a commodity, and an external or internal vendor takes care of operations, support and maintenance. [4] Typically log changes, state monitoring or user events are processed for incident triggering. [5] These activities are performed as part of Event Management and are tightly connected with monitoring and monitoring systems.
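The two mechanisms described on slide 10, a monitoring threshold that raises a "lack of free disk space" event and the Incident Manager's triage of a reported incident, can be sketched in Python. This is only an illustration under assumptions: the 10 % threshold, the field names and the scoring scheme are hypothetical and are not prescribed by ITIL or by any particular Service Desk tool.

```python
from dataclasses import dataclass

# Hypothetical scale for the impact reported by the user.
IMPACT_SCORE = {"single user": 1, "team": 2, "company": 3}
PRIORITY_NAME = {1: "low", 2: "medium", 3: "high"}


def low_disk_event(total_bytes, free_bytes, min_free_ratio=0.10):
    """True when free space drops below the agreed threshold."""
    return free_bytes / total_bytes < min_free_ratio


@dataclass
class Incident:
    service: str
    description: str
    user_impact: str              # as chosen by the reporting user
    category: str = "unassigned"
    priority: str = "unassigned"


def triage(incident, business_critical_services, category):
    """Assign category and priority from user impact and business impact."""
    incident.category = category
    score = IMPACT_SCORE[incident.user_impact]
    if incident.service in business_critical_services:
        score = 3  # business impact overrides a low user-level impact
    incident.priority = PRIORITY_NAME[score]
    return incident


mary = Incident("Orders&Warehousing", "service frozen, no response", "single user")
triage(mary, {"Orders&Warehousing"}, "application")
print(mary.priority)  # high
```

This mirrors the story: only Mary is affected, yet the incident is rated high because the service feeds the assembly line.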
  • 11. [Figure: Incident reporting example using Outlook]
  • 12. [Figure: Incident reporting example using the Jira tool] Adam reads the notification he received and immediately starts investigating the incident. The first steps are the following checks: Checking the Knowledge Base (KB): it contains solutions to existing problems and incidents. If well structured, readable and user friendly, the KB can ease and speed up incident resolution as well as knowledge sharing among the team. Checking the Configuration Management System (CMS): it contains the description, version, location and bindings of IT infrastructure components (end user stations, servers, accessories). Such a system can significantly help with incident localization (which server or station is used by this service and what its configuration and versions are) and root cause identification. And checking the records and events of the automated monitoring tool (Event Management records). These functions are often performed by a specific team or department called the Control Desk. The mentioned tools allow quicker incident resolution and also require less technical skill from Service Desk specialists (the needed information is stored in the tool and does not need to be mined in a complicated way). Adam knows which components are used to operate the Orders&Warehousing IT service thanks to the CMS and the IT service catalogue (see the following table and figure).
  • 13. Example IT service catalogue records (the Configuration Items column is visible only to the IT vendor):
    IT service name: Orders&Warehousing. Users: Mary, Management, … User responsibilities: reporting incidents using the Service Desk (tool or phone); participating in regular monthly SLA reviews. Configuration Items (CI): WarehouseAndOrders v1.1, Tomcat 6, Oracle 9i, Red Hat Enterprise Linux 5, HW Server Prague, Net Switch S1, Net Switch S2, Intranet.
    IT service name: Internet. Users: all users. User responsibilities: see internal rules for using the Internet (link to intranet document). Configuration Items (CI): Internet Service Provider, Firewall Zone v3.2.
    [Figure: CMS part, visual information about the IT service infrastructure (basically the visualized Configuration Items column of the table above)]
    Events in the IT infrastructure show insufficient (no) free disk space on the core operational server of the IT service. Adam backs up temp and log files and starts to investigate only the Oracle database and Tomcat web server logs, because he knows from the IT service catalogue that these are used by the service. Thanks to the monitoring tools Adam also knows that only this service is down. He notices an overly big Oracle DB log consuming several GB of disk space. He backs up and deletes the Oracle log, restarts the service and tries its functionality. At the same time he also creates an automatic script that backs up and deletes the original Oracle DB log file at a regular interval (a so-called workaround solution). He verifies and installs the script, restarts the Oracle DB and the relevant Tomcat instance, and checks the monitoring tools, the IT service functionality and the backed-up file. Everything works, so Adam creates a problem record in the SD tool and assigns it, together with a link to the Oracle log file, to the Oracle group that solves Oracle-related problems. The problem ticket is raised to solve the deeper root cause. Adam only used an interim workaround solution for the incident that allows running
  • 14. IT service again. But why is the Oracle log so huge? What causes this? How to fix it? These questions are still not answered. As the final step, Adam updates the incident record (Work Log and solution) and closes it. Mary is notified about the solved incident via e-mail, so she knows she can start to use the service again. Mary needs to try the service and, if the solution is OK, accept the incident solution (or this can be done automatically after some period of time, so as not to annoy the end user). Mary accepts the solution because everything works well. It’s still Monday, but already after lunch. Adam creates a Knowledge Base record describing this incident and its symptoms and encloses the workaround (script). This KB record is linked to the original incident record and to the created problem record too. The goal of the KB record is to speed up the solution of similar incidents in the future. We used the Service Desk function, or tool, and the Event and Incident Management processes to register and process the incident record. Only the incident was solved in the story; the root cause is still unclear. The reader could notice how appropriate tools and monitoring can make the incident management process much more efficient and quick. Thanks to this, the incident is processed in several minutes and resolved in tens of minutes. Mary could continue with her work and there is no significant impact on the company’s business (at least not 3 days as in the previous story). But our job is not done yet. We need to uncover and solve the problem (the ITIL term for an unknown root cause) causing the incident. Let’s continue with the story, then, to uncover the hidden problem using the Problem Management process and implement the change using Change and Release and Deploy Management.
The whole lifecycle and the process relations are depicted in the following figure: [Figure: Relations of ITSM Service Operation and Service Transition processes introduced in our story] Since Adam created a problem ticket in the Service Desk and assigned it to the Oracle database group, a Problem Management team is formed on demand. This team consists of skilled and experienced administrators and database programmers who are involved only in more complicated issues (Level 2 and 3 in the Service Desk hierarchy model). The reason is the labor cost of those professionals. Rachel, an Oracle specialist, is notified as Problem Manager and starts to investigate the problem record as well as the incident record with the workaround, the Knowledge Base description and mainly the linked Oracle log file. Thanks to her knowledge of the “standard” Oracle log, she uncovers quickly
  • 15. programmer’s error reports being part of this log. She’s surprised how this could happen, because she has never experienced this before. Rachel logs in to the Oracle defect reporting tool (the maintenance fee grants access to this database) and searches for this issue, but finds nothing. She is allowed to create a defect in the Oracle defect tool, so she does: she describes the log issue and attaches a log snapshot to demonstrate it. After several days Rachel receives a reply from Oracle informing her about a new patch released by Oracle to fix this defect. Rachel creates a request for change (RfC) to implement this patch in the operational environment. The RfC includes the description, reason, importance and impact of the new patch. Now we have reached the moment when the root cause has been identified and a solution exists. Before releasing to the production environment we need to approve the request (there could be upcoming conflicting or dependent changes), test it (there can be other contributors to this root cause) and finally deploy it. The Change, Release and Deploy Management processes and roles are responsible for these steps. Change assessment, testing and deployment could look like the activities described below. The change request is assessed and approved by Change Manager Mike, because no conflict or dependency with upcoming changes was found, implementation costs are very low and we save backup disk space when we remove the workaround. The uncovered root cause has a structural solution that solves the issue at low cost and allows the workaround to be removed. The Oracle patch is first installed and tested in the testing environment (a mirror copy of the production environment) and is ready for production deployment only after all tests are finished and no other symptoms are observed. It seems that the Release and Deploy team can now finally distribute and deploy the patch to the production environment.
But before they proceed with this step they need to prepare a strategy called a rollback plan. The Orders&Warehousing IT service is so important that the IT vendor cannot afford another incident in a row (it would definitely affect the SLA [6]). The rollback plan provides the team with a strategy to use if the patch deployment fails. If that happens, we have to be able to restore the previous working version and configuration. The necessary input for the rollback plan is again the CMS, which contains information about the current versions of software and hardware systems and their configurations, and provides information about the authorized storage of source, configuration and executable files. Now we are finally ready to deploy the patch to the production environment (really done-done). Deployment is done during an agreed so-called maintenance window. The IT vendor can make changes and stop services for maintenance purposes only during this time. In this case it is from 2.00 am to 3.00 am. When the team has deployed the patch and run the production verification tests, they remove the existing workaround (the backup and delete script) together with Rachel. After a check, Mike closes this RfC as successfully implemented. Rachel now updates the problem solution (Oracle patch) and closes the problem as successfully solved as well. She still needs to update the Knowledge Base record to have all information synchronized. After that she’s done. But this is not the end of the story yet. Now there is a discrepancy between the real production environment (Oracle database patch, a micro version change) and the information about it in the CMS. We need to update this information in the CMS and the IT service catalogue to keep these tools useful. [6] SLA (Service Level Agreement) defines the agreed quality parameters and conditions under which the service is provided. It is usually a contract appendix, because it is not a formal contract itself.
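The deployment logic described on slide 15 (deploy only inside the agreed maintenance window, verify, roll back to the version recorded in the CMS if verification fails, and update the CMS on success) can be sketched in Python. The `install` and `verify` callables, the dictionary standing in for a CMS record and the version label `9i-patched` are hypothetical placeholders; only the 2.00 am to 3.00 am window and Oracle 9i come from the story.

```python
from datetime import time

# Maintenance window agreed in the story: 2.00 am to 3.00 am.
WINDOW_START, WINDOW_END = time(2, 0), time(3, 0)


def in_maintenance_window(now):
    """True if the given time of day falls inside the agreed window."""
    return WINDOW_START <= now < WINDOW_END


def deploy_with_rollback(cms_record, new_version, now, install, verify):
    """Install new_version; restore the CMS-recorded version on failure.

    install(version) and verify() are hypothetical hooks supplied by the
    Release and Deploy team; cms_record is a dict with a "version" key.
    """
    if not in_maintenance_window(now):
        raise RuntimeError("deployment is allowed only in the maintenance window")
    previous = cms_record["version"]    # rollback target taken from the CMS
    install(new_version)
    if verify():
        cms_record["version"] = new_version  # keep the CMS in sync with reality
        return "deployed"
    install(previous)                   # rollback plan: restore the last working version
    return "rolled back"


# Usage with stand-in hooks that only record what would be installed.
installed = []
record = {"ci": "Oracle", "version": "9i"}
result = deploy_with_rollback(record, "9i-patched", time(2, 30),
                              installed.append, lambda: True)
print(result, record["version"])
```

Taking the rollback target from the CMS record rather than from memory is the point of the sketch: the plan is only as good as the configuration data behind it.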
  • 16. The update can be done manually[7] or with an automated tool[8], depending on the vendor’s automation maturity.
[Figure: Simple rollback plan example]
If we compare the first and second scenarios, we can see a big difference. Using more formal ITSM/ITIL procedures supported by automated tools allowed all necessary activities to be processed more efficiently and without needless emotions. We also solved the deeper root cause with a structural solution, not just an interim one that makes the IT infrastructure and its support and maintenance more complex. But do not take these statements as a rule or the only truth. There is a hidden trap in shifting from an informal, ad hoc way of working to a process-oriented one: omitting or suppressing the human aspect and becoming just the ticket-driven machine so commonly seen in big corporations. Anyway, we can conclude the scenario as follows:
- The incident was closed much earlier than in the first scenario.
- The people responsible for incident solving did the job; no other IT roles, e.g. programmers, were disturbed.
- The people involved in ITSM activities knew what to do and how to do it (this was also boosted by proper process automation).
[7] We recommend a simple checklist as part of the change record (or work log) that enforces or reminds about the manual update.
[8] The update can happen without any manual action (the monitoring system detects the change in the infrastructure and updates the information) or semi-automatically (a manual trigger for an automatic IT infrastructure audit).
  • 17.
- Incident root cause investigation and structural solution design (not just accepting the workaround) started on the very first day, with the aim of preventing recurring incidents.
- Proper automated tools (have you noticed, no Excel was mentioned ;)) sped up diagnosis, information gathering and the incident resolution process.
- Every step is recorded in the Service Desk tool, so it is easy to track or report all steps and actions performed.
- The updated, user-friendly knowledge base (KB) could help with the solution of similar incidents or problems.
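As an illustration of the kind of automation these conclusions refer to, the deployment step from the story (patch applied only inside the agreed maintenance window, with the rollback plan used if deployment or verification fails) could be sketched in Python as follows. This is a minimal sketch under stated assumptions: the function names, the callables passed in and the example dates are hypothetical; only the 2.00 am to 3.00 am window comes from the story.

```python
from datetime import datetime, time

# Maintenance window from the story: 2.00 am to 3.00 am.
WINDOW_START = time(2, 0)
WINDOW_END = time(3, 0)


def in_maintenance_window(now):
    """True if `now` falls inside the agreed maintenance window."""
    return WINDOW_START <= now.time() < WINDOW_END


def deploy_with_rollback(now, apply_patch, verify, rollback):
    """Apply a change inside the window; restore the previous version if
    applying or verifying it fails. Returns a short status string."""
    if not in_maintenance_window(now):
        return "refused: outside maintenance window"
    try:
        apply_patch()
        if verify():
            return "deployed"
    except Exception:
        pass  # any failure while patching or verifying falls through to the rollback
    rollback()
    return "rolled back"


# Hypothetical run: the verification tests fail, so the rollback plan is used.
log = []
status = deploy_with_rollback(
    datetime(2011, 8, 15, 2, 10),  # illustrative date inside the window
    apply_patch=lambda: log.append("patch applied"),
    verify=lambda: False,  # simulate failing production verification tests
    rollback=lambda: log.append("previous version restored"),
)
print(status, log)
```

Passing the patch, verification and rollback steps in as callables keeps the sketch independent of any particular deployment tool; in practice the rollback step would rely on the versions and configurations recorded in the CMS.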
  • 18. The whole story following the ITSM/ITILv3 processes is depicted in the following picture:
[Figure: Flow starting with an identified incident (also reported events) and ending with the implementation of a structural solution to the identified problem (the workaround is not the final destination)]
Story conclusion
As a result of this significant incident affecting the Orders&Warehousing IT service, an extra SLA review meeting between IT and business was held. The meeting covered not only the thresholds and actual values of the service quality attributes but also the possible financial losses caused by the incident. It triggered additional actions on the IT vendor side that should lead to a better understanding of the business, improved capacity and load predictions, and proactive steps to uncover potential problems (in the ITIL sense of the term). But these steps are already a trailer for the upcoming second part of this ITILv3 story ;)
  • 19. Change history
Version | Date | Author | Change history
V1.0 | August 2011 | Jarek Procházka | First English version created
  • 20. Differ! www.differ.cz
Improve your IT development, support, maintenance and operation using Agile and Lean practices
Articles and experience: Agile and Lean IT development, support and maintenance; Human aspect in IT; Agile and Lean management; Practical templates and checklists; Books review; Free e-books; ITIL in practice; Experience from projects
Services: Creative workshop; Lean workshop; Consultations