NonStop monitoring and automation
Wolfgang Breidbach

Page 1 | 29.01.2014 | Bank-Verlag GmbH
Bank-Verlag
■ Founded in 1961 as the publishing house of the magazine „Die Bank“.
■ Running on IBM Systems /1 and /370, the first authorisation centre in Germany for ATM transactions was founded at Bank-Verlag in 1986.
■ In 1988 authorisation was migrated to Tandem, creating the first active-active application.
■ In the following years we moved through Cyclone, CLX, CLX2000, K10000, K20000, S7000, S70000 and S72000, and finally to S86000.
■ In 2005 we moved to Integrity NonStop.
■ In 2010 the secondary datacentre was moved to a new location.
■ In 2012 we migrated our production systems to NonStop blades.
■ Today we are the IT service provider for the private banks in Germany.

The start
■ Bank-Verlag was using a commercial monitoring tool
■ Management decided to replace that tool with the open-source tool Nagios for all Windows, Unix and Linux systems
■ Nagios should be used for NonStop systems as well

■ Problem: no open-source monitoring tool for NonStop was available that fulfilled our needs
■ Decision: We will have to create something ourselves!

Some basic decisions
■ The main purpose is monitoring our NonStop systems
■ Feeding Nagios with information should be a result of that
■ The open source world is changing quickly, so we should be able to support any other tool with little change

■ The NonStop monitoring should not depend on any external tool
■ The messages should not require in-depth NonStop knowledge
■ Avoid manual configuration wherever possible
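
One way to keep the monitoring independent of Nagios is to hold messages in a tool-neutral form and convert them only at the edge. The following is a minimal sketch under that assumption, not the toolbox's actual code: the `MonitorMessage` fields, the severity names and the host/service naming are hypothetical. The output line follows the format of the Nagios external command `PROCESS_SERVICE_CHECK_RESULT` for passive service checks, and the numeric codes are the usual Nagios return codes.

```python
from dataclasses import dataclass

# Standard Nagios return codes; the severity names on the left are assumed.
NAGIOS_CODES = {"Info": 0, "Warning": 1, "Error": 2}
NAGIOS_UNKNOWN = 3


@dataclass
class MonitorMessage:
    """Tool-neutral message (hypothetical field names)."""
    system: str      # NonStop system, used here as the Nagios host name
    subsystem: str   # e.g. "Pathway", used here as the service description
    severity: str    # "Info", "Warning" or "Error"
    text: str


def to_nagios_passive(msg: MonitorMessage, timestamp: int) -> str:
    """Format a message as a Nagios passive service check result line."""
    code = NAGIOS_CODES.get(msg.severity, NAGIOS_UNKNOWN)
    return (f"[{timestamp}] PROCESS_SERVICE_CHECK_RESULT;"
            f"{msg.system};{msg.subsystem};{code};{msg.text}")


def to_plain_text(msg: MonitorMessage, timestamp: int) -> str:
    """A second formatter, showing that only this layer is tool-specific."""
    return f"{timestamp} {msg.system} {msg.subsystem} {msg.severity}: {msg.text}"
```

Supporting another tool would then mean adding one more formatter function while the monitoring modules stay unchanged.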

Our approach
■ We have a bunch of „subsystems“ like CPU, Pathway, Lines, NetBatch and so on
■ Every subsystem has its own monitoring module
■ Every module collects all available configuration information automatically, for example:
■ The NetBatch module collects all information concerning NetBatch jobs and calendars
■ The Line module collects all lines
■ Some modules need additional configuration data:
■ File module needs the filesets to check
■ EMS module needs the messages to look for

Our approach
■ Every module has a „refresh configuration“ function
■ Every module is configurable with parameters, every parameter has a default
■ If an event is found that can be handled by the toolbox, it should be handled by the toolbox
■ File is getting full => perform a reload or increase maxextents
■ A static Pathway server is down => issue a START command
■ A process is consuming too many CPU cycles => reduce priority
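
The event-to-action pairs above can be pictured as a small dispatch table. This is only an illustrative sketch: the event type names and the handler functions are hypothetical, and the handlers merely return a description of the action instead of issuing real NonStop commands.

```python
from typing import Callable, Dict


def reload_or_extend(target: str) -> str:
    # Placeholder: a real handler would start a reload or raise maxextents.
    return f"reload or increase maxextents: {target}"


def start_server(target: str) -> str:
    # Placeholder: a real handler would issue a START command for the server.
    return f"START {target}"


def reduce_priority(target: str) -> str:
    # Placeholder: a real handler would lower the priority of the process.
    return f"reduce priority: {target}"


# Hypothetical event types mapped to their automatic reaction.
ACTIONS: Dict[str, Callable[[str], str]] = {
    "FILE_FULL": reload_or_extend,
    "SERVER_DOWN": start_server,
    "CPU_HOG": reduce_priority,
}


def handle_event(event_type: str, target: str) -> str:
    """Run the automatic action if one exists, otherwise only report."""
    action = ACTIONS.get(event_type)
    if action is None:
        return f"report only: {event_type} on {target}"
    return action(target)
```

Events without an entry in the table fall through to plain reporting, so automation can be added one event type at a time.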

Our approach
■ Another goal was avoiding manual tasks we do not like
■ Regular reloads
■ Checking Backups
■ Checking database contents
■ Collect statistical data:
■ Line usage
■ File sizes
■ CPU usage
■ TMF rate

■ Create documentation about the configuration of the system

Our approach

■ We want to make information available to people not familiar with NonStop systems

■ The X.25 line with the calling address 12345678 is connected to the SWAN-box with the
„S77“ sticker on Clip 1 line 0
■ The TCP/IP connection with the address 192.168.77.77 is configured on the controller in
slot 2.4 on „D“ and the port has the MAC address 08.00.12.34.56
■ This should be database information accessible and usable without any detailed NonStop
knowledge
■ Reports of installed hardware should be understandable without the knowledge of HP product
numbers

The Start
■ First subsystem was „CPU and processes“
■ Development based on some already available programs
■ The CPU and process monitoring program should not write any disk files
■ Create the tools to maintain the appropriate tables including the long-term data collection
■ Create a central message collector reading the tables and formatting the messages
■ Continue with the other subsystems

The next steps
■ Decision to build the software like a product
■ Great advantages when distributing the software to our 4 (at the moment 6) systems
■ Design of a central message handling program
■ Avoid any hard-coded messages
■ A side-effect: The toolbox supports multiple languages
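
Keeping message texts out of the code is what makes the multi-language side effect possible. A minimal sketch of the idea, assuming templates are keyed by a message id and a language code; the ids, the template wording and the fallback to English are all hypothetical, and in the toolbox the templates would come from a table rather than a dictionary.

```python
# Hypothetical message templates keyed by (message id, language).
TEMPLATES = {
    ("FILE_FULL", "en"): "File {name} is {percent}% full",
    ("FILE_FULL", "de"): "Datei {name} ist zu {percent}% gefüllt",
}


def render(message_id: str, language: str, default_language: str = "en", **values) -> str:
    """Fill in the template for the requested language, falling back to the default."""
    template = TEMPLATES.get((message_id, language))
    if template is None:
        template = TEMPLATES[(message_id, default_language)]
    return template.format(**values)
```

The monitoring modules only pass the message id and the values; the wording lives in the template data.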

Available subsystems
■ CPU and processes (incl. automatic restart of processes *)
■ Lines
■ Pathway
■ Files incl. automatic reload *
■ TMF
■ RDF
■ NetBatch
■ Devices
■ TCP/IP
■ Spooler
■ EMS-messages *
■ Message collector
■ Backups *
* = configuration required

[Architecture diagram. Components shown: CPU and process monitoring, restart monitor, subsystem modules, database interface, configuration tables, message templates, event tables, message collector, message table, TCP/IP interface]

Some additional information
■ The original monitoring toolbox is based on SQL tables
■ An Enscribe version is in progress
■ The toolbox does not depend on Measure; Measure is only used to find the originator of a heavy disk load
■ The toolbox causes very little CPU load
■ The collected statistical data allow lots of reports using standard tools like Excel

Advantages
■ Keep track of hardware changes like exchange of disks
■ No need for additional software like Measure
■ Software is running „out of the box“ without a need for additional configuration
■ Lots of parameters and table entries for configuration available
■ The software supports multiple languages, at the moment the messages are available in German
and English
■ Bank-Verlag is not a vendor but a user, we are using the software ourselves
■ Very limited commercial interest in selling the software

Advantages during daily life
■ Reloads are carried out automatically if needed
■ Processes causing heavy disk load are found (Measure required!)
■ The priority of processes using too many CPU cycles can be automatically reduced
■ Pathway-servers can be automatically restarted
■ Missing processes can be restarted automatically
■ Existence of required processes can be checked

■ The whole system including all the applications can be started this way!

Advantages during daily life
■ Batch jobs and calendars are checked periodically.
■ If a calendar is expiring, a message is issued a few days before expiration
■ The outcome of all backup jobs is checked
■ Disk problems are checked periodically including
■ Number of ZZSA files
■ Status of OSS-filesets
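
The calendar expiry check mentioned in this list comes down to a date comparison. A minimal sketch, assuming the last date in a calendar is already known; the function name, the message wording and the 7-day default for "a few days" are hypothetical.

```python
from datetime import date
from typing import Optional


def calendar_warning(name: str, last_entry: date, today: date,
                     warn_days: int = 7) -> Optional[str]:
    """Return a message if the calendar has expired or expires within warn_days."""
    days_left = (last_entry - today).days
    if days_left < 0:
        return f"Calendar {name} has expired"
    if days_left <= warn_days:
        return f"Calendar {name} expires in {days_left} day(s)"
    return None  # nothing to report yet
```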

Advantages during daily life
■ Files matching predefined filesets are checked for running full
■ If a file is too full, the toolbox automatically checks whether a reload is possible or increases maxextents
■ All configured files are periodically reloaded if necessary
■ Whether a reload is necessary is decided based on slack and fragmentation
■ All needed parameters can be defined globally, for a fileset or even for a single file.
■ The need for manual reloads has been reduced to zero
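
The three configuration levels (global, fileset, single file) suggest a simple override order in which the most specific setting wins. The sketch below illustrates that idea only: the parameter names, the values and the use of shell-style wildcards for fileset patterns are assumptions, not the toolbox's real configuration.

```python
from fnmatch import fnmatchcase

# Hypothetical parameters on three levels; the most specific level wins.
GLOBAL_PARAMS = {"max_full_percent": 90, "max_slack_percent": 30}
FILESET_PARAMS = {"$DATA*.APP.*": {"max_full_percent": 80}}
FILE_PARAMS = {"$DATA1.APP.JOURNAL": {"max_full_percent": 70}}


def effective_params(filename: str) -> dict:
    """Merge global, fileset and single-file parameters for one file."""
    params = dict(GLOBAL_PARAMS)                  # start with the defaults
    for pattern, overrides in FILESET_PARAMS.items():
        if fnmatchcase(filename, pattern):        # fileset level
            params.update(overrides)
    params.update(FILE_PARAMS.get(filename, {}))  # single-file level
    return params
```

A file that matches no fileset and has no entry of its own simply gets the global defaults, which fits the goal of running without manual configuration.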

Interesting problems
■ The status of TCP/IP connections can be checked
■ You need 2 established connections from your $ZB000 (192.168.77.77) to 192.168.88.88
port 1234.
■ If at least one of these connections is down, a message is created
■ The cause for that might be an erroneously changed firewall configuration
■ The same feature has been implemented for X.25 connections
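
The connection check described above amounts to counting established connections that match a configured rule. A minimal sketch, assuming the current connection list has already been obtained from the TCP/IP subsystem; how that list is read on NonStop is not shown, and the `Connection` fields and the message wording are hypothetical.

```python
from typing import Iterable, NamedTuple, Optional


class Connection(NamedTuple):
    local_addr: str
    remote_addr: str
    remote_port: int
    state: str  # e.g. "ESTABLISHED"


def check_connections(conns: Iterable[Connection], local_addr: str,
                      remote_addr: str, remote_port: int,
                      required: int) -> Optional[str]:
    """Return a message if fewer than `required` matching connections are established."""
    established = sum(
        1 for c in conns
        if c.local_addr == local_addr
        and c.remote_addr == remote_addr
        and c.remote_port == remote_port
        and c.state == "ESTABLISHED"
    )
    if established < required:
        return (f"Only {established} of {required} connections from {local_addr} "
                f"to {remote_addr} port {remote_port} are established")
    return None
```

With the values from the slide, the rule would be `check_connections(conns, "192.168.77.77", "192.168.88.88", 1234, required=2)`.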

A real life case concerning TCP/IP
■ Our NonStop is accessing another server through a firewall
■ There have to be 2 established connections on port 4711
■ A rule within the firewall was erroneously changed
■ The NonStop could no longer establish a new connection to the server
■ The already established connections were not affected
■ The real problem occurred weeks later, when one of the connections had to be re-established

■ The monitoring tool found the missing connection immediately

Another problem
■ We have a leased line to another provider
■ Line is using X.25 protocol
■ During peak hours we had some problems on the line
■ Using the statistical data we found out that the capacity of the line was exceeded
■ Increasing the speed immediately solved all problems

Security issues
■ Safeguard reports failed logons
■ Safeguard does not report the external origin of the logon, such as the IP address
■ We read the Safeguard log and add that information
■ So the question „Where did the Administrator logon to the NonStop come from?“ can be
answered by a look at our table

Application monitoring
■ There are 2 kinds of application monitoring:
■ Checking database contents
■ Checking application messages
■ The database contents are checked using SQL-statements of the type „SELECT COUNT(*) from
… WHERE… BROWSE ACCESS;“
■ The result is compared against given values and a message is created if necessary
■ The severity of the messages can be set depending on the result like:
■ 1 found => Warning
■ 2 found => Error
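
The count-and-compare check can be sketched as follows. This is an illustration under stated assumptions: SQLite stands in for the NonStop SQL database (so the `BROWSE ACCESS` clause is omitted), the table and column names are invented, and the thresholds are read as "at least 1 row gives a Warning, at least 2 rows give an Error".

```python
import sqlite3
from typing import Optional


def severity_for(count: int, warning_at: int = 1, error_at: int = 2) -> Optional[str]:
    """Map the number of rows found to a message severity."""
    if count >= error_at:
        return "Error"
    if count >= warning_at:
        return "Warning"
    return None  # nothing found, no message


def check_count(conn: sqlite3.Connection, sql: str) -> Optional[str]:
    """Run a SELECT COUNT(*) statement and evaluate its result."""
    (count,) = conn.execute(sql).fetchone()
    return severity_for(count)


# Demo data in an in-memory database (hypothetical table).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE requests (id INTEGER, status TEXT)")
conn.executemany("INSERT INTO requests VALUES (?, ?)",
                 [(1, "OK"), (2, "STUCK"), (3, "STUCK")])
result = check_count(conn, "SELECT COUNT(*) FROM requests WHERE status = 'STUCK'")
```

In this demo two rows match, so `result` is `"Error"`.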

Checking EMS-messages
■ Our applications are using EMS collectors to report any errors
■ We are able to check the number of messages per type per time period
■ A sample message would be „Timeout process $ABCD“, process $ABCD is routing messages to
XY-Bank
■ We define a message containing „Timeout“ and „$ABCD“ as „Timeout to XY-BANK“ and count those
messages per period
■ A message is created depending on the configured threshold for this type of message
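
The classification and counting steps above might look like this in outline. It is a sketch, not the toolbox's implementation: matching by simple substring search, the rule and threshold tables, and the threshold value of 3 are all assumptions.

```python
from collections import Counter
from typing import List, Optional

# Hypothetical rules: all keywords must occur in the EMS text.
RULES = [(("Timeout", "$ABCD"), "Timeout to XY-BANK")]
# Hypothetical threshold: messages of this type per period before reporting.
THRESHOLDS = {"Timeout to XY-BANK": 3}


def classify(text: str) -> Optional[str]:
    """Map a raw EMS message text to a configured message type."""
    for keywords, msg_type in RULES:
        if all(keyword in text for keyword in keywords):
            return msg_type
    return None


def evaluate_period(texts: List[str]) -> List[str]:
    """Count classified messages of one period and report types over threshold."""
    counts = Counter(t for t in map(classify, texts) if t is not None)
    return [f"{n} x {msg_type} in this period"
            for msg_type, n in counts.items()
            if n >= THRESHOLDS.get(msg_type, 1)]
```

Messages that match no rule are ignored here; a real configuration could equally route them to a default type.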

An idea for EMS message handling
■ We are handling authorisation requests for credit and debit cards; most of these requests are
sent to the card-issuing banks

■ We are creating minute-based statistics of those requests per issuer
■ If an issuer has problems we can create a message like „60% of the requests unsuccessful“
■ Now the message handling gets this information and handles it according to the configuration:
■ 1 message within 10 minutes => no need for action
■ 10 messages within 10 minutes => create an alarm
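
The "N messages within 10 minutes" rule can be implemented as a sliding time window. A minimal sketch of that technique, assuming timestamps in seconds; the class name and defaults are hypothetical, with the defaults taken from the example of 10 messages in 10 minutes.

```python
from collections import deque


class WindowAlarm:
    """Raise an alarm once `threshold` messages arrive within `window_seconds`."""

    def __init__(self, threshold: int = 10, window_seconds: float = 600.0):
        self.threshold = threshold
        self.window_seconds = window_seconds
        self._times = deque()  # timestamps of messages still inside the window

    def record(self, timestamp: float) -> bool:
        """Register one message; return True if an alarm should be created."""
        self._times.append(timestamp)
        # Drop messages that have fallen out of the window.
        while self._times and self._times[0] <= timestamp - self.window_seconds:
            self._times.popleft()
        return len(self._times) >= self.threshold
```

A single message, or messages spread far apart, never reach the threshold and therefore need no action.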

Our main Nagios screen for NonStop

Our main Nagios screen for NonStop with error message

Any questions???
Wolfgang Breidbach
Bank-Verlag GmbH
IT-Services
Wendelinstr. 1
50933 Köln
E-Mail: Wolfgang.Breidbach@Bank-Verlag.de
www.Bank-Verlag.de

Seite 27 | 29.01.2014 | Bank-Verlag GmbH

Más contenido relacionado

Destacado

WebCT presentation 007
WebCT presentation 007WebCT presentation 007
WebCT presentation 007kylebb7
 
Kåre Rude Andersen - Create a scombot – automate and monitor azure
Kåre Rude Andersen - Create a scombot – automate and monitor azureKåre Rude Andersen - Create a scombot – automate and monitor azure
Kåre Rude Andersen - Create a scombot – automate and monitor azureNordic Infrastructure Conference
 
김피디ㅋ 3월 3호 "당장 뉴스를 멈춰"
김피디ㅋ 3월 3호 "당장 뉴스를 멈춰"김피디ㅋ 3월 3호 "당장 뉴스를 멈춰"
김피디ㅋ 3월 3호 "당장 뉴스를 멈춰"김 피디
 
Tata Tiscon Part II- Matrix Rewards
Tata Tiscon Part II-  Matrix RewardsTata Tiscon Part II-  Matrix Rewards
Tata Tiscon Part II- Matrix Rewardsmatrikrewards
 
Customer Care - Matrix Rewards
Customer Care - Matrix RewardsCustomer Care - Matrix Rewards
Customer Care - Matrix Rewardsmatrikrewards
 
Ottimizzazione non lineare,Teorema di Lagrange e applicazione economica
Ottimizzazione non lineare,Teorema di Lagrange e applicazione economicaOttimizzazione non lineare,Teorema di Lagrange e applicazione economica
Ottimizzazione non lineare,Teorema di Lagrange e applicazione economicaAngela Berardinelli
 
Tata Shaktee - Matrix Rewards
Tata Shaktee -  Matrix RewardsTata Shaktee -  Matrix Rewards
Tata Shaktee - Matrix Rewardsmatrikrewards
 
Kuidas õppida keeli efektiivselt
Kuidas õppida keeli efektiivseltKuidas õppida keeli efektiivselt
Kuidas õppida keeli efektiivseltKeelestuudio
 
Мобифорс - система управления мобильными сотрудниками
Мобифорс - система управления мобильными сотрудникамиМобифорс - система управления мобильными сотрудниками
Мобифорс - система управления мобильными сотрудникамиСергей Вассерман
 
Campus SaVE Act 2014 Regulatory Updates
Campus SaVE Act 2014 Regulatory UpdatesCampus SaVE Act 2014 Regulatory Updates
Campus SaVE Act 2014 Regulatory UpdatesLiz Williams
 

Destacado (20)

WebCT presentation 007
WebCT presentation 007WebCT presentation 007
WebCT presentation 007
 
Kåre Rude Andersen - Create a scombot – automate and monitor azure
Kåre Rude Andersen - Create a scombot – automate and monitor azureKåre Rude Andersen - Create a scombot – automate and monitor azure
Kåre Rude Andersen - Create a scombot – automate and monitor azure
 
Geopolitica stefanelli
Geopolitica stefanelliGeopolitica stefanelli
Geopolitica stefanelli
 
김피디ㅋ 3월 3호 "당장 뉴스를 멈춰"
김피디ㅋ 3월 3호 "당장 뉴스를 멈춰"김피디ㅋ 3월 3호 "당장 뉴스를 멈춰"
김피디ㅋ 3월 3호 "당장 뉴스를 멈춰"
 
Research Into Digipaks
Research Into DigipaksResearch Into Digipaks
Research Into Digipaks
 
Tata Tiscon Part II- Matrix Rewards
Tata Tiscon Part II-  Matrix RewardsTata Tiscon Part II-  Matrix Rewards
Tata Tiscon Part II- Matrix Rewards
 
Can i get covered outside of open enrollment
Can i get covered outside of open enrollmentCan i get covered outside of open enrollment
Can i get covered outside of open enrollment
 
Customer Care - Matrix Rewards
Customer Care - Matrix RewardsCustomer Care - Matrix Rewards
Customer Care - Matrix Rewards
 
Ottimizzazione non lineare,Teorema di Lagrange e applicazione economica
Ottimizzazione non lineare,Teorema di Lagrange e applicazione economicaOttimizzazione non lineare,Teorema di Lagrange e applicazione economica
Ottimizzazione non lineare,Teorema di Lagrange e applicazione economica
 
Tata Shaktee - Matrix Rewards
Tata Shaktee -  Matrix RewardsTata Shaktee -  Matrix Rewards
Tata Shaktee - Matrix Rewards
 
Kuidas õppida keeli efektiivselt
Kuidas õppida keeli efektiivseltKuidas õppida keeli efektiivselt
Kuidas õppida keeli efektiivselt
 
Мобифорс - система управления мобильными сотрудниками
Мобифорс - система управления мобильными сотрудникамиМобифорс - система управления мобильными сотрудниками
Мобифорс - система управления мобильными сотрудниками
 
Uk assignments
Uk assignmentsUk assignments
Uk assignments
 
My Music Video Timeline
My Music Video TimelineMy Music Video Timeline
My Music Video Timeline
 
Summer Shape-Up Guide (infographic)
Summer Shape-Up Guide (infographic)Summer Shape-Up Guide (infographic)
Summer Shape-Up Guide (infographic)
 
Campus SaVE Act 2014 Regulatory Updates
Campus SaVE Act 2014 Regulatory UpdatesCampus SaVE Act 2014 Regulatory Updates
Campus SaVE Act 2014 Regulatory Updates
 
Evaluation Question 6
Evaluation Question 6Evaluation Question 6
Evaluation Question 6
 
Hardware luis suarez 3
Hardware luis suarez 3Hardware luis suarez 3
Hardware luis suarez 3
 
My Life
My Life My Life
My Life
 
Question 2
Question 2Question 2
Question 2
 

Similar a Non stop monitoring and automation

Windows Debugging Tools - JavaOne 2013
Windows Debugging Tools - JavaOne 2013Windows Debugging Tools - JavaOne 2013
Windows Debugging Tools - JavaOne 2013MattKilner
 
SA114 - Virtual Notesiality! - How the Notes client and Browser Plugin can ex...
SA114 - Virtual Notesiality! - How the Notes client and Browser Plugin can ex...SA114 - Virtual Notesiality! - How the Notes client and Browser Plugin can ex...
SA114 - Virtual Notesiality! - How the Notes client and Browser Plugin can ex...Daniel Reimann
 
ICONUK 2018 - IBM Notes V10 Performance Boost
ICONUK 2018 - IBM Notes V10 Performance BoostICONUK 2018 - IBM Notes V10 Performance Boost
ICONUK 2018 - IBM Notes V10 Performance BoostChristoph Adler
 
AdminCamp 2018 - ApplicationInsights für Administratoren
AdminCamp 2018 - ApplicationInsights für AdministratorenAdminCamp 2018 - ApplicationInsights für Administratoren
AdminCamp 2018 - ApplicationInsights für AdministratorenChristoph Adler
 
Operational and business monitoring with IBM Integration Bus-Sanjay Nagchowdhury
Operational and business monitoring with IBM Integration Bus-Sanjay NagchowdhuryOperational and business monitoring with IBM Integration Bus-Sanjay Nagchowdhury
Operational and business monitoring with IBM Integration Bus-Sanjay NagchowdhuryKaren Broughton-Mabbitt
 
Virtual,Faster,Better! How To Virtualize the IBM Notes Client and IBM Client ...
Virtual,Faster,Better! How To Virtualize the IBM Notes Client and IBM Client ...Virtual,Faster,Better! How To Virtualize the IBM Notes Client and IBM Client ...
Virtual,Faster,Better! How To Virtualize the IBM Notes Client and IBM Client ...Christoph Adler
 
PROJECT REPORT ON COMPUTER SHOP SYSTEM IN C++
PROJECT REPORT ON COMPUTER SHOP SYSTEM IN C++PROJECT REPORT ON COMPUTER SHOP SYSTEM IN C++
PROJECT REPORT ON COMPUTER SHOP SYSTEM IN C++vikram mahendra
 
Top 10 Tricks and Tools of an Oracle EPM Administrator
Top 10 Tricks and Tools of an Oracle EPM AdministratorTop 10 Tricks and Tools of an Oracle EPM Administrator
Top 10 Tricks and Tools of an Oracle EPM Administratornking821
 
Operating System Unit 1
Operating System Unit 1Operating System Unit 1
Operating System Unit 1SanthiNivas
 
Pcm to unifier migration considerations - Oracle Primavera P6 Collaborate 14
Pcm to unifier migration considerations  - Oracle Primavera P6 Collaborate 14Pcm to unifier migration considerations  - Oracle Primavera P6 Collaborate 14
Pcm to unifier migration considerations - Oracle Primavera P6 Collaborate 14p6academy
 
Iib v10 performance problem determination examples
Iib v10 performance problem determination examplesIib v10 performance problem determination examples
Iib v10 performance problem determination examplesMartinRoss_IBM
 
Virtual, Faster, Better! How to Virtualize the Rich Client and Browser Plugin...
Virtual, Faster, Better! How to Virtualize the Rich Client and Browser Plugin...Virtual, Faster, Better! How to Virtualize the Rich Client and Browser Plugin...
Virtual, Faster, Better! How to Virtualize the Rich Client and Browser Plugin...ICS User Group
 
BP1491: Virtual, Faster, Better - How to Virtualize the Rich Client and Brows...
BP1491: Virtual, Faster, Better - How to Virtualize the Rich Client and Brows...BP1491: Virtual, Faster, Better - How to Virtualize the Rich Client and Brows...
BP1491: Virtual, Faster, Better - How to Virtualize the Rich Client and Brows...panagenda
 
DNUG 2015 - Notes Browser Clients, Client Upgrades und beste Startzeiten!
DNUG 2015 - Notes Browser Clients, Client Upgrades und beste Startzeiten!DNUG 2015 - Notes Browser Clients, Client Upgrades und beste Startzeiten!
DNUG 2015 - Notes Browser Clients, Client Upgrades und beste Startzeiten!Christoph Adler
 
Scaling FreeSWITCH Performance
Scaling FreeSWITCH PerformanceScaling FreeSWITCH Performance
Scaling FreeSWITCH PerformanceMoises Silva
 

Similar a Non stop monitoring and automation (20)

c programming 1-1.pptx
c programming 1-1.pptxc programming 1-1.pptx
c programming 1-1.pptx
 
Windows Debugging Tools - JavaOne 2013
Windows Debugging Tools - JavaOne 2013Windows Debugging Tools - JavaOne 2013
Windows Debugging Tools - JavaOne 2013
 
SA114 - Virtual Notesiality! - How the Notes client and Browser Plugin can ex...
SA114 - Virtual Notesiality! - How the Notes client and Browser Plugin can ex...SA114 - Virtual Notesiality! - How the Notes client and Browser Plugin can ex...
SA114 - Virtual Notesiality! - How the Notes client and Browser Plugin can ex...
 
ICONUK 2018 - IBM Notes V10 Performance Boost
ICONUK 2018 - IBM Notes V10 Performance BoostICONUK 2018 - IBM Notes V10 Performance Boost
ICONUK 2018 - IBM Notes V10 Performance Boost
 
Apache flink
Apache flinkApache flink
Apache flink
 
AdminCamp 2018 - ApplicationInsights für Administratoren
AdminCamp 2018 - ApplicationInsights für AdministratorenAdminCamp 2018 - ApplicationInsights für Administratoren
AdminCamp 2018 - ApplicationInsights für Administratoren
 
Mini Project- USB Temperature Logging
Mini Project- USB Temperature LoggingMini Project- USB Temperature Logging
Mini Project- USB Temperature Logging
 
Operational and business monitoring with IBM Integration Bus-Sanjay Nagchowdhury
Operational and business monitoring with IBM Integration Bus-Sanjay NagchowdhuryOperational and business monitoring with IBM Integration Bus-Sanjay Nagchowdhury
Operational and business monitoring with IBM Integration Bus-Sanjay Nagchowdhury
 
Virtual,Faster,Better! How To Virtualize the IBM Notes Client and IBM Client ...
Virtual,Faster,Better! How To Virtualize the IBM Notes Client and IBM Client ...Virtual,Faster,Better! How To Virtualize the IBM Notes Client and IBM Client ...
Virtual,Faster,Better! How To Virtualize the IBM Notes Client and IBM Client ...
 
PROJECT REPORT ON COMPUTER SHOP SYSTEM IN C++
PROJECT REPORT ON COMPUTER SHOP SYSTEM IN C++PROJECT REPORT ON COMPUTER SHOP SYSTEM IN C++
PROJECT REPORT ON COMPUTER SHOP SYSTEM IN C++
 
Top 10 Tricks and Tools of an Oracle EPM Administrator
Top 10 Tricks and Tools of an Oracle EPM AdministratorTop 10 Tricks and Tools of an Oracle EPM Administrator
Top 10 Tricks and Tools of an Oracle EPM Administrator
 
Chapter 1 - Prog101.ppt
Chapter 1 - Prog101.pptChapter 1 - Prog101.ppt
Chapter 1 - Prog101.ppt
 
Operating System Unit 1
Operating System Unit 1Operating System Unit 1
Operating System Unit 1
 
Pcm to unifier migration considerations - Oracle Primavera P6 Collaborate 14
Pcm to unifier migration considerations  - Oracle Primavera P6 Collaborate 14Pcm to unifier migration considerations  - Oracle Primavera P6 Collaborate 14
Pcm to unifier migration considerations - Oracle Primavera P6 Collaborate 14
 
Iib v10 performance problem determination examples
Iib v10 performance problem determination examplesIib v10 performance problem determination examples
Iib v10 performance problem determination examples
 
Virtual, Faster, Better! How to Virtualize the Rich Client and Browser Plugin...
Virtual, Faster, Better! How to Virtualize the Rich Client and Browser Plugin...Virtual, Faster, Better! How to Virtualize the Rich Client and Browser Plugin...
Virtual, Faster, Better! How to Virtualize the Rich Client and Browser Plugin...
 
3 types of monitoring for 2020
3 types of monitoring for 20203 types of monitoring for 2020
3 types of monitoring for 2020
 
BP1491: Virtual, Faster, Better - How to Virtualize the Rich Client and Brows...
BP1491: Virtual, Faster, Better - How to Virtualize the Rich Client and Brows...BP1491: Virtual, Faster, Better - How to Virtualize the Rich Client and Brows...
BP1491: Virtual, Faster, Better - How to Virtualize the Rich Client and Brows...
 
DNUG 2015 - Notes Browser Clients, Client Upgrades und beste Startzeiten!
DNUG 2015 - Notes Browser Clients, Client Upgrades und beste Startzeiten!DNUG 2015 - Notes Browser Clients, Client Upgrades und beste Startzeiten!
DNUG 2015 - Notes Browser Clients, Client Upgrades und beste Startzeiten!
 
Scaling FreeSWITCH Performance
Scaling FreeSWITCH PerformanceScaling FreeSWITCH Performance
Scaling FreeSWITCH Performance
 

Último

EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEarley Information Science
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxMalak Abu Hammad
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘RTylerCroy
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking MenDelhi Call girls
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonAnna Loughnan Colquhoun
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024The Digital Insurer
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Scriptwesley chun
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Servicegiselly40
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationSafe Software
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024The Digital Insurer
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonetsnaman860154
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptxHampshireHUG
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationMichael W. Hawkins
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CVKhem
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024Rafal Los
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...apidays
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfEnterprise Knowledge
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024The Digital Insurer
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc
 

Último (20)

EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptx
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
CNv6 Instructor Chapter 6 Quality of Service

NonStop monitoring and automation

  • 1. NonStop monitoring and automation
    Wolfgang Breidbach
    Page 1 | 29.01.2014 | Bank-Verlag GmbH
  • 2. Bank-Verlag
    ■ Founded in 1961 as the publishing house of the magazine "Die Bank".
    ■ Running on IBM Systems /1 and /370, the first Authorisation Center in Germany for ATM transactions was founded at the Bank-Verlag in 1986.
    ■ In 1988 authorisation was migrated to Tandem, creating the first active-active application.
    ■ In the following years we moved through Cyclone, CLX, CLX2000, K10000, K20000, S7000, S70000 and S72000, and finally to S86000
    ■ 2005 we moved to Integrity NonStop
    ■ 2010 the secondary datacentre was moved to a new location
    ■ 2012 we migrated our production systems to NonStop blades
    ■ Today we are the IT service provider for the Private Banks in Germany
  • 3. The start
    ■ Bank-Verlag was using a commercial monitoring tool
    ■ Management decided to replace that tool with open source Nagios for all Windows, Unix and Linux systems
    ■ Nagios should be used for NonStop systems as well
    ■ Problem: No open source monitoring tool for NonStop was available that fulfilled our needs
    ■ Decision: We will have to create something ourselves!
  • 4. Some basic decisions
    ■ The main purpose is monitoring our NonStop systems
    ■ Feeding Nagios with information should be a result of that
    ■ The open source world is changing quickly, so we should be able to support any other tool with few changes
    ■ The NonStop monitoring should not depend on any external tool
    ■ The messages should not require in-depth NonStop knowledge
    ■ Avoid manual configuration wherever possible
  • 5. Our approach
    ■ We have a bunch of "subsystems" like CPU, Pathway, Lines, NetBatch and so on
    ■ Every subsystem has its own monitoring module
    ■ Every module collects all available configuration information automatically, for example:
      ■ NetBatch module collects all information concerning NetBatch jobs and calendars
      ■ Line module collects all lines
    ■ Some modules need additional configuration data:
      ■ File module needs the filesets to check
      ■ EMS module needs the messages to look for
  • 6. Our approach
    ■ Every module has a "refresh configuration" function
    ■ Every module is configurable with parameters, and every parameter has a default
    ■ If an event is found that could be handled by the toolbox, it should be handled by the toolbox
      ■ File is getting full => perform a reload or increase maxextents
      ■ A static Pathway server is down => issue a START command
      ■ A process is consuming too many CPU cycles => reduce priority
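The event-to-action rule on slide 6 can be pictured as a small dispatch table. What follows is only a minimal Python sketch, not the toolbox's actual code: the `Event` class, the event kinds and the returned action texts are hypothetical placeholders, and the strings are descriptions rather than real NonStop commands.

```python
from dataclasses import dataclass


@dataclass
class Event:
    kind: str    # hypothetical event type, e.g. "FILE_FULL"
    target: str  # object the event refers to (file, server, process)


# Hypothetical mapping from event type to an automatic reaction.
# The strings only describe the action; they are not real commands.
ACTIONS = {
    "FILE_FULL": lambda e: f"reload or increase maxextents: {e.target}",
    "SERVER_DOWN": lambda e: f"start Pathway server: {e.target}",
    "CPU_HOG": lambda e: f"reduce priority: {e.target}",
}


def handle(event):
    """Return the automatic action for an event, or None if nobody handles it."""
    action = ACTIONS.get(event.kind)
    return action(event) if action else None
```

Events without an entry in the table fall through with `None`, which in this sketch stands for "report only, no automatic action".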
  • 7. Our approach
    ■ Another goal was avoiding manual tasks we do not like
      ■ Regular reloads
      ■ Checking backups
      ■ Checking database contents
    ■ Collect statistical data
      ■ Line usage
      ■ File sizes
      ■ CPU usage
      ■ TMF rate
    ■ Create documentation about the configuration of the system
  • 8. Our approach
    ■ We want to make information available to people not familiar with NonStop systems
      ■ The X.25 line with the calling address 12345678 is connected to the SWAN box with the "S77" sticker on Clip 1 line 0
      ■ The TCP/IP connection with the address 192.168.77.77 is configured on the controller in slot 2.4 on "D" and the port has the MAC address 08.00.12.34.56
    ■ This should be database information that is accessible and usable without any detailed NonStop knowledge
    ■ Reports of installed hardware should be understandable without knowledge of HP product numbers
  • 9. The Start
    ■ First subsystem was "CPU and processes"
    ■ Development based on some already available programs
    ■ The CPU and process monitoring program should not write any disk files
    ■ Create the tools to maintain the appropriate tables including the long-term data collection
    ■ Create a central message collector reading the tables and formatting the messages
    ■ Continue with the other subsystems
  • 10. The next steps
    ■ Decision to build the software like a product
    ■ A great advantage when distributing the software to our 4 (currently 6) systems
    ■ Design of a central message handling program
    ■ Avoid any hard-coded messages
    ■ A side effect: The toolbox supports multiple languages
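Avoiding hard-coded messages leads naturally to multi-language support, because the text lives in a template table rather than in the program. The following Python sketch illustrates the idea under assumptions: the message id `FILE_FULL`, the template texts and the fallback to English are invented for illustration and are not taken from the toolbox.

```python
# Hypothetical template table keyed by (message id, language).
TEMPLATES = {
    ("FILE_FULL", "en"): "File {name} is {percent}% full",
    ("FILE_FULL", "de"): "Datei {name} ist zu {percent}% voll",
}


def format_message(msg_id, lang, **values):
    """Fill the template for the requested language, falling back to English."""
    template = TEMPLATES.get((msg_id, lang)) or TEMPLATES[(msg_id, "en")]
    return template.format(**values)
```

Adding a language then means adding rows to the table, with no change to the monitoring modules.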
  • 11. Available subsystems
    ■ CPU and processes (incl. automatic restart of processes *)
    ■ Lines
    ■ Pathway
    ■ Files incl. automatic reload *
    ■ TMF
    ■ RDF
    ■ NetBatch
    ■ Devices
    ■ TCP/IP
    ■ Spooler
    ■ EMS messages *
    ■ Message collector
    ■ Backups *
    * = configuration required
  • 12. [Architecture diagram. Components shown: CPU and process monitoring, Restart monitor, Subsystem modules, Database interface, Configuration tables, Message templates, Event tables, Message collector, Message table, TCP/IP interface]
  • 13. Some additional information
    ■ The original monitoring toolbox is based on SQL tables
    ■ An Enscribe version is in progress
    ■ The toolbox does not depend on Measure; Measure is only used to find the originator of a heavy disk load
    ■ The toolbox causes very little CPU load
    ■ The collected statistical data allows lots of reports using standard tools like Excel
  • 14. Advantages
    ■ Keep track of hardware changes such as the exchange of disks
    ■ No need for additional software like Measure
    ■ Software runs "out of the box" without a need for additional configuration
    ■ Lots of parameters and table entries for configuration available
    ■ The software supports multiple languages; at the moment the messages are available in German and English
    ■ Bank-Verlag is not a vendor but a user; we are using the software ourselves
    ■ Very limited commercial interest in selling the software
  • 15. Advantages during daily life
    ■ Reloads are carried out automatically if needed
    ■ Processes causing heavy disk load are found (Measure required!)
    ■ The priority of processes using too many CPU cycles can be automatically reduced
    ■ Pathway servers can be automatically restarted
    ■ Missing processes can be restarted automatically
    ■ Existence of required processes can be checked
    ■ The whole system including all the applications can be started this way!
  • 16. Advantages during daily life
    ■ Batch jobs and calendars are checked periodically.
    ■ If a calendar is expiring, a message is issued a few days before expiration
    ■ The outcome of all backup jobs is checked
    ■ Disk problems are checked periodically including
      ■ Number of ZZSA files
      ■ Status of OSS filesets
  • 17. Advantages during daily life
    ■ Files matching predefined filesets are checked for files running full
    ■ If a file is too full, it is automatically checked for a possible reload, or the maxextents are increased
    ■ All configured files are periodically reloaded if necessary
    ■ Whether a reload is necessary is decided depending on slack and fragmentation
    ■ All needed parameters can be defined globally, for a fileset or even for a single file.
    ■ The need for manual reloads has been reduced to zero
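The three parameter levels on slide 17 (global, fileset, single file) suggest a most-specific-wins lookup. The Python sketch below shows one way this could work; the parameter names, file names and values are hypothetical, and `fnmatchcase` wildcards merely stand in for real NonStop fileset patterns, whose syntax differs.

```python
from fnmatch import fnmatchcase

# Hypothetical configuration on three levels; all values are invented.
GLOBAL = {"max_full_percent": 90, "max_slack_percent": 30}
FILESET = {"$DATA.APPL.*": {"max_full_percent": 80}}
FILE = {"$DATA.APPL.TRANSLOG": {"max_full_percent": 70}}


def resolve(filename, name):
    """Look up a parameter: single file first, then fileset, then global."""
    if name in FILE.get(filename, {}):
        return FILE[filename][name]
    for pattern, params in FILESET.items():
        if fnmatchcase(filename, pattern) and name in params:
            return params[name]
    return GLOBAL[name]
```

With this ordering, a single exceptional file can be tuned without touching the fileset or the global defaults.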
  • 18. Interesting problems
    ■ The status of TCP/IP connections can be checked
    ■ You need 2 established connections from your $ZB000 (192.168.77.77) to 192.168.88.88 port 1234.
    ■ If at least one of these connections is down, a message is created
    ■ The cause for that might be an erroneously changed firewall configuration
    ■ The same feature has been implemented for X.25 connections
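The connection check on slide 18 boils down to comparing a required number of established connections with what is actually there. A minimal Python sketch of that comparison follows; the data layout (tuples of local address, remote address and port) is an assumption, and how the list of established connections is obtained on the NonStop is deliberately left out.

```python
from collections import Counter


def missing_connections(required, established):
    """Return how many connections are missing per required endpoint.

    required:    {(local, remote, port): required count}
    established: list of (local, remote, port) tuples currently established
    """
    seen = Counter(established)
    return {
        key: need - seen[key]
        for key, need in required.items()
        if seen[key] < need
    }
```

For the slide's example, `required` would hold `("192.168.77.77", "192.168.88.88", 1234)` with a count of 2; an empty result means no message has to be created.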
  • 19. A real-life case concerning TCP/IP
    ■ Our NonStop is accessing another server through a firewall
    ■ There have to be 2 established connections on port 4711
    ■ A rule within the firewall was erroneously changed
    ■ The NonStop could no longer establish a new connection to the server
    ■ The already established connections were not affected
    ■ The real problem only appeared weeks later, when one of the connections had to be reestablished
    ■ The monitoring tool found the missing connection immediately
  • 20. Another problem
    ■ We have a leased line to another provider
    ■ The line is using the X.25 protocol
    ■ During peak hours we had some problems on the line
    ■ Using the statistical data we found out that the capacity of the line was exceeded
    ■ Increasing the speed immediately solved all problems
  • 21. Security issues
    ■ Safeguard reports erroneous logons
    ■ Safeguard does not report the external origin of such a logon, like the IP address
    ■ We read the Safeguard log and add that information
    ■ So the question "Where did the Administrator logon to the NonStop come from?" can be answered by a look at our table
  • 22. Application monitoring
    ■ There are 2 kinds of application monitoring:
      ■ Checking database contents
      ■ Checking application messages
    ■ The database contents are checked using SQL statements of the type "SELECT COUNT(*) FROM … WHERE … BROWSE ACCESS;"
    ■ The result is compared against given values and a message is created if necessary
    ■ The severity of the messages can be set depending on the result like:
      ■ 1 found => Warning
      ■ 2 found => Error
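The mapping from a row count to a message severity can be sketched in a few lines of Python. This is an illustration only: the threshold table mirrors the slide's example (1 found => Warning, 2 found => Error), while the "OK" result for a count of zero and the treatment of counts above 2 as errors are assumptions.

```python
# Threshold table taken from the slide's example: (minimum count, severity).
THRESHOLDS = [(1, "WARNING"), (2, "ERROR")]


def severity(count, thresholds=THRESHOLDS):
    """Return the severity of the highest threshold the count has reached."""
    result = "OK"  # assumption: nothing found means no message
    for minimum, level in sorted(thresholds):
        if count >= minimum:
            result = level
    return result
```

The `count` argument would be the value returned by the `SELECT COUNT(*)` statement; running the query itself is outside this sketch.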
  • 23. Checking EMS messages
    ■ Our applications are using EMS collectors to report any errors
    ■ We are able to check the number of messages per type per time period
    ■ A sample message would be "Timeout process $ABCD"; process $ABCD is routing messages to XY-Bank
    ■ We define a message containing "Timeout" and "$ABCD" as "Timeout to XY-BANK" and count those messages per period
    ■ A message is created depending on the configured threshold for this type of message
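The classification and counting described on slide 23 can be sketched as follows. This Python example is an assumption-laden illustration: the rule format (a tuple of keywords mapped to a label) and the threshold of 3 messages per period are invented; only the "Timeout" / "$ABCD" example comes from the slide.

```python
from collections import Counter

# Rule from the slide: a message containing all keywords gets the label.
RULES = [(("Timeout", "$ABCD"), "Timeout to XY-BANK")]
# Hypothetical threshold: messages of this type per period that trigger a message.
THRESHOLDS = {"Timeout to XY-BANK": 3}


def classify(text):
    """Return the configured label for an EMS message text, or None."""
    for keywords, label in RULES:
        if all(keyword in text for keyword in keywords):
            return label
    return None


def alerts(messages):
    """Count labelled messages of one period and return labels over threshold."""
    counts = Counter(label for label in map(classify, messages) if label)
    return [label for label, n in counts.items() if n >= THRESHOLDS[label]]
```

Calling `alerts` once per period with the messages collected in that period yields the types for which a message has to be created.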
  • 24. An idea for EMS message handling
    ■ We are handling authorisation requests for credit and debit cards; most of these requests are sent to the card-issuing banks
    ■ We are creating minute-based statistics of those requests per issuer
    ■ If an issuer has problems we can create a message like "60% of the requests unsuccessful"
    ■ Now the message handling gets this information and handles it according to the configuration:
      ■ 1 message within 10 minutes => no need for action
      ■ 10 messages within 10 minutes => create an alarm
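The "10 messages within 10 minutes" rule is a sliding-window count. The Python sketch below shows one possible implementation of that rule; the class name and the use of plain minute numbers as timestamps are hypothetical, and the window and alarm values are the ones from the slide.

```python
from collections import deque


class AlarmWindow:
    """Raise an alarm when enough messages arrive within a time window."""

    def __init__(self, window_minutes=10, alarm_count=10):
        self.window = window_minutes
        self.alarm_count = alarm_count
        self.times = deque()

    def add(self, minute):
        """Record a message at the given minute; return True if an alarm is due."""
        self.times.append(minute)
        # Drop messages that have fallen out of the window.
        while self.times and self.times[0] <= minute - self.window:
            self.times.popleft()
        return len(self.times) >= self.alarm_count
```

A single message per window never reaches the alarm count, so it causes no action, which matches the first configuration line on the slide.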
  • 25. Our main Nagios screen for NonStop
  • 26. Our main Nagios screen for NonStop with error message
  • 27. Any questions?
    Wolfgang Breidbach
    Bank-Verlag GmbH, IT-Services
    Wendelinstr. 1, 50933 Köln
    E-Mail: Wolfgang.Breidbach@Bank-Verlag.de
    www.Bank-Verlag.de