1. Anti-corruption How to prevent Firebird database corruption Alexey Kovyazin, IBSurgeon, ak@ib-aid.com
2. IBSurgeon – 8 years! Products: IBFirstAID/FBFirstAID, etc., FBScanner, FBDataGuard. Clients: Carl Zeiss Meditec, USA; Vneshtorgbank, Russia; Wells Fargo Bank, USA; Watermark Software, UK; Bas-X, Australia; Victoria University, New Zealand; Kingsway Management, UK. Team: Dmitry Kuzmenko, Alexey Kovyazin, Sergey Nikitin, Oleg Mateveev & team. Consultants: Dmitry Yemanov, Vlad Khorsun, Alex Peshkoff. Partners: IBPhoenix
3. Alexey Kovyazin Hosting & cloud partners Yes, it’s me! http://ru.linkedin.com/in/kovyazin 2006 2008 2009 2010 In 2007 we sold 3mln of Delphi to all Russian schools
4. Agenda. Why bother? Why does corruption happen? Reasons. Symptoms. What to monitor to recognize a problem? Problems with the server. Problems with the environment. Problems with the database. Maintenance improvements to prevent corruption. Backups. Why did we create FBDataGuard?
5. Why bother? Firebird databases become bigger and bigger every year. Information inside Firebird can cost $XXXXXX. An outage (corruption, backup/restore breaks) can cost $XXXXX too. Real-world examples: Bas-X, Australia – Firebird 2.x, 250 Gb, no BLOBs, 250 users; Watermark Software, UK – Firebird 2.x, up to 400 Gb, with BLOBs; Profitmed, Russia, medical distribution – Firebird 1.5, 65 Gb, 250 users; 1 Terabyte Firebird 2.1 database (http://www.ib-aid.com/articles/item104) – 3.8 billion records in the biggest table.
6. Corruption reasons & symptoms. Reasons: misadministration, hardware failures, bugs. The actual reason often remains undiscovered. Symptoms are repeatable, trackable, complementary. We can prevent corruption if we see its symptoms.
8. What to monitor at Firebird instance level. General parameters: is the server online? How much RAM? Temp files: size (Mb) and number. Logs: records to analyze (6 levels), size of logs. Server version: is it recommended? Bugs, issues.
9. Firebird instance key parameters: server availability, consumed RAM, temp files size, temp files quantity, records in logs, logs’ size, server version related issues. Watch these 7 key parameters, which can indicate possible or actual problems.
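The first of these parameters, server availability, can be probed with a plain TCP connection. This is a minimal sketch, assuming the server listens on the default Firebird port 3050; the function name and the timeout are illustrative. It only shows that the port accepts connections, not that databases can be opened.

```python
import socket


def is_server_online(host: str = "localhost", port: int = 3050,
                     timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to the Firebird port can be opened."""
    try:
        # create_connection raises OSError (refused, timeout, unreachable).
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    print("Firebird port reachable:", is_server_online())
```

A scheduled job could call this every minute and raise an alert after several consecutive failures, so that a single network glitch is not reported as an outage.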
10. What to monitor-1. General database checks: database availability -> outages, firewalls, stability. Log records related to the database in firebird.log -> early symptoms. Check metadata – validate all metadata -> early showings. Transactions: transaction markers monitoring (garbage problems); limit (2 billion transactions between backup/restore). Users: min/max/avg users -> peak-load problems, design of application.
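Transaction markers can be read from the header page that `gstat -h` prints. The sketch below parses that text and flags large gaps; the thresholds (`max_gap`, `limit_share`) are assumptions for illustration, not recommended values, and the limit is taken as 2^31 - 1 to match the "2 billion" figure above.

```python
import re

MARKERS = ("Oldest transaction", "Oldest active",
           "Oldest snapshot", "Next transaction")
TRANSACTION_LIMIT = 2**31 - 1  # the "2 billion" limit, assumed signed 32-bit


def parse_markers(gstat_header: str) -> dict:
    """Extract the four transaction markers from `gstat -h` output."""
    found = {}
    for line in gstat_header.splitlines():
        for name in MARKERS:
            match = re.match(rf"\s*{name}\s+(\d+)\s*$", line)
            if match:
                found[name] = int(match.group(1))
    return found


def transaction_warnings(markers: dict, max_gap: int = 100_000,
                         limit_share: float = 0.8) -> list:
    """Return human-readable warnings; thresholds are illustrative."""
    warnings = []
    next_tx = markers["Next transaction"]
    if next_tx - markers["Oldest transaction"] > max_gap:
        warnings.append("oldest interesting transaction is stuck: garbage grows")
    if next_tx - markers["Oldest active"] > max_gap:
        warnings.append("long-running active transaction detected")
    if next_tx > limit_share * TRANSACTION_LIMIT:
        warnings.append("transaction counter is close to the limit: "
                        "plan a backup/restore")
    return warnings
```

In practice the input would come from running `gstat -h` against the database on a schedule and storing the markers, so that the trend is visible and not only the current values.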
11. What to monitor-2. Database files: single-volume and multi-volume -> volumes in bin. Paths – where files are stored (not on the same drive as temp files and backups!). Sizes and growth limits -> warnings about growth. Delta files (nbackup): lifetime and sizes -> huge/aged delta problems. Backup files: existence, sizes and growth limits -> backup could kill database.
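A warning about growth needs a prognosis. As a rough sketch, assuming size samples are collected periodically and growth is close to linear, the time left before a file reaches its limit can be estimated like this (function and parameter names are hypothetical):

```python
def days_until_limit(samples, limit_bytes):
    """Estimate days until the file reaches limit_bytes.

    samples: list of (day_number, size_in_bytes), oldest first.
    Returns 0.0 if the limit is already reached, None if there is
    too little data or the file is not growing.
    """
    if len(samples) < 2:
        return None
    (first_day, first_size), (last_day, last_size) = samples[0], samples[-1]
    if last_size >= limit_bytes:
        return 0.0
    elapsed = last_day - first_day
    if elapsed <= 0 or last_size <= first_size:
        return None
    rate = (last_size - first_size) / elapsed  # bytes per day
    return (limit_bytes - last_size) / rate


GB = 1024 ** 3
history = [(0, 10 * GB), (1, 11 * GB), (2, 12 * GB)]
print(days_until_limit(history, 20 * GB))
```

The same estimate applies to database volumes, delta files and backup files; only the limit differs (volume size, free disk space, or an agreed maximum).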
12. What to monitor-3. Number of formats per table: no more than 255, exceeding the limit -> corruption. Keep fewer formats in production: many formats -> performance problem. Non-activated and deactivated indices: deactivated – explicitly deactivated (why deactivated?); non-activated – indicates problems during restore.
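The number of formats per table can be read from the system tables `RDB$FORMATS` and `RDB$RELATIONS`. The sketch below keeps the query as a string and evaluates the rows separately, so it does not depend on a particular database driver; the `warn_at` threshold is an assumption.

```python
# Highest format number recorded for each table.
FORMATS_SQL = """
SELECT r.RDB$RELATION_NAME, MAX(f.RDB$FORMAT)
FROM RDB$RELATIONS r
JOIN RDB$FORMATS f ON f.RDB$RELATION_ID = r.RDB$RELATION_ID
GROUP BY r.RDB$RELATION_NAME
"""

FORMAT_LIMIT = 255  # limit named on the slide


def format_warnings(rows, warn_at=200):
    """rows: iterable of (relation_name, max_format) from FORMATS_SQL."""
    warnings = []
    for name, max_format in rows:
        table = name.strip()  # system CHAR columns are space-padded
        if max_format >= FORMAT_LIMIT:
            warnings.append(f"{table}: format limit {FORMAT_LIMIT} reached")
        elif max_format >= warn_at:
            warnings.append(
                f"{table}: {max_format} formats, approaching {FORMAT_LIMIT}")
    return warnings


rows = [("ORDERS    ", 255), ("CUSTOMERS ", 210), ("ITEMS     ", 3)]
for warning in format_warnings(rows):
    print(warning)
```

A backup/restore cycle resets the format counters, which is why the check belongs next to the backup schedule.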
13. What to monitor-4. Periodical statistics (gstat) -> deep look into database. Firebird server version: examples – problems with nbackup; latest patches are recommended. Firebird fbclient.dll version: if fbclient.dll <> fbserver -> problems (disconnects, 10054, errors). Firebird installation size: default database place is %Firebird%\bin. Firebird logs size and paths: big logs quickly exhaust space -> corruption.
14. Maintenance-1. Backups: Revolver backups (daily, weekly, monthly copies). Backup depth. Checking restore (need to check results). Growth prognosis (if not enough space, backup should be canceled). Control backup time (too long a backup indicates problems). (Diagram labels: Today, Yesterday, 5..7, Weekly.)
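The revolver idea (daily, weekly and monthly copies) can be expressed as a retention rule. This is a sketch under assumed depths (7 daily, 4 weekly, 12 monthly); the slide does not fix these numbers, and the function names are hypothetical.

```python
from datetime import date, timedelta


def newest_per_bucket(dates_desc, key, count):
    """Keep the newest date of each bucket for the `count` newest buckets."""
    kept = {}
    for day in dates_desc:
        bucket = key(day)
        if bucket not in kept:
            if len(kept) == count:
                break
            kept[bucket] = day
    return set(kept.values())


def backups_to_keep(backup_dates, daily=7, weekly=4, monthly=12):
    """Return the set of backup dates the rotation should retain."""
    ordered = sorted(set(backup_dates), reverse=True)
    keep = set(ordered[:daily])  # most recent daily copies
    keep |= newest_per_bucket(ordered, lambda d: tuple(d.isocalendar())[:2], weekly)
    keep |= newest_per_bucket(ordered, lambda d: (d.year, d.month), monthly)
    return keep


all_backups = [date(2024, 3, 31) - timedelta(days=i) for i in range(60)]
for day in sorted(backups_to_keep(all_backups, monthly=2)):
    print(day)
```

Everything not in the returned set is a candidate for deletion, but only after the newest backup has passed a checking restore.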
15. Big database requires individual maintenance plan. Maintenance plan depends on size of database and work mode (8x5, 24x7). Backup scheme is not simple. Perform test restores separately. To be checked: Errors – in firebird.log, and run error-checking queries on the live database. Metadata – check integrity. Metadata limits. Data & BLOBs – walk through data, check segmentation. Indices – check indices health. Transactions – any gaps, garbage growth, other problems.
17. Example of backup plan for big Firebird database. (Diagram: Main server, Maintenance server, Firebird database, nbackup copy, checking restore, gbak -b.) And each step should be confirmed and reported.
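"Confirmed and reported" can be sketched as a runner that executes the steps in order, records the outcome of each, and stops at the first failure. The paths in `EXAMPLE_PLAN` are hypothetical, credentials are omitted, and the exact order and placement of steps on the two servers is an assumption, not the plan from the diagram.

```python
import subprocess

# Hypothetical plan: paths are placeholders, credentials are omitted.
EXAMPLE_PLAN = [
    ("nbackup copy", ["nbackup", "-B", "0", "/data/main.fdb", "/copy/main.nbk"]),
    ("gbak backup", ["gbak", "-b", "/data/main.fdb", "/backup/main.fbk"]),
    ("checking restore", ["gbak", "-c", "/backup/main.fbk", "/check/main.fdb"]),
]


def run_steps(steps):
    """Run (name, command) pairs in order; stop at the first failure.

    Returns a report: list of (name, succeeded, captured_output).
    """
    report = []
    for name, command in steps:
        try:
            result = subprocess.run(command, capture_output=True, text=True)
            ok = result.returncode == 0
            output = (result.stderr or result.stdout).strip()
        except OSError as error:  # e.g. the tool is not installed
            ok, output = False, str(error)
        report.append((name, ok, output))
        if not ok:
            break  # later steps depend on earlier ones
    return report


def print_report(report):
    for name, ok, output in report:
        print(f"{name}: {'OK' if ok else 'FAILED'} {output}")
```

An exit code of zero is a necessary check but not a sufficient one: the output of the restore step should also be scanned for warnings before the backup is trusted.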
18. Maintenance-2. Indices: recalculate indices statistics -> performance (selected or excluded). Check index status – active/inactive/non-activated -> problems, corruptions. Check physical index health -> early showings of corruptions.
19. Maintenance-3. Validate database with gfix (don’t forget to shut down the database). Analysis (including firebird.log). Metadata validation: check important system tables. Firebird.log maintenance: when the log becomes very big, copy it to backup log files. And some more things…
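Recalculating statistics for selected or excluded indices comes down to generating `SET STATISTICS INDEX` statements. A minimal sketch, assuming the index names were read from `RDB$INDICES` beforehand and need no quoting:

```python
# Names of active indices (inactive ones have RDB$INDEX_INACTIVE = 1).
ACTIVE_INDICES_SQL = """
SELECT RDB$INDEX_NAME FROM RDB$INDICES
WHERE COALESCE(RDB$INDEX_INACTIVE, 0) = 0
"""


def statistics_statements(index_names, selected=None, excluded=()):
    """Build SET STATISTICS statements for the chosen indices.

    selected: if given, only these indices are processed.
    excluded: indices to skip in any case.
    """
    names = [name.strip() for name in index_names]  # CHAR columns are padded
    if selected is not None:
        wanted = set(selected)
        names = [name for name in names if name in wanted]
    skipped = set(excluded)
    return [f"SET STATISTICS INDEX {name};"
            for name in names if name not in skipped]


for statement in statistics_statements(
        ["IDX_ORDERS_DATE   ", "IDX_ORDERS_CUST   ", "IDX_LOG_TS        "],
        excluded=["IDX_LOG_TS"]):
    print(statement)
```

The generated script can be run through isql during a quiet period.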
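The firebird.log maintenance step can be sketched as follows: when the log exceeds a threshold, copy it to a timestamped backup file and empty the original. The 10 Mb threshold and the naming scheme are assumptions. Truncating a file the server may be writing to can lose a few lines, so this is an illustration of the idea, not a hardened procedure.

```python
import shutil
import time
from pathlib import Path


def rotate_log(log_path, max_bytes=10 * 1024 * 1024, now=None):
    """Copy an oversized log to '<name>.<timestamp>' and empty the original.

    Returns the backup path, or None if no rotation was needed.
    """
    log = Path(log_path)
    if not log.exists() or log.stat().st_size <= max_bytes:
        return None
    stamp = time.strftime("%Y%m%d-%H%M%S", time.localtime(now))
    backup = log.with_name(f"{log.name}.{stamp}")
    shutil.copy2(log, backup)  # keep the old records for later analysis
    log.write_text("")         # truncate the live log
    return backup
```

Old backup logs should themselves be limited in number, otherwise the rotation only moves the disk space problem to another file.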
20. And this is not enough! Business wants a guarantee: even if hardware fails, data should be recovered!
21. A big job. Implement scripts. Check them in the test environment. Explore error messages and codes of Firebird. We spent 6 years getting the necessary information…
23. FBDataGuard does all of the above… Watches database files, volumes, deltas; performs and checks backups in the right way. Verifies metadata, data and indices. Watches for errors, limits and wrong versions. Sends alerts and recommendations.
24. Example with TEMP. FBDataGuard found the temp files size = N. Free space at TEMP locations = M. Not enough space: M – N < X. Administrator will get an alert and a recommendation to increase TEMP.
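The TEMP rule can be written down directly. This sketch follows the condition on the slide (M – N < X); how FBDataGuard implements it is not shown in the talk, and the `fb_*` file name pattern is an assumption that should be adjusted to the actual temp file names on the server.

```python
import shutil
from pathlib import Path


def temp_files_size(temp_dir, pattern="fb_*"):
    """Total size N of temp files in temp_dir (pattern is an assumption)."""
    return sum(f.stat().st_size
               for f in Path(temp_dir).glob(pattern) if f.is_file())


def temp_space_alert(temp_size_n, free_space_m, reserve_x):
    """True when the slide's condition M - N < X holds."""
    return free_space_m - temp_size_n < reserve_x


def check_temp(temp_dir, reserve_x):
    """Measure N and M for one TEMP location and apply the rule."""
    n = temp_files_size(temp_dir)
    m = shutil.disk_usage(temp_dir).free
    return temp_space_alert(n, m, reserve_x)
```

With several TEMP locations, the check would be run for each one, or N and M summed across them, depending on how the locations are configured.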
25. Example alert Too big temporary files Total size of all temporary files 3 Gb is more than recommended: 500 Mb Firebird creates temporary files for some SQL queries (PLAN SORT). Too big size of temporary files can indicate performance problems. This is not a strictly defined number, so this threshold depends on particular database and application.
26. Index problem example. Non-activated indices usually indicate corruption (missed Foreign Keys). FBDataGuard found a non-activated index after restore. Administrator will get an alert and a recommendation to check indices. Possible performance problem prevented!
27. Example of backup problem resolution. FBDataGuard found the backup size = M. FBDataGuard found free space at backups’ disk = N. Not enough space: M >= N. Backup cancelled, database status is set to Critical, administrator got alert. Corruption of backup was prevented!
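The same decision can be sketched in a few lines: estimate the next backup size from the previous one, compare it with the free space, and cancel before starting. The 10% growth margin and the status strings are assumptions for illustration.

```python
import shutil


def backup_decision(last_backup_bytes, backup_dir,
                    growth_margin=0.1, free_bytes=None):
    """Decide whether a backup may start.

    M is estimated as the last backup size plus a growth margin;
    N is the free space on the backup disk (measured unless given).
    Returns (status, message).
    """
    expected_m = int(last_backup_bytes * (1 + growth_margin))
    free_n = (shutil.disk_usage(backup_dir).free
              if free_bytes is None else free_bytes)
    if expected_m >= free_n:
        return ("Critical",
                f"backup cancelled: need about {expected_m} bytes, "
                f"only {free_n} free")
    return ("OK", f"backup may start: {free_n - expected_m} bytes to spare")


status, message = backup_decision(50 * 1024 ** 3, ".")
print(status, message)
```

Cancelling up front matters because a backup that runs out of space halfway leaves a truncated file that looks like a backup but cannot be restored.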
28. Example of backup problem alert Job backup@[ server-0000 / db-0000 ] malfunction Unexpected job backup@[ server-0000 / db-0000 ] error: There is not enough space on the disk
30. Hardware and UNDELETE failures HDD corruption Flash-drive corruption UNDELETE problem
31. And even more – protects from hardware failures. Metadata repository. FBDataGuard Extractor extracts data (tables data, BLOBs) from the corrupted database and inserts it into the new DB.
32. Firebird DataGuard. Watches 26 important database and server parameters. Alerts for potential and real problems by email. Proper automation of database maintenance. Windows, Linux, MacOS; Firebird 1.5-2.1 (not 2.5 yet). Special licensing for ISV (Independent Software Vendors), Firebird developers.
33. Get FBDataGuard 1 year Free 1 year license for all attendees Send request to dataguard@ib-aid.com