SlideShare una empresa de Scribd logo
1 de 43
Descargar para leer sin conexión
Recovery of lost or corrupted
InnoDB tables
Yet another reason to use InnoDB
OpenSQL Camp 2010, Sankt Augustin
Aleksandr.Kuzminsky@Percona.com
Percona Inc.
http://MySQLPerformanceBlog.com
Agenda
1. InnoDB format overview
2. Internal system tables SYS_INDEXES and SYS_TABLES
3. InnoDB Primary and Secondary keys
4. Typical failure scenarios
5. InnoDB recovery tool
-2-/43
Three things are certain:
Death, taxes and lost data.
Guess which has occurred?
1. InnoDB format overview
-3-/43
Recovery of lost or corrupted InnoDB tables
How MySQL stores data in InnoDB
1. A table space (ibdata1)
• System tablespace(data dictionary, undo, insert buffer, etc.)
• PRIMARY indices (PK + data)
• SECONDARY indices (SK + PK) If the key is (f1, f2) it is
stored as (f1, f2, PK)
1. file per table (.ibd)
• PRIMARY index
• SECONDARY indices
1. InnoDB pages size 16k (uncompressed)
2. Every index is identified by index_id
-4-/43
Recovery of lost or corrupted InnoDB tables
-5-/43
Recovery of lost or corrupted InnoDB tables
How MySQL stores data in InnoDB
Page identifier index_id
TABLE: name test/site_folders, id 0 119, columns 9, indexes 1, appr.rows 1
COLUMNS: id: DATA_INT len 4 prec 0; name: type 12 len 765 prec 0; sites_count: DATA_INT
len 4 prec 0;
created_at: DATA_INT len 8 prec 0; updated_at: DATA_INT len 8 prec 0;
DB_ROW_ID: DATA_SYS prtype 256 len 6 prec 0; DB_TRX_ID: DATA_SYS prtype 257
len 6 prec 0;
DB_ROLL_PTR: DATA_SYS prtype 258 len 7 prec 0;
INDEX: name PRIMARY, id 0 254, fields 1/7, type 3
root page 271, appr.key vals 1, leaf pages 1, size pages 1
FIELDS: id DB_TRX_ID DB_ROLL_PTR name sites_count created_at updated_at
mysql> CREATE TABLE innodb_table_monitor(x int)
engine=innodb
Error log:
-6-/43
Recovery of lost or corrupted InnoDB tables
InnoDB page format
FIL HEADER
PAGE_HEADER
INFINUM+SUPREMUM RECORDS
USER RECORDS
FREE SPACE
Page Directory
Fil Trailer
-7-/43
Recovery of lost or corrupted InnoDB tables
InnoDB page format
Fil Header
Name Si
ze
Remarks
FIL_PAGE_SPACE 4 4 ID of the space the page is in
FIL_PAGE_OFFSET 4 ordinal page number from start of space
FIL_PAGE_PREV 4 offset of previous page in key order
FIL_PAGE_NEXT 4 offset of next page in key order
FIL_PAGE_LSN 8 log serial number of page's latest log record
FIL_PAGE_TYPE 2 current defined types are: FIL_PAGE_INDEX, FIL_PAGE_UNDO_LOG,
FIL_PAGE_INODE, FIL_PAGE_IBUF_FREE_LIST
FIL_PAGE_FILE_FLU
SH_LSN
8 "the file has been flushed to disk at least up to this lsn" (log serial number), valid
only on the first page of the file
FIL_PAGE_ARCH_LO
G_NO
4 the latest archived log file number at the time that FIL_PAGE_FILE_FLUSH_LSN was
written (in the log)
Data are stored in
FIL_PAGE_INDEX
-8-/43
Recovery of lost or corrupted InnoDB tables
InnoDB page format
Page Header
Name Size Remarks
PAGE_N_DIR_SLOTS 2 number of directory slots in the Page Directory part; initial value = 2
PAGE_HEAP_TOP 2 record pointer to first record in heap
PAGE_N_HEAP 2 number of heap records; initial value = 2
PAGE_FREE 2 record pointer to first free record
PAGE_GARBAGE 2 "number of bytes in deleted records"
PAGE_LAST_INSERT 2 record pointer to the last inserted record
PAGE_DIRECTION 2 either PAGE_LEFT, PAGE_RIGHT, or PAGE_NO_DIRECTION
PAGE_N_DIRECTION 2 number of consecutive inserts in the same direction, e.g. "last 5 were all to the left"
PAGE_N_RECS 2 number of user records
PAGE_MAX_TRX_ID 8 the highest ID of a transaction which might have changed a record on the page (only set for secondary
indexes)
PAGE_LEVEL 2 level within the index (0 for a leaf page)
PAGE_INDEX_ID 8 identifier of the index the page belongs to
PAGE_BTR_SEG_LEA
F
10 "file segment header for the leaf pages in a B-tree" (this is irrelevant here)
PAGE_BTR_SEG_TOP 10 "file segment header for the non-leaf pages in a B-tree" (this is irrelevant here)
index_id
Highest bit is row format(1
-COMPACT, 0 - REDUNDANT )
-9-/43
Recovery of lost or corrupted InnoDB tables
How to check row format?
• The highest bit of the PAGE_N_HEAP from the
page header
• 0 stands for version REDUNDANT, 1 - for COMACT
• dc -e "2o `hexdump –C d pagefile | grep
00000020 | awk '{ print $12}'` p" | sed
's/./& /g' | awk '{ print $1}'
-10-/43
Recovery of lost or corrupted InnoDB tables
Rows in an InnoDB page
•Rows in a single pages is a linked list
•The first record INFIMUM
•The last record SUPREMUM
•Sorted by Primary key
next infimum
0 supremu
mnext 10
0
data...
next 10
1
data...
next 10
2
data...
next 10
3
data...
-11-/43
Recovery of lost or corrupted InnoDB tables
Records are saved in insert order
insert into t1 values(10, 'aaa');
insert into t1 values(30, 'ccc');
insert into t1 values(20, 'bbb');
JG....................N<E.......
................................
.............................2..
...infimum......supremum......6.
........)....2..aaa.............
...*....2..ccc.... ...........+.
...2..bbb.......................
................................
-12-/43
Recovery of lost or corrupted InnoDB tables
Row format
EXAMPLE:
CREATE TABLE `t1` (
`ID` int(11) unsigned NOT NULL,
`NAME` varchar(120),
`N_FIELDS` int(10),
PRIMARY KEY (`ID`)
) ENGINE=InnoDB DEFAULT
CHARSET=latin1
Name Size
Field Start Offsets (F*1) or (F*2) bytes
Extra Bytes
6 bytes (5 bytes if
COMPACT format)
Field Contents depends on content
-13-/43
Recovery of lost or corrupted InnoDB tables
InnoDB page format (REDUNDANT)
Extra bytes
Name Size Description
record_status 2 bits _ORDINARY, _NODE_PTR, _INFIMUM, _SUPREMUM
deleted_flag 1 bit 1 if record is deleted
min_rec_flag 1 bit 1 if record is predefined minimum record
n_owned 4 bits number of records owned by this record
heap_no 13 bits record's order number in heap of index page
n_fields 10 bits number of fields in this record, 1 to 1023
1byte_offs_flag 1 bit 1 if each Field Start Offsets is 1 byte long (this item is also called the "short" flag)
next 16 bits 16 bits pointer to next record in page
-14-/43
Recovery of lost or corrupted InnoDB tables
InnoDB page format (COMPACT)
Extra bytes
Name Size, bits Description
record_status
deleted_flag
min_rec_flag
4 4 bits used to delete mark a record, and mark a predefined
minimum record in alphabetical order
n_owned 4 the number of records owned by this record
(this term is explained in page0page.h)
heap_no 13 the order number of this record in the
heap of the index page
record type 3 000=conventional, 001=node pointer (inside B-tree),
010=infimum, 011=supremum, 1xx=reserved
next 16 bits 16 a relative pointer to the next record in the page
-15-/43
Recovery of lost or corrupted InnoDB tables
REDUNDANT
A row: (10, ‘abcdef’, 20)
4 6 7
Actualy stored as: (10, TRX_ID, PTR_ID, ‘abcdef’, 20)
6 4
Field Offsets
…. next
Extra 6 bytes:
0x00 00 00 0A
record_status
deleted_flag
min_rec_flag
n_owned
heap_no
n_fields
1byte_offs_flag
Fields
... ... abcdef 0x80 00 00 14
-16-/43
Recovery of lost or corrupted InnoDB tables
COMPACT
A row: (10, ‘abcdef’, 20)
6 NULLS
Actualy stored as: (10, TRX_ID, PTR_ID, ‘abcdef’, 20)
Field Offsets
…. next
Extra 5 bytes:
0x00 00 00 0A
Fields
... ... abcdef 0x80 00 00 14
A bit per NULL-
able field
-17-/43
Recovery of lost or corrupted InnoDB tables
Data types
INT types (fixed-size)
String types
• VARCHAR(x) – variable-size
• CHAR(x) – fixed-size, variable-size if UTF-8
DECIMAL
• Stored in strings before 5.0.3, variable in size
• Binary format after 5.0.3, fixed-size.
-18-/43
Recovery of lost or corrupted InnoDB tables
BLOB and other long fields
• Field length (so called offset) is one or two byte long
• Page size is 16k
• If record size < (UNIV_PAGE_SIZE/2-200) == ~7k
– the record is stored internally (in a PK page)
• Otherwise – 768 bytes internally, the rest in an
external page
-19-/43
2. Internal system tables
SYS_INDEXES and
SYS_TABLES
-20-/43
Recovery of lost or corrupted InnoDB tables
Why are SYS_* tables needed?
• Correspondence “table name” -> “index_id”
• Storage for other internal information
-21-/43
Recovery of lost or corrupted InnoDB tables
How MySQL stores data in InnoDB
SYS_TABLES and SYS_INDEXES
Always REDUNDANT format!
CREATE TABLE `SYS_INDEXES` (
`TABLE_ID` bigint(20) unsigned NOT NULL
default '0',
`ID` bigint(20) unsigned NOT NULL default
'0',
`NAME` varchar(120) default NULL,
`N_FIELDS` int(10) unsigned default NULL,
`TYPE` int(10) unsigned default NULL,
`SPACE` int(10) unsigned default NULL,
`PAGE_NO` int(10) unsigned default NULL,
PRIMARY KEY (`TABLE_ID`,`ID`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1
CREATE TABLE `SYS_TABLES` (
`NAME` varchar(255) NOT NULL default '',
`ID` bigint(20) unsigned NOT NULL default
'0',
`N_COLS` int(10) unsigned default NULL,
`TYPE` int(10) unsigned default NULL,
`MIX_ID` bigint(20) unsigned default NULL,
`MIX_LEN` int(10) unsigned default NULL,
`CLUSTER_NAME` varchar(255) default NULL,
`SPACE` int(10) unsigned default NULL,
PRIMARY KEY (`NAME`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1
index_id = 0-3 index_id = 0-1
Name:
PRIMARY
GEN_CLUSTER_ID
or unique index name
-22-/43
Recovery of lost or corrupted InnoDB tables
How MySQL stores data in InnoDB
NAME ID …
"archive/msg_store" 40 8 1 0 0 NULL 0
"archive/msg_store" 40 8 1 0 0 NULL 0
"archive/msg_store" 40 8 1 0 0 NULL 0
TABLE_ID ID NAME …
40 196389 "PRIMARY" 2 3 0 21031026
40 196390 "msg_hash" 1 0 0 21031028
SYS_TABLES
SYS_INDEXES
Example:
-23-/43
3. InnoDB Primary and
Secondary keys
-24-/43
Recovery of lost or corrupted InnoDB tables
Primary key
The table:
CREATE TABLE `t1` (
`ID` int(11),
`NAME` varchar(120),
`N_FIELDS` int(10),
PRIMARY KEY (`ID`),
KEY `NAME` (`NAME`)
) ENGINE=InnoDB DEFAULT
CHARSET=latin1
Fields in the PK:
1. ID
2. DB_TRX_ID
3. DB_ROLL_PTR
4. NAME
5. N_FIELDS
-25-/43
Recovery of lost or corrupted InnoDB tables
Secondary key
The table:
CREATE TABLE `t1` (
`ID` int(11),
`NAME` varchar(120),
`N_FIELDS` int(10),
PRIMARY KEY (`ID`),
KEY `NAME` (`NAME`)
) ENGINE=InnoDB DEFAULT
CHARSET=latin1
Fields in the SK:
1. NAME
2. ID  Primary key
-26-/43
4. Typical failure scenarios
-27-/43
Recovery of lost or corrupted InnoDB tables
Deleted records
• DELETE FROM table WHERE id = 5;
• Forgotten WHERE clause
Band-aid:
• Stop/kill mysqld ASAP
-28-/43
Recovery of lost or corrupted InnoDB tables
How delete is performed?
"row/row0upd.c“:
“…/* How is a delete performed?...The delete is
performed by setting the delete bit in the record and
substituting the id of the deleting transaction for the
original trx id, and substituting a new roll ptr for
previous roll ptr. The old trx id and roll ptr are saved
in the undo log record.
Thus, no physical changes occur in the index tree
structure at the time of the delete. Only when the
undo log is purged, the index records will be
physically deleted from the index trees.…”
-29-/43
Recovery of lost or corrupted InnoDB tables
Dropped table/database
• DROP TABLE table;
• DROP DATABASE database;
• Often happens when restoring from SQL dump
• Bad because .FRM file goes away
• Especially painful when innodb_file_per_table
Band-aid:
• Stop/kill mysqld ASAP
• Stop IO on an HDD or mount read-only or take a raw
image
-30-/43
Recovery of lost or corrupted InnoDB tables
Corrupted InnoDB tablespace
• Hardware failures
• OS or filesystem failures
• InnoDB bugs
• Corrupted InnoDB tablespace by other processes
Band-aid:
• Stop mysqld
• Take a copy of InnoDB files
-31-/43
Recovery of lost or corrupted InnoDB tables
Wrong UPDATE statement
• UPDATE user SET Password =
PASSWORD(‘qwerty’) WHERE User=‘root’;
• Again forgotten WHERE clause
• Bad because changes are applied in a PRIMARY
index immediately
• Old version goes to UNDO segment
Band-aid:
• Stop/kill mysqld ASAP
-32-/43
5. InnoDB recovery tool
-33-/43
Recovery of lost or corrupted InnoDB tables
Recovery prerequisites
1. Media
1. ibdata1
2. *.ibd
3. HDD image
2. Tables structure
1. SQL dump
2. *.FRM files
-34-/43
Recovery of lost or corrupted InnoDB tables
table_defs.h
{ /* int(11) unsigned */
name: “ID",
type: FT_UINT,
fixed_length: 4,
has_limits: TRUE,
limits: {
can_be_null: FALSE,
uint_min_val: 0,
uint_max_val: 4294967295ULL
},
can_be_null: FALSE
},
{ /* varchar(120) */
name: "NAME",
type: FT_CHAR,
min_length: 0,
max_length: 120,
has_limits: TRUE,
limits: {
can_be_null: TRUE,
char_min_len: 0,
char_max_len: 120,
char_ascii_only: TRUE
},
can_be_null: TRUE
},
• generated by create_defs.pl
-35-/43
Recovery of lost or corrupted InnoDB tables
How to get CREATE info from .frm files
1. CREATE TABLE t1 (id int) Engine=INNODB;
2. Replace t1.frm with the one’s you need to get
scheme
3. Run “show create table t1”
If mysqld crashes
1. See the end of bvi t1.frm :
.ID.NAME.N_FIELDS..
2. *.FRM viewer !TODO
-36-/43
Recovery of lost or corrupted InnoDB tables
InnoDB recovery tool
http://launchpad.net/percona-innodb-recovery-tool/
1. Written in Percona
2. Contributed by Percona and community
3. Supported by Percona
Consists of two major tools
•page_parser – splits InnoDB tablespace into 16k pages
•constraints_parser – scans a page and finds good records
-37-/43
Recovery of lost or corrupted InnoDB tables
InnoDB recovery tool
server# ./page_parser -4 -f /var/lib/mysql/ibdata1
Opening file: /var/lib/mysql/ibdata1
Read data from fn=3...
Read page #0.. saving it to pages-1259793800/0-18219008/0-00000000.page
Read page #1.. saving it to pages-1259793800/0-0/1-00000001.page
Read page #2.. saving it to pages-1259793800/4294967295-65535/2-00000002.page
Read page #3.. saving it to pages-1259793800/0-0/3-00000003.page
page_parser
In the recent versions the InnoDB pages are
sorted by page type
-38-/43
Recovery of lost or corrupted InnoDB tables
Page signature check
0{....0...4...4......=..E.
......
........<..~...A.......|..
......
..........................
......
...infimum......supremumf.
....qT
M/T/196001834/XXXXX
XXXXXXXXXXX L
X X X X X X X X X
XXXXXXXXXXXXX
1. INFIMUM and SUPREMUM
records are in fixed positions
2. Works with corrupted pages
-39-/43
Recovery of lost or corrupted InnoDB tables
InnoDB recovery tool
server# ./constraints_parser -4 -f pages-1259793800/0-16/51-00000051.page
constraints_parser
Table structure is defined in "include/table_defs.h"
See HOWTO for details
http://code.google.com/p/innodb-tools/wiki/InnodbRecoveryHowto
Filters inside table_defs.h are very important (if InnoDB
page is corrupted)
-40-/43
Recovery of lost or corrupted InnoDB tables
Check InnoDB page before reading recs
#./constraints_parser -5 -U -f pages/0-418/12665-00012665.page -V
Initializing table definitions...
Processing table: document_type_fieldsets_link
- total fields: 5
- nullable fields: 0
- minimum header size: 5
- minimum rec size: 25
- maximum rec size: 25
Read data from fn=3...
Page id: 12665
Checking a page
Infimum offset: 0x63
Supremum offset: 0x70
Next record at offset: 0x9F (159)
Next record at offset: 0xB0 (176)
Next record at offset: 0x3D95 (15765)
…
Next record at offset: 0x70 (112)
Page is good
1. Check if the tool can follow
all records by addresses
2. If so, find a rec. exactly at
the position where the
record is.
3. Helps a lot for COMPACT
format!
-41-/43
Recovery of lost or corrupted InnoDB tables
Import result
t1 1 "browse" 10
t1 2 "dashboard" 20
t1 3 "addFolder" 18
t1 4 "editFolder" 15
mysql> LOAD DATA INFILE '/path/to/datafile'
REPLACE INTO TABLE <table_name>
FIELDS TERMINATED BY 't' OPTIONALLY
ENCLOSED BY '"'
LINES STARTING BY '<table_name>t' ;
-42-/43
Questions ?
Thank you for coming!
• References
• http://www.mysqlperformanceblog.com/
• http://percona.com/
• http://www.slideshare.net/akuzminsky
-43-/43

Más contenido relacionado

La actualidad más candente (10)

Rdbms day3
Rdbms day3Rdbms day3
Rdbms day3
 
Sq lite module7
Sq lite module7Sq lite module7
Sq lite module7
 
Tables in dd
Tables in ddTables in dd
Tables in dd
 
Marc 21
Marc 21Marc 21
Marc 21
 
presentation on MARC21 Standard Bibliography for LibMS
presentation on MARC21 Standard Bibliography for LibMSpresentation on MARC21 Standard Bibliography for LibMS
presentation on MARC21 Standard Bibliography for LibMS
 
Marc
MarcMarc
Marc
 
Diving into MARC
Diving into MARCDiving into MARC
Diving into MARC
 
Dervy bis-155-week-3-quiz-new
Dervy bis-155-week-3-quiz-newDervy bis-155-week-3-quiz-new
Dervy bis-155-week-3-quiz-new
 
MARC21
MARC21MARC21
MARC21
 
Dervy bis 155 week 3 quiz new
Dervy   bis 155 week 3 quiz newDervy   bis 155 week 3 quiz new
Dervy bis 155 week 3 quiz new
 

Similar a Recover lost or corrupted InnoDB tables

Recovery of lost or corrupted inno db tables(mysql uc 2010)
Recovery of lost or corrupted inno db tables(mysql uc 2010)Recovery of lost or corrupted inno db tables(mysql uc 2010)
Recovery of lost or corrupted inno db tables(mysql uc 2010)Aleksandr Kuzminsky
 
cPanelCon 2014: InnoDB Anatomy
cPanelCon 2014: InnoDB AnatomycPanelCon 2014: InnoDB Anatomy
cPanelCon 2014: InnoDB AnatomyRyan Robson
 
Inno db internals innodb file formats and source code structure
Inno db internals innodb file formats and source code structureInno db internals innodb file formats and source code structure
Inno db internals innodb file formats and source code structurezhaolinjnu
 
MySQL Space Management
MySQL Space ManagementMySQL Space Management
MySQL Space ManagementMIJIN AN
 
Plmce 14 be a_hero_16x9_final
Plmce 14 be a_hero_16x9_finalPlmce 14 be a_hero_16x9_final
Plmce 14 be a_hero_16x9_finalMarco Tusa
 
InnoDB Internal
InnoDB InternalInnoDB Internal
InnoDB Internalmysqlops
 
Inno Db Internals Inno Db File Formats And Source Code Structure
Inno Db Internals Inno Db File Formats And Source Code StructureInno Db Internals Inno Db File Formats And Source Code Structure
Inno Db Internals Inno Db File Formats And Source Code StructureMySQLConference
 
Page Cache in Linux 2.6.pdf
Page Cache in Linux 2.6.pdfPage Cache in Linux 2.6.pdf
Page Cache in Linux 2.6.pdfycelgemici1
 
MySQL innoDB split and merge pages
MySQL innoDB split and merge pagesMySQL innoDB split and merge pages
MySQL innoDB split and merge pagesMarco Tusa
 
Star schema my sql
Star schema   my sqlStar schema   my sql
Star schema my sqldeathsubte
 
cPanelCon 2015: InnoDB Alchemy
cPanelCon 2015: InnoDB AlchemycPanelCon 2015: InnoDB Alchemy
cPanelCon 2015: InnoDB AlchemyRyan Robson
 
Data Warehouse and Business Intelligence - Recipe 2
Data Warehouse and Business Intelligence - Recipe 2Data Warehouse and Business Intelligence - Recipe 2
Data Warehouse and Business Intelligence - Recipe 2Massimo Cenci
 
Clase 12 manejo indices modificada
Clase 12 manejo indices   modificadaClase 12 manejo indices   modificada
Clase 12 manejo indices modificadaTitiushko Jazz
 
Clase 12 manejo indices modificada
Clase 12 manejo indices   modificadaClase 12 manejo indices   modificada
Clase 12 manejo indices modificadaTitiushko Jazz
 
0104 abap dictionary
0104 abap dictionary0104 abap dictionary
0104 abap dictionaryvkyecc1
 
How to use Parquet as a basis for ETL and analytics
How to use Parquet as a basis for ETL and analyticsHow to use Parquet as a basis for ETL and analytics
How to use Parquet as a basis for ETL and analyticsJulien Le Dem
 
[db tech showcase Tokyo 2017] C23: Lessons from SQLite4 by SQLite.org - Richa...
[db tech showcase Tokyo 2017] C23: Lessons from SQLite4 by SQLite.org - Richa...[db tech showcase Tokyo 2017] C23: Lessons from SQLite4 by SQLite.org - Richa...
[db tech showcase Tokyo 2017] C23: Lessons from SQLite4 by SQLite.org - Richa...Insight Technology, Inc.
 
(140625) #fitalk sq lite 소개와 구조 분석
(140625) #fitalk   sq lite 소개와 구조 분석(140625) #fitalk   sq lite 소개와 구조 분석
(140625) #fitalk sq lite 소개와 구조 분석INSIGHT FORENSIC
 
15 Ways to Kill Your Mysql Application Performance
15 Ways to Kill Your Mysql Application Performance15 Ways to Kill Your Mysql Application Performance
15 Ways to Kill Your Mysql Application Performanceguest9912e5
 

Similar a Recover lost or corrupted InnoDB tables (20)

Recovery of lost or corrupted inno db tables(mysql uc 2010)
Recovery of lost or corrupted inno db tables(mysql uc 2010)Recovery of lost or corrupted inno db tables(mysql uc 2010)
Recovery of lost or corrupted inno db tables(mysql uc 2010)
 
Data recovery talk on PLUK
Data recovery talk on PLUKData recovery talk on PLUK
Data recovery talk on PLUK
 
cPanelCon 2014: InnoDB Anatomy
cPanelCon 2014: InnoDB AnatomycPanelCon 2014: InnoDB Anatomy
cPanelCon 2014: InnoDB Anatomy
 
Inno db internals innodb file formats and source code structure
Inno db internals innodb file formats and source code structureInno db internals innodb file formats and source code structure
Inno db internals innodb file formats and source code structure
 
MySQL Space Management
MySQL Space ManagementMySQL Space Management
MySQL Space Management
 
Plmce 14 be a_hero_16x9_final
Plmce 14 be a_hero_16x9_finalPlmce 14 be a_hero_16x9_final
Plmce 14 be a_hero_16x9_final
 
InnoDB Internal
InnoDB InternalInnoDB Internal
InnoDB Internal
 
Inno Db Internals Inno Db File Formats And Source Code Structure
Inno Db Internals Inno Db File Formats And Source Code StructureInno Db Internals Inno Db File Formats And Source Code Structure
Inno Db Internals Inno Db File Formats And Source Code Structure
 
Page Cache in Linux 2.6.pdf
Page Cache in Linux 2.6.pdfPage Cache in Linux 2.6.pdf
Page Cache in Linux 2.6.pdf
 
MySQL innoDB split and merge pages
MySQL innoDB split and merge pagesMySQL innoDB split and merge pages
MySQL innoDB split and merge pages
 
Star schema my sql
Star schema   my sqlStar schema   my sql
Star schema my sql
 
cPanelCon 2015: InnoDB Alchemy
cPanelCon 2015: InnoDB AlchemycPanelCon 2015: InnoDB Alchemy
cPanelCon 2015: InnoDB Alchemy
 
Data Warehouse and Business Intelligence - Recipe 2
Data Warehouse and Business Intelligence - Recipe 2Data Warehouse and Business Intelligence - Recipe 2
Data Warehouse and Business Intelligence - Recipe 2
 
Clase 12 manejo indices modificada
Clase 12 manejo indices   modificadaClase 12 manejo indices   modificada
Clase 12 manejo indices modificada
 
Clase 12 manejo indices modificada
Clase 12 manejo indices   modificadaClase 12 manejo indices   modificada
Clase 12 manejo indices modificada
 
0104 abap dictionary
0104 abap dictionary0104 abap dictionary
0104 abap dictionary
 
How to use Parquet as a basis for ETL and analytics
How to use Parquet as a basis for ETL and analyticsHow to use Parquet as a basis for ETL and analytics
How to use Parquet as a basis for ETL and analytics
 
[db tech showcase Tokyo 2017] C23: Lessons from SQLite4 by SQLite.org - Richa...
[db tech showcase Tokyo 2017] C23: Lessons from SQLite4 by SQLite.org - Richa...[db tech showcase Tokyo 2017] C23: Lessons from SQLite4 by SQLite.org - Richa...
[db tech showcase Tokyo 2017] C23: Lessons from SQLite4 by SQLite.org - Richa...
 
(140625) #fitalk sq lite 소개와 구조 분석
(140625) #fitalk   sq lite 소개와 구조 분석(140625) #fitalk   sq lite 소개와 구조 분석
(140625) #fitalk sq lite 소개와 구조 분석
 
15 Ways to Kill Your Mysql Application Performance
15 Ways to Kill Your Mysql Application Performance15 Ways to Kill Your Mysql Application Performance
15 Ways to Kill Your Mysql Application Performance
 

Más de Arvids Godjuks

Uc2010 xtra backup-hot-backups-and-more
Uc2010 xtra backup-hot-backups-and-moreUc2010 xtra backup-hot-backups-and-more
Uc2010 xtra backup-hot-backups-and-moreArvids Godjuks
 
Alexandr Makarov - PHP framework Yii
Alexandr Makarov - PHP framework YiiAlexandr Makarov - PHP framework Yii
Alexandr Makarov - PHP framework YiiArvids Godjuks
 
Zabbix - an important part of your IT infrastructure
Zabbix - an important part of your IT infrastructureZabbix - an important part of your IT infrastructure
Zabbix - an important part of your IT infrastructureArvids Godjuks
 
PHP libevent Daemons. A high performance and reliable solution. Practical exp...
PHP libevent Daemons. A high performance and reliable solution. Practical exp...PHP libevent Daemons. A high performance and reliable solution. Practical exp...
PHP libevent Daemons. A high performance and reliable solution. Practical exp...Arvids Godjuks
 
Google Analytics: smart use
Google Analytics: smart useGoogle Analytics: smart use
Google Analytics: smart useArvids Godjuks
 
High Availability Solutions with MySQL
High Availability Solutions with MySQLHigh Availability Solutions with MySQL
High Availability Solutions with MySQLArvids Godjuks
 
Detecting logged in user's abnormal activity
Detecting logged in user's abnormal activityDetecting logged in user's abnormal activity
Detecting logged in user's abnormal activityArvids Godjuks
 

Más de Arvids Godjuks (8)

Uc2010 xtra backup-hot-backups-and-more
Uc2010 xtra backup-hot-backups-and-moreUc2010 xtra backup-hot-backups-and-more
Uc2010 xtra backup-hot-backups-and-more
 
Alexandr Makarov - PHP framework Yii
Alexandr Makarov - PHP framework YiiAlexandr Makarov - PHP framework Yii
Alexandr Makarov - PHP framework Yii
 
Zabbix - an important part of your IT infrastructure
Zabbix - an important part of your IT infrastructureZabbix - an important part of your IT infrastructure
Zabbix - an important part of your IT infrastructure
 
PHP libevent Daemons. A high performance and reliable solution. Practical exp...
PHP libevent Daemons. A high performance and reliable solution. Practical exp...PHP libevent Daemons. A high performance and reliable solution. Practical exp...
PHP libevent Daemons. A high performance and reliable solution. Practical exp...
 
System markup
System markupSystem markup
System markup
 
Google Analytics: smart use
Google Analytics: smart useGoogle Analytics: smart use
Google Analytics: smart use
 
High Availability Solutions with MySQL
High Availability Solutions with MySQLHigh Availability Solutions with MySQL
High Availability Solutions with MySQL
 
Detecting logged in user's abnormal activity
Detecting logged in user's abnormal activityDetecting logged in user's abnormal activity
Detecting logged in user's abnormal activity
 

Último

Potential of AI (Generative AI) in Business: Learnings and Insights
Potential of AI (Generative AI) in Business: Learnings and InsightsPotential of AI (Generative AI) in Business: Learnings and Insights
Potential of AI (Generative AI) in Business: Learnings and InsightsRavi Sanghani
 
The State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxThe State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxLoriGlavin3
 
React Native vs Ionic - The Best Mobile App Framework
React Native vs Ionic - The Best Mobile App FrameworkReact Native vs Ionic - The Best Mobile App Framework
React Native vs Ionic - The Best Mobile App FrameworkPixlogix Infotech
 
Time Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directionsTime Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directionsNathaniel Shimoni
 
Glenn Lazarus- Why Your Observability Strategy Needs Security Observability
Glenn Lazarus- Why Your Observability Strategy Needs Security ObservabilityGlenn Lazarus- Why Your Observability Strategy Needs Security Observability
Glenn Lazarus- Why Your Observability Strategy Needs Security Observabilityitnewsafrica
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxLoriGlavin3
 
QCon London: Mastering long-running processes in modern architectures
QCon London: Mastering long-running processes in modern architecturesQCon London: Mastering long-running processes in modern architectures
QCon London: Mastering long-running processes in modern architecturesBernd Ruecker
 
MuleSoft Online Meetup Group - B2B Crash Course: Release SparkNotes
MuleSoft Online Meetup Group - B2B Crash Course: Release SparkNotesMuleSoft Online Meetup Group - B2B Crash Course: Release SparkNotes
MuleSoft Online Meetup Group - B2B Crash Course: Release SparkNotesManik S Magar
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity PlanDatabarracks
 
Abdul Kader Baba- Managing Cybersecurity Risks and Compliance Requirements i...
Abdul Kader Baba- Managing Cybersecurity Risks  and Compliance Requirements i...Abdul Kader Baba- Managing Cybersecurity Risks  and Compliance Requirements i...
Abdul Kader Baba- Managing Cybersecurity Risks and Compliance Requirements i...itnewsafrica
 
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxA Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxLoriGlavin3
 
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...Wes McKinney
 
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxPasskey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxLoriGlavin3
 
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxThe Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxLoriGlavin3
 
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024BookNet Canada
 
UiPath Community: Communication Mining from Zero to Hero
UiPath Community: Communication Mining from Zero to HeroUiPath Community: Communication Mining from Zero to Hero
UiPath Community: Communication Mining from Zero to HeroUiPathCommunity
 
Design pattern talk by Kaya Weers - 2024 (v2)
Design pattern talk by Kaya Weers - 2024 (v2)Design pattern talk by Kaya Weers - 2024 (v2)
Design pattern talk by Kaya Weers - 2024 (v2)Kaya Weers
 
Top 10 Hubspot Development Companies in 2024
Top 10 Hubspot Development Companies in 2024Top 10 Hubspot Development Companies in 2024
Top 10 Hubspot Development Companies in 2024TopCSSGallery
 
Testing tools and AI - ideas what to try with some tool examples
Testing tools and AI - ideas what to try with some tool examplesTesting tools and AI - ideas what to try with some tool examples
Testing tools and AI - ideas what to try with some tool examplesKari Kakkonen
 
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxUse of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxLoriGlavin3
 

Último (20)

Potential of AI (Generative AI) in Business: Learnings and Insights
Potential of AI (Generative AI) in Business: Learnings and InsightsPotential of AI (Generative AI) in Business: Learnings and Insights
Potential of AI (Generative AI) in Business: Learnings and Insights
 
The State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxThe State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptx
 
React Native vs Ionic - The Best Mobile App Framework
React Native vs Ionic - The Best Mobile App FrameworkReact Native vs Ionic - The Best Mobile App Framework
React Native vs Ionic - The Best Mobile App Framework
 
Time Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directionsTime Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directions
 
Glenn Lazarus- Why Your Observability Strategy Needs Security Observability
Glenn Lazarus- Why Your Observability Strategy Needs Security ObservabilityGlenn Lazarus- Why Your Observability Strategy Needs Security Observability
Glenn Lazarus- Why Your Observability Strategy Needs Security Observability
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
 
QCon London: Mastering long-running processes in modern architectures
QCon London: Mastering long-running processes in modern architecturesQCon London: Mastering long-running processes in modern architectures
QCon London: Mastering long-running processes in modern architectures
 
MuleSoft Online Meetup Group - B2B Crash Course: Release SparkNotes
MuleSoft Online Meetup Group - B2B Crash Course: Release SparkNotesMuleSoft Online Meetup Group - B2B Crash Course: Release SparkNotes
MuleSoft Online Meetup Group - B2B Crash Course: Release SparkNotes
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity Plan
 
Abdul Kader Baba- Managing Cybersecurity Risks and Compliance Requirements i...
Abdul Kader Baba- Managing Cybersecurity Risks  and Compliance Requirements i...Abdul Kader Baba- Managing Cybersecurity Risks  and Compliance Requirements i...
Abdul Kader Baba- Managing Cybersecurity Risks and Compliance Requirements i...
 
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxA Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
 
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...
 
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxPasskey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
 
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxThe Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
 
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
 
UiPath Community: Communication Mining from Zero to Hero
UiPath Community: Communication Mining from Zero to HeroUiPath Community: Communication Mining from Zero to Hero
UiPath Community: Communication Mining from Zero to Hero
 
Design pattern talk by Kaya Weers - 2024 (v2)
Design pattern talk by Kaya Weers - 2024 (v2)Design pattern talk by Kaya Weers - 2024 (v2)
Design pattern talk by Kaya Weers - 2024 (v2)
 
Top 10 Hubspot Development Companies in 2024
Top 10 Hubspot Development Companies in 2024Top 10 Hubspot Development Companies in 2024
Top 10 Hubspot Development Companies in 2024
 
Testing tools and AI - ideas what to try with some tool examples
Testing tools and AI - ideas what to try with some tool examplesTesting tools and AI - ideas what to try with some tool examples
Testing tools and AI - ideas what to try with some tool examples
 
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxUse of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
 

Recover lost or corrupted InnoDB tables

  • 1. Recovery of lost or corrupted InnoDB tables Yet another reason to use InnoDB OpenSQL Camp 2010, Sankt Augustin Aleksandr.Kuzminsky@Percona.com Percona Inc. http://MySQLPerformanceBlog.com
  • 2. Agenda 1. InnoDB format overview 2. Internal system tables SYS_INDEXES and SYS_TABLES 3. InnoDB Primary and Secondary keys 4. Typical failure scenarios 5. InnoDB recovery tool -2-/43 Three things are certain: Death, taxes and lost data. Guess which has occurred?
  • 3. 1. InnoDB format overview -3-/43
  • 4. Recovery of lost or corrupted InnoDB tables How MySQL stores data in InnoDB 1. A table space (ibdata1) • System tablespace(data dictionary, undo, insert buffer, etc.) • PRIMARY indices (PK + data) • SECONDARY indices (SK + PK) If the key is (f1, f2) it is stored as (f1, f2, PK) 1. file per table (.ibd) • PRIMARY index • SECONDARY indices 1. InnoDB pages size 16k (uncompressed) 2. Every index is identified by index_id -4-/43
  • 5. Recovery of lost or corrupted InnoDB tables -5-/43
  • 6. Recovery of lost or corrupted InnoDB tables How MySQL stores data in InnoDB Page identifier index_id TABLE: name test/site_folders, id 0 119, columns 9, indexes 1, appr.rows 1 COLUMNS: id: DATA_INT len 4 prec 0; name: type 12 len 765 prec 0; sites_count: DATA_INT len 4 prec 0; created_at: DATA_INT len 8 prec 0; updated_at: DATA_INT len 8 prec 0; DB_ROW_ID: DATA_SYS prtype 256 len 6 prec 0; DB_TRX_ID: DATA_SYS prtype 257 len 6 prec 0; DB_ROLL_PTR: DATA_SYS prtype 258 len 7 prec 0; INDEX: name PRIMARY, id 0 254, fields 1/7, type 3 root page 271, appr.key vals 1, leaf pages 1, size pages 1 FIELDS: id DB_TRX_ID DB_ROLL_PTR name sites_count created_at updated_at mysql> CREATE TABLE innodb_table_monitor(x int) engine=innodb Error log: -6-/43
  • 7. Recovery of lost or corrupted InnoDB tables InnoDB page format FIL HEADER PAGE_HEADER INFINUM+SUPREMUM RECORDS USER RECORDS FREE SPACE Page Directory Fil Trailer -7-/43
  • 8. Recovery of lost or corrupted InnoDB tables InnoDB page format Fil Header Name Si ze Remarks FIL_PAGE_SPACE 4 4 ID of the space the page is in FIL_PAGE_OFFSET 4 ordinal page number from start of space FIL_PAGE_PREV 4 offset of previous page in key order FIL_PAGE_NEXT 4 offset of next page in key order FIL_PAGE_LSN 8 log serial number of page's latest log record FIL_PAGE_TYPE 2 current defined types are: FIL_PAGE_INDEX, FIL_PAGE_UNDO_LOG, FIL_PAGE_INODE, FIL_PAGE_IBUF_FREE_LIST FIL_PAGE_FILE_FLU SH_LSN 8 "the file has been flushed to disk at least up to this lsn" (log serial number), valid only on the first page of the file FIL_PAGE_ARCH_LO G_NO 4 the latest archived log file number at the time that FIL_PAGE_FILE_FLUSH_LSN was written (in the log) Data are stored in FIL_PAGE_INDEX -8-/43
  • 9. Recovery of lost or corrupted InnoDB tables InnoDB page format Page Header Name Size Remarks PAGE_N_DIR_SLOTS 2 number of directory slots in the Page Directory part; initial value = 2 PAGE_HEAP_TOP 2 record pointer to first record in heap PAGE_N_HEAP 2 number of heap records; initial value = 2 PAGE_FREE 2 record pointer to first free record PAGE_GARBAGE 2 "number of bytes in deleted records" PAGE_LAST_INSERT 2 record pointer to the last inserted record PAGE_DIRECTION 2 either PAGE_LEFT, PAGE_RIGHT, or PAGE_NO_DIRECTION PAGE_N_DIRECTION 2 number of consecutive inserts in the same direction, e.g. "last 5 were all to the left" PAGE_N_RECS 2 number of user records PAGE_MAX_TRX_ID 8 the highest ID of a transaction which might have changed a record on the page (only set for secondary indexes) PAGE_LEVEL 2 level within the index (0 for a leaf page) PAGE_INDEX_ID 8 identifier of the index the page belongs to PAGE_BTR_SEG_LEA F 10 "file segment header for the leaf pages in a B-tree" (this is irrelevant here) PAGE_BTR_SEG_TOP 10 "file segment header for the non-leaf pages in a B-tree" (this is irrelevant here) index_id Highest bit is row format(1 -COMPACT, 0 - REDUNDANT ) -9-/43
  • 10. Recovery of lost or corrupted InnoDB tables How to check row format? • The highest bit of the PAGE_N_HEAP from the page header • 0 stands for version REDUNDANT, 1 - for COMACT • dc -e "2o `hexdump –C d pagefile | grep 00000020 | awk '{ print $12}'` p" | sed 's/./& /g' | awk '{ print $1}' -10-/43
  • 11. Recovery of lost or corrupted InnoDB tables Rows in an InnoDB page •Rows in a single pages is a linked list •The first record INFIMUM •The last record SUPREMUM •Sorted by Primary key next infimum 0 supremu mnext 10 0 data... next 10 1 data... next 10 2 data... next 10 3 data... -11-/43
  • 12. Recovery of lost or corrupted InnoDB tables Records are saved in insert order insert into t1 values(10, 'aaa'); insert into t1 values(30, 'ccc'); insert into t1 values(20, 'bbb'); JG....................N<E....... ................................ .............................2.. ...infimum......supremum......6. ........)....2..aaa............. ...*....2..ccc.... ...........+. ...2..bbb....................... ................................ -12-/43
  • 13. Recovery of lost or corrupted InnoDB tables Row format EXAMPLE: CREATE TABLE `t1` ( `ID` int(11) unsigned NOT NULL, `NAME` varchar(120), `N_FIELDS` int(10), PRIMARY KEY (`ID`) ) ENGINE=InnoDB DEFAULT CHARSET=latin1 Name Size Field Start Offsets (F*1) or (F*2) bytes Extra Bytes 6 bytes (5 bytes if COMPACT format) Field Contents depends on content -13-/43
  • 14. Recovery of lost or corrupted InnoDB tables InnoDB page format (REDUNDANT) Extra bytes Name Size Description record_status 2 bits _ORDINARY, _NODE_PTR, _INFIMUM, _SUPREMUM deleted_flag 1 bit 1 if record is deleted min_rec_flag 1 bit 1 if record is predefined minimum record n_owned 4 bits number of records owned by this record heap_no 13 bits record's order number in heap of index page n_fields 10 bits number of fields in this record, 1 to 1023 1byte_offs_flag 1 bit 1 if each Field Start Offsets is 1 byte long (this item is also called the "short" flag) next 16 bits 16 bits pointer to next record in page -14-/43
  • 15. Recovery of lost or corrupted InnoDB tables InnoDB page format (COMPACT) Extra bytes Name Size, bits Description record_status deleted_flag min_rec_flag 4 4 bits used to delete mark a record, and mark a predefined minimum record in alphabetical order n_owned 4 the number of records owned by this record (this term is explained in page0page.h) heap_no 13 the order number of this record in the heap of the index page record type 3 000=conventional, 001=node pointer (inside B-tree), 010=infimum, 011=supremum, 1xx=reserved next 16 bits 16 a relative pointer to the next record in the page -15-/43
  • 16. Recovery of lost or corrupted InnoDB tables REDUNDANT A row: (10, ‘abcdef’, 20) 4 6 7 Actualy stored as: (10, TRX_ID, PTR_ID, ‘abcdef’, 20) 6 4 Field Offsets …. next Extra 6 bytes: 0x00 00 00 0A record_status deleted_flag min_rec_flag n_owned heap_no n_fields 1byte_offs_flag Fields ... ... abcdef 0x80 00 00 14 -16-/43
  • 17. Recovery of lost or corrupted InnoDB tables COMPACT A row: (10, ‘abcdef’, 20) 6 NULLS Actualy stored as: (10, TRX_ID, PTR_ID, ‘abcdef’, 20) Field Offsets …. next Extra 5 bytes: 0x00 00 00 0A Fields ... ... abcdef 0x80 00 00 14 A bit per NULL- able field -17-/43
  • 18. Recovery of lost or corrupted InnoDB tables Data types INT types (fixed-size) String types • VARCHAR(x) – variable-size • CHAR(x) – fixed-size, variable-size if UTF-8 DECIMAL • Stored in strings before 5.0.3, variable in size • Binary format after 5.0.3, fixed-size. -18-/43
  • 19. Recovery of lost or corrupted InnoDB tables BLOB and other long fields • Field length (so called offset) is one or two byte long • Page size is 16k • If record size < (UNIV_PAGE_SIZE/2-200) == ~7k – the record is stored internally (in a PK page) • Otherwise – 768 bytes internally, the rest in an external page -19-/43
  • 20. 2. Internal system tables SYS_INDEXES and SYS_TABLES -20-/43
  • 21. Recovery of lost or corrupted InnoDB tables Why are SYS_* tables needed? • Correspondence “table name” -> “index_id” • Storage for other internal information -21-/43
  • 22. Recovery of lost or corrupted InnoDB tables How MySQL stores data in InnoDB SYS_TABLES and SYS_INDEXES Always REDUNDANT format! CREATE TABLE `SYS_INDEXES` ( `TABLE_ID` bigint(20) unsigned NOT NULL default '0', `ID` bigint(20) unsigned NOT NULL default '0', `NAME` varchar(120) default NULL, `N_FIELDS` int(10) unsigned default NULL, `TYPE` int(10) unsigned default NULL, `SPACE` int(10) unsigned default NULL, `PAGE_NO` int(10) unsigned default NULL, PRIMARY KEY (`TABLE_ID`,`ID`) ) ENGINE=InnoDB DEFAULT CHARSET=latin1 CREATE TABLE `SYS_TABLES` ( `NAME` varchar(255) NOT NULL default '', `ID` bigint(20) unsigned NOT NULL default '0', `N_COLS` int(10) unsigned default NULL, `TYPE` int(10) unsigned default NULL, `MIX_ID` bigint(20) unsigned default NULL, `MIX_LEN` int(10) unsigned default NULL, `CLUSTER_NAME` varchar(255) default NULL, `SPACE` int(10) unsigned default NULL, PRIMARY KEY (`NAME`) ) ENGINE=InnoDB DEFAULT CHARSET=latin1 index_id = 0-3 index_id = 0-1 Name: PRIMARY GEN_CLUSTER_ID or unique index name -22-/43
  • 23. Recovery of lost or corrupted InnoDB tables How MySQL stores data in InnoDB NAME ID … "archive/msg_store" 40 8 1 0 0 NULL 0 "archive/msg_store" 40 8 1 0 0 NULL 0 "archive/msg_store" 40 8 1 0 0 NULL 0 TABLE_ID ID NAME … 40 196389 "PRIMARY" 2 3 0 21031026 40 196390 "msg_hash" 1 0 0 21031028 SYS_TABLES SYS_INDEXES Example: -23-/43
  • 24. 3. InnoDB Primary and Secondary keys -24-/43
  • 25. Recovery of lost or corrupted InnoDB tables Primary key The table: CREATE TABLE `t1` ( `ID` int(11), `NAME` varchar(120), `N_FIELDS` int(10), PRIMARY KEY (`ID`), KEY `NAME` (`NAME`) ) ENGINE=InnoDB DEFAULT CHARSET=latin1 Fields in the PK: 1. ID 2. DB_TRX_ID 3. DB_ROLL_PTR 4. NAME 5. N_FIELDS -25-/43
  • 26. Recovery of lost or corrupted InnoDB tables Secondary key The table: CREATE TABLE `t1` ( `ID` int(11), `NAME` varchar(120), `N_FIELDS` int(10), PRIMARY KEY (`ID`), KEY `NAME` (`NAME`) ) ENGINE=InnoDB DEFAULT CHARSET=latin1 Fields in the SK: 1. NAME 2. ID  Primary key -26-/43
  • 27. 4. Typical failure scenarios -27-/43
  • 28. Recovery of lost or corrupted InnoDB tables Deleted records • DELETE FROM table WHERE id = 5; • Forgotten WHERE clause Band-aid: • Stop/kill mysqld ASAP -28-/43
  • 29. Recovery of lost or corrupted InnoDB tables How delete is performed? "row/row0upd.c“: “…/* How is a delete performed?...The delete is performed by setting the delete bit in the record and substituting the id of the deleting transaction for the original trx id, and substituting a new roll ptr for previous roll ptr. The old trx id and roll ptr are saved in the undo log record. Thus, no physical changes occur in the index tree structure at the time of the delete. Only when the undo log is purged, the index records will be physically deleted from the index trees.…” -29-/43
  • 30. Recovery of lost or corrupted InnoDB tables Dropped table/database • DROP TABLE table; • DROP DATABASE database; • Often happens when restoring from SQL dump • Bad because .FRM file goes away • Especially painful when innodb_file_per_table Band-aid: • Stop/kill mysqld ASAP • Stop IO on an HDD or mount read-only or take a raw image -30-/43
  • 31. Recovery of lost or corrupted InnoDB tables Corrupted InnoDB tablespace • Hardware failures • OS or filesystem failures • InnoDB bugs • Corrupted InnoDB tablespace by other processes Band-aid: • Stop mysqld • Take a copy of InnoDB files -31-/43
  • 32. Recovery of lost or corrupted InnoDB tables Wrong UPDATE statement • UPDATE user SET Password = PASSWORD(‘qwerty’) WHERE User=‘root’; • Again forgotten WHERE clause • Bad because changes are applied in a PRIMARY index immediately • Old version goes to UNDO segment Band-aid: • Stop/kill mysqld ASAP -32-/43
  • 33. 5. InnoDB recovery tool -33-/43
  • 34. Recovery of lost or corrupted InnoDB tables Recovery prerequisites 1. Media 1. ibdata1 2. *.ibd 3. HDD image 2. Tables structure 1. SQL dump 2. *.FRM files -34-/43
  • 35. Recovery of lost or corrupted InnoDB tables table_defs.h { /* int(11) unsigned */ name: “ID", type: FT_UINT, fixed_length: 4, has_limits: TRUE, limits: { can_be_null: FALSE, uint_min_val: 0, uint_max_val: 4294967295ULL }, can_be_null: FALSE }, { /* varchar(120) */ name: "NAME", type: FT_CHAR, min_length: 0, max_length: 120, has_limits: TRUE, limits: { can_be_null: TRUE, char_min_len: 0, char_max_len: 120, char_ascii_only: TRUE }, can_be_null: TRUE }, • generated by create_defs.pl -35-/43
  • 36. Recovery of lost or corrupted InnoDB tables How to get CREATE info from .frm files 1. CREATE TABLE t1 (id int) Engine=INNODB; 2. Replace t1.frm with the one’s you need to get scheme 3. Run “show create table t1” If mysqld crashes 1. See the end of bvi t1.frm : .ID.NAME.N_FIELDS.. 2. *.FRM viewer !TODO -36-/43
  • 37. Recovery of lost or corrupted InnoDB tables InnoDB recovery tool http://launchpad.net/percona-innodb-recovery-tool/ 1. Written in Percona 2. Contributed by Percona and community 3. Supported by Percona Consists of two major tools •page_parser – splits InnoDB tablespace into 16k pages •constraints_parser – scans a page and finds good records -37-/43
  • 38. Recovery of lost or corrupted InnoDB tables InnoDB recovery tool server# ./page_parser -4 -f /var/lib/mysql/ibdata1 Opening file: /var/lib/mysql/ibdata1 Read data from fn=3... Read page #0.. saving it to pages-1259793800/0-18219008/0-00000000.page Read page #1.. saving it to pages-1259793800/0-0/1-00000001.page Read page #2.. saving it to pages-1259793800/4294967295-65535/2-00000002.page Read page #3.. saving it to pages-1259793800/0-0/3-00000003.page page_parser In the recent versions the InnoDB pages are sorted by page type -38-/43
  • 39. Recovery of lost or corrupted InnoDB tables Page signature check 0{....0...4...4......=..E. ...... ........<..~...A.......|.. ...... .......................... ...... ...infimum......supremumf. ....qT M/T/196001834/XXXXX XXXXXXXXXXX L X X X X X X X X X XXXXXXXXXXXXX 1. INFIMUM and SUPREMUM records are in fixed positions 2. Works with corrupted pages -39-/43
  • 40. Recovery of lost or corrupted InnoDB tables InnoDB recovery tool server# ./constraints_parser -4 -f pages-1259793800/0-16/51-00000051.page constraints_parser Table structure is defined in "include/table_defs.h" See HOWTO for details http://code.google.com/p/innodb-tools/wiki/InnodbRecoveryHowto Filters inside table_defs.h are very important (if InnoDB page is corrupted) -40-/43
  • 41. Recovery of lost or corrupted InnoDB tables Check InnoDB page before reading recs #./constraints_parser -5 -U -f pages/0-418/12665-00012665.page -V Initializing table definitions... Processing table: document_type_fieldsets_link - total fields: 5 - nullable fields: 0 - minimum header size: 5 - minimum rec size: 25 - maximum rec size: 25 Read data from fn=3... Page id: 12665 Checking a page Infimum offset: 0x63 Supremum offset: 0x70 Next record at offset: 0x9F (159) Next record at offset: 0xB0 (176) Next record at offset: 0x3D95 (15765) … Next record at offset: 0x70 (112) Page is good 1. Check if the tool can follow all records by addresses 2. If so, find a rec. exactly at the position where the record is. 3. Helps a lot for COMPACT format! -41-/43
  • 42. Recovery of lost or corrupted InnoDB tables Import result t1 1 "browse" 10 t1 2 "dashboard" 20 t1 3 "addFolder" 18 t1 4 "editFolder" 15 mysql> LOAD DATA INFILE '/path/to/datafile' REPLACE INTO TABLE <table_name> FIELDS TERMINATED BY 't' OPTIONALLY ENCLOSED BY '"' LINES STARTING BY '<table_name>t' ; -42-/43
  • 43. Questions ? Thank you for coming! • References • http://www.mysqlperformanceblog.com/ • http://percona.com/ • http://www.slideshare.net/akuzminsky -43-/43