SlideShare una empresa de Scribd logo
1 de 28
[object Object],[object Object]
[object Object],[object Object]
Точка зрения разработчика, не DBA на: ,[object Object]
InnoDB threads
InnoDB data structures and algorithms
[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]
[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]
[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]
[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]
[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]
[object Object],[object Object]
A file/partition is a collection of segments.
A segment consists of fixed-length pages.
The page size is always 16KB in uncompressed tablespaces, and 1KB-16KB in compressed tablespaces (for both data and index).
[object Object],[object Object]
Undo
Insert Buffer
Doublewrite Buffer
MySQL Replication Info
[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]
[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]
Redo Logging ,[object Object]
Redo log record format:  <space id, page no, operation code, data>
An example of a redo log record:
<0, 1234, insert, after record at offset 5444, ‘(25, 'heikki', …)'>
'Physiological' logging, per-page
Redo Logging (continued) ,[object Object]
-  'Reorganize page 1234'

Más contenido relacionado

La actualidad más candente

Working with the IFS on System i
Working with the IFS on System iWorking with the IFS on System i
Working with the IFS on System i
Chuck Walker
 
C programming session 14
C programming session 14C programming session 14
C programming session 14
Vivek Singh
 
file handling c++
file handling c++file handling c++
file handling c++
Guddu Spy
 

La actualidad más candente (20)

Apps1
Apps1Apps1
Apps1
 
Changing platforms of Oracle database
Changing platforms of Oracle databaseChanging platforms of Oracle database
Changing platforms of Oracle database
 
Working with the IFS on System i
Working with the IFS on System iWorking with the IFS on System i
Working with the IFS on System i
 
Recipe 5 of Data Warehouse and Business Intelligence - The null values manage...
Recipe 5 of Data Warehouse and Business Intelligence - The null values manage...Recipe 5 of Data Warehouse and Business Intelligence - The null values manage...
Recipe 5 of Data Warehouse and Business Intelligence - The null values manage...
 
C programming session 14
C programming session 14C programming session 14
C programming session 14
 
Recipes 6 of Data Warehouse and Business Intelligence - Naming convention tec...
Recipes 6 of Data Warehouse and Business Intelligence - Naming convention tec...Recipes 6 of Data Warehouse and Business Intelligence - Naming convention tec...
Recipes 6 of Data Warehouse and Business Intelligence - Naming convention tec...
 
Linker scripts
Linker scriptsLinker scripts
Linker scripts
 
Data Warehouse and Business Intelligence - Recipe 4 - Staging area - how to v...
Data Warehouse and Business Intelligence - Recipe 4 - Staging area - how to v...Data Warehouse and Business Intelligence - Recipe 4 - Staging area - how to v...
Data Warehouse and Business Intelligence - Recipe 4 - Staging area - how to v...
 
Data Warehouse and Business Intelligence - Recipe 1
Data Warehouse and Business Intelligence - Recipe 1Data Warehouse and Business Intelligence - Recipe 1
Data Warehouse and Business Intelligence - Recipe 1
 
06 - ELF format, knowing your friend
06 - ELF format, knowing your friend06 - ELF format, knowing your friend
06 - ELF format, knowing your friend
 
Stream classes in C++
Stream classes in C++Stream classes in C++
Stream classes in C++
 
Oracle notes
Oracle notesOracle notes
Oracle notes
 
file handling c++
file handling c++file handling c++
file handling c++
 
New T-SQL Features in SQL Server 2012
New T-SQL Features in SQL Server 2012 New T-SQL Features in SQL Server 2012
New T-SQL Features in SQL Server 2012
 
Recipe 14 - Build a Staging Area for an Oracle Data Warehouse (2)
Recipe 14 - Build a Staging Area for an Oracle Data Warehouse (2)Recipe 14 - Build a Staging Area for an Oracle Data Warehouse (2)
Recipe 14 - Build a Staging Area for an Oracle Data Warehouse (2)
 
2CPP17 - File IO
2CPP17 - File IO2CPP17 - File IO
2CPP17 - File IO
 
Testing File
Testing FileTesting File
Testing File
 
ELF
ELFELF
ELF
 
MySQL Overview
MySQL OverviewMySQL Overview
MySQL Overview
 
Filehadnling
FilehadnlingFilehadnling
Filehadnling
 

Similar a InnoDB: архитектура транзакционного хранилища (Константин Осипов)

jacobs_tuuri_performance
jacobs_tuuri_performancejacobs_tuuri_performance
jacobs_tuuri_performance
Hiroshi Ono
 
The InnoDB Storage Engine for MySQL
The InnoDB Storage Engine for MySQLThe InnoDB Storage Engine for MySQL
The InnoDB Storage Engine for MySQL
Morgan Tocker
 
Myth busters - performance tuning 101 2007
Myth busters - performance tuning 101 2007Myth busters - performance tuning 101 2007
Myth busters - performance tuning 101 2007
paulguerin
 

Similar a InnoDB: архитектура транзакционного хранилища (Константин Осипов) (20)

15 Ways to Kill Your Mysql Application Performance
15 Ways to Kill Your Mysql Application Performance15 Ways to Kill Your Mysql Application Performance
15 Ways to Kill Your Mysql Application Performance
 
Handling Database Deployments
Handling Database DeploymentsHandling Database Deployments
Handling Database Deployments
 
Inno db 5_7_features
Inno db 5_7_featuresInno db 5_7_features
Inno db 5_7_features
 
jacobs_tuuri_performance
jacobs_tuuri_performancejacobs_tuuri_performance
jacobs_tuuri_performance
 
The InnoDB Storage Engine for MySQL
The InnoDB Storage Engine for MySQLThe InnoDB Storage Engine for MySQL
The InnoDB Storage Engine for MySQL
 
Monitoreo de MySQL y PostgreSQL con SQL
Monitoreo de MySQL y PostgreSQL con SQLMonitoreo de MySQL y PostgreSQL con SQL
Monitoreo de MySQL y PostgreSQL con SQL
 
Inside sql server in memory oltp sql sat nyc 2017
Inside sql server in memory oltp sql sat nyc 2017Inside sql server in memory oltp sql sat nyc 2017
Inside sql server in memory oltp sql sat nyc 2017
 
cPanelCon 2014: InnoDB Anatomy
cPanelCon 2014: InnoDB AnatomycPanelCon 2014: InnoDB Anatomy
cPanelCon 2014: InnoDB Anatomy
 
Dbms important questions and answers
Dbms important questions and answersDbms important questions and answers
Dbms important questions and answers
 
12c Database new features
12c Database new features12c Database new features
12c Database new features
 
Maryna Popova "Deep dive AWS Redshift"
Maryna Popova "Deep dive AWS Redshift"Maryna Popova "Deep dive AWS Redshift"
Maryna Popova "Deep dive AWS Redshift"
 
Sqllite
SqlliteSqllite
Sqllite
 
Ibm db2 case study
Ibm db2 case studyIbm db2 case study
Ibm db2 case study
 
User Group3009
User Group3009User Group3009
User Group3009
 
[db tech showcase Tokyo 2017] C23: Lessons from SQLite4 by SQLite.org - Richa...
[db tech showcase Tokyo 2017] C23: Lessons from SQLite4 by SQLite.org - Richa...[db tech showcase Tokyo 2017] C23: Lessons from SQLite4 by SQLite.org - Richa...
[db tech showcase Tokyo 2017] C23: Lessons from SQLite4 by SQLite.org - Richa...
 
Optimizing MongoDB: Lessons Learned at Localytics
Optimizing MongoDB: Lessons Learned at LocalyticsOptimizing MongoDB: Lessons Learned at Localytics
Optimizing MongoDB: Lessons Learned at Localytics
 
Readme
ReadmeReadme
Readme
 
Myth busters - performance tuning 101 2007
Myth busters - performance tuning 101 2007Myth busters - performance tuning 101 2007
Myth busters - performance tuning 101 2007
 
Bank Question Solution-ADBA Previous Year Question for AP, ANE, AME, ADA, AE
Bank Question Solution-ADBA Previous Year Question for AP, ANE, AME, ADA, AEBank Question Solution-ADBA Previous Year Question for AP, ANE, AME, ADA, AE
Bank Question Solution-ADBA Previous Year Question for AP, ANE, AME, ADA, AE
 
Accessing data through hibernate: what DBAs should tell to developers and vic...
Accessing data through hibernate: what DBAs should tell to developers and vic...Accessing data through hibernate: what DBAs should tell to developers and vic...
Accessing data through hibernate: what DBAs should tell to developers and vic...
 

Más de Ontico

Готовим тестовое окружение, или сколько тестовых инстансов вам нужно / Алекса...
Готовим тестовое окружение, или сколько тестовых инстансов вам нужно / Алекса...Готовим тестовое окружение, или сколько тестовых инстансов вам нужно / Алекса...
Готовим тестовое окружение, или сколько тестовых инстансов вам нужно / Алекса...
Ontico
 

Más de Ontico (20)

One-cloud — система управления дата-центром в Одноклассниках / Олег Анастасье...
One-cloud — система управления дата-центром в Одноклассниках / Олег Анастасье...One-cloud — система управления дата-центром в Одноклассниках / Олег Анастасье...
One-cloud — система управления дата-центром в Одноклассниках / Олег Анастасье...
 
Масштабируя DNS / Артем Гавриченков (Qrator Labs)
Масштабируя DNS / Артем Гавриченков (Qrator Labs)Масштабируя DNS / Артем Гавриченков (Qrator Labs)
Масштабируя DNS / Артем Гавриченков (Qrator Labs)
 
Создание BigData-платформы для ФГУП Почта России / Андрей Бащенко (Luxoft)
Создание BigData-платформы для ФГУП Почта России / Андрей Бащенко (Luxoft)Создание BigData-платформы для ФГУП Почта России / Андрей Бащенко (Luxoft)
Создание BigData-платформы для ФГУП Почта России / Андрей Бащенко (Luxoft)
 
Готовим тестовое окружение, или сколько тестовых инстансов вам нужно / Алекса...
Готовим тестовое окружение, или сколько тестовых инстансов вам нужно / Алекса...Готовим тестовое окружение, или сколько тестовых инстансов вам нужно / Алекса...
Готовим тестовое окружение, или сколько тестовых инстансов вам нужно / Алекса...
 
Новые технологии репликации данных в PostgreSQL / Александр Алексеев (Postgre...
Новые технологии репликации данных в PostgreSQL / Александр Алексеев (Postgre...Новые технологии репликации данных в PostgreSQL / Александр Алексеев (Postgre...
Новые технологии репликации данных в PostgreSQL / Александр Алексеев (Postgre...
 
PostgreSQL Configuration for Humans / Alvaro Hernandez (OnGres)
PostgreSQL Configuration for Humans / Alvaro Hernandez (OnGres)PostgreSQL Configuration for Humans / Alvaro Hernandez (OnGres)
PostgreSQL Configuration for Humans / Alvaro Hernandez (OnGres)
 
Inexpensive Datamasking for MySQL with ProxySQL — Data Anonymization for Deve...
Inexpensive Datamasking for MySQL with ProxySQL — Data Anonymization for Deve...Inexpensive Datamasking for MySQL with ProxySQL — Data Anonymization for Deve...
Inexpensive Datamasking for MySQL with ProxySQL — Data Anonymization for Deve...
 
Опыт разработки модуля межсетевого экранирования для MySQL / Олег Брославский...
Опыт разработки модуля межсетевого экранирования для MySQL / Олег Брославский...Опыт разработки модуля межсетевого экранирования для MySQL / Олег Брославский...
Опыт разработки модуля межсетевого экранирования для MySQL / Олег Брославский...
 
ProxySQL Use Case Scenarios / Alkin Tezuysal (Percona)
ProxySQL Use Case Scenarios / Alkin Tezuysal (Percona)ProxySQL Use Case Scenarios / Alkin Tezuysal (Percona)
ProxySQL Use Case Scenarios / Alkin Tezuysal (Percona)
 
MySQL Replication — Advanced Features / Петр Зайцев (Percona)
MySQL Replication — Advanced Features / Петр Зайцев (Percona)MySQL Replication — Advanced Features / Петр Зайцев (Percona)
MySQL Replication — Advanced Features / Петр Зайцев (Percona)
 
Внутренний open-source. Как разрабатывать мобильное приложение большим количе...
Внутренний open-source. Как разрабатывать мобильное приложение большим количе...Внутренний open-source. Как разрабатывать мобильное приложение большим количе...
Внутренний open-source. Как разрабатывать мобильное приложение большим количе...
 
Подробно о том, как Causal Consistency реализовано в MongoDB / Михаил Тюленев...
Подробно о том, как Causal Consistency реализовано в MongoDB / Михаил Тюленев...Подробно о том, как Causal Consistency реализовано в MongoDB / Михаил Тюленев...
Подробно о том, как Causal Consistency реализовано в MongoDB / Михаил Тюленев...
 
Балансировка на скорости проводов. Без ASIC, без ограничений. Решения NFWare ...
Балансировка на скорости проводов. Без ASIC, без ограничений. Решения NFWare ...Балансировка на скорости проводов. Без ASIC, без ограничений. Решения NFWare ...
Балансировка на скорости проводов. Без ASIC, без ограничений. Решения NFWare ...
 
Перехват трафика — мифы и реальность / Евгений Усков (Qrator Labs)
Перехват трафика — мифы и реальность / Евгений Усков (Qrator Labs)Перехват трафика — мифы и реальность / Евгений Усков (Qrator Labs)
Перехват трафика — мифы и реальность / Евгений Усков (Qrator Labs)
 
И тогда наверняка вдруг запляшут облака! / Алексей Сушков (ПЕТЕР-СЕРВИС)
И тогда наверняка вдруг запляшут облака! / Алексей Сушков (ПЕТЕР-СЕРВИС)И тогда наверняка вдруг запляшут облака! / Алексей Сушков (ПЕТЕР-СЕРВИС)
И тогда наверняка вдруг запляшут облака! / Алексей Сушков (ПЕТЕР-СЕРВИС)
 
Как мы заставили Druid работать в Одноклассниках / Юрий Невиницин (OK.RU)
Как мы заставили Druid работать в Одноклассниках / Юрий Невиницин (OK.RU)Как мы заставили Druid работать в Одноклассниках / Юрий Невиницин (OK.RU)
Как мы заставили Druid работать в Одноклассниках / Юрий Невиницин (OK.RU)
 
Разгоняем ASP.NET Core / Илья Вербицкий (WebStoating s.r.o.)
Разгоняем ASP.NET Core / Илья Вербицкий (WebStoating s.r.o.)Разгоняем ASP.NET Core / Илья Вербицкий (WebStoating s.r.o.)
Разгоняем ASP.NET Core / Илья Вербицкий (WebStoating s.r.o.)
 
100500 способов кэширования в Oracle Database или как достичь максимальной ск...
100500 способов кэширования в Oracle Database или как достичь максимальной ск...100500 способов кэширования в Oracle Database или как достичь максимальной ск...
100500 способов кэширования в Oracle Database или как достичь максимальной ск...
 
Apache Ignite Persistence: зачем Persistence для In-Memory, и как он работает...
Apache Ignite Persistence: зачем Persistence для In-Memory, и как он работает...Apache Ignite Persistence: зачем Persistence для In-Memory, и как он работает...
Apache Ignite Persistence: зачем Persistence для In-Memory, и как он работает...
 
Механизмы мониторинга баз данных: взгляд изнутри / Дмитрий Еманов (Firebird P...
Механизмы мониторинга баз данных: взгляд изнутри / Дмитрий Еманов (Firebird P...Механизмы мониторинга баз данных: взгляд изнутри / Дмитрий Еманов (Firebird P...
Механизмы мониторинга баз данных: взгляд изнутри / Дмитрий Еманов (Firebird P...
 

Último

Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Safe Software
 
Architecting Cloud Native Applications
Architecting Cloud Native ApplicationsArchitecting Cloud Native Applications
Architecting Cloud Native Applications
WSO2
 

Último (20)

GenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdfGenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdf
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
Apidays Singapore 2024 - Modernizing Securities Finance by Madhu Subbu
Apidays Singapore 2024 - Modernizing Securities Finance by Madhu SubbuApidays Singapore 2024 - Modernizing Securities Finance by Madhu Subbu
Apidays Singapore 2024 - Modernizing Securities Finance by Madhu Subbu
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
 
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodPolkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...
 
Apidays Singapore 2024 - Scalable LLM APIs for AI and Generative AI Applicati...
Apidays Singapore 2024 - Scalable LLM APIs for AI and Generative AI Applicati...Apidays Singapore 2024 - Scalable LLM APIs for AI and Generative AI Applicati...
Apidays Singapore 2024 - Scalable LLM APIs for AI and Generative AI Applicati...
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
 
Architecting Cloud Native Applications
Architecting Cloud Native ApplicationsArchitecting Cloud Native Applications
Architecting Cloud Native Applications
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024
 
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
 
Artificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyArtificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : Uncertainty
 
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
 
FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024
 

InnoDB: архитектура транзакционного хранилища (Константин Осипов)

  • 1.
  • 2.
  • 3.
  • 5. InnoDB data structures and algorithms
  • 6.
  • 7.
  • 8.
  • 9.
  • 10.
  • 11.
  • 12. A file/partition is a collection of segments.
  • 13. A segment consists of fixed-length pages.
  • 14. The page size is always 16KB in uncompressed tablespaces, and 1KB-16KB in compressed tablespaces (for both data and index).
  • 15.
  • 16. Undo
  • 20.
  • 21.
  • 22.
  • 23. Redo log record format: <space id, page no, operation code, data>
  • 24. An example of a redo log record:
  • 25. <0, 1234, insert, after record at offset 5444, ‘(25, 'heikki', …)'>
  • 27.
  • 28. - 'Reorganize page 1234'
  • 29. - 'Delete all records on page 1234 after position 6543'
  • 30. - ...
  • 31.
  • 37.
  • 38. index
  • 39. undo
  • 40.
  • 41.
  • 42.
  • 43.
  • 44.
  • 45.
  • 46.
  • 47. Unique to InnoDB, saves random writes, improves insert speed
  • 48.
  • 49. as much as 7.2x faster than the theoretical rate of inserts in a &quot;normal&quot; DBMS clustered (primary key) index PK values 001 - nnn B-tree leaf nodes, containing data key values A Z B-tree leaf nodes, containing PKs Secondary index key values A Z B-tree leaf nodes, containing PKs Clustered index Secondary index
  • 50.
  • 51. if a user has IX lock on table T, no other user can take S lock on T
  • 52. if a user has an S lock on table T, no other user can take X locks on rows in T ? IS is compatible with S, but not with X. Thus, if a user has row share locks, no other user can lock the table in X mode S LOCKED! S LOCKED! S LOCKED! A B IS lock
  • 53.
  • 54. Statement-Based Replication Relies on No Phantoms MySQL binlog A’s DELETE B’s INSERT 5. The MySQL binlog contain B's transaction before A's We do not know if A deleted the row that B inserted! User B BEGIN; INSERT INTO t VALUES (10); 2. Before A commits, user B inserts the same row slave may get out-of- sync with the master! User A BEGIN; DELETE FROM t WHERE a = 10; 1. User A deletes a row User B BEGIN; INSERT INTO t VALUES (10); COMMIT; 3. User B commits before A User A BEGIN; DELETE FROM t WHERE a = 10; COMMIT; 4. User A commits
  • 55. Row-Based Replication Needs Less Locking MySQL binlog A’s DELETE B’s INSERT 5. The MySQL binlog contain B's transaction before A's Row-based binlog contains information of each individual row that was deleted or inserted => phantoms are no longer a problem! User B BEGIN; INSERT INTO t VALUES (10); 2. Before A commits, user B inserts the same row User A BEGIN; DELETE FROM t WHERE a = 10; 1. User A deletes a row User B BEGIN; INSERT INTO t VALUES (10); COMMIT; 3. User B commits before A User A BEGIN; DELETE FROM t WHERE a = 10; COMMIT; 4. User A commits
  • 56.
  • 57. Types of Gap Locking in InnoDB InnoDB minimizes gap locking by using record-only locks in UNIQUE searches UPDATE t SET a = a + 1 WHERE primary_key_value = 100; Gap lock locks just the gap before the key Record-only lock locks just the key and not the gap Insert-intention gap lock held when waiting to insert into a gap Next-key lock locks the key & the gap before the key
  • 58.
  • 59. UPDATE, DELETE use next-key locking
  • 60.
  • 61. No 'consistent' reads; all SELECTs return the very latest state of the database
  • 62. Downside: lots of locking, lots of deadlocks. REPEATABLE READ
  • 63.
  • 64. Data is “up to date”, but multiple SELECTs may be inconsistent with one another
  • 65. In V5.1, most gap-locking is removed w/ this level, but you MUST use row-based logging/replication
  • 66. Fewer gap locks mean fewer deadlocks
  • 67.
  • 68. Use innodb_locks_unsafe_for_binlog to remove gap locking in MySQL-5.0 and earlier
  • 69.

Notas del editor

  1. Everybody knows Heikki. Introduce mysqlf
  2. The following 32 pages are allocated individually (from the fragmented extent space); after that, full 64 page extents are allocated Using multiple tablespaces can be beneficial to users who want to move specific tables to separate physical disks or who wish to restore backups of single tables quickly without interrupting the use of the remaining InnoDB tables.
  3. Also overflow page pointers Whether any columns are stored off-page depends on the page size and the total size of the row. When the row is too long to fit entirely within the page of the clustered index, InnoDB will choose the longest columns for off-page storage until the row fits on the clustered index page.
  4. COMPACT mode always stores up to 768-byte prefix of such columns in the clustered index page New DYNAMIC mode stores long columns entirely “off-page”, with only a 20-byte prefix in the clustered index page If row does not fit in clustered index page, some long BLOB or VARCHAR column(s) may be stored on an overflow page the REDUNDANT format is available to retain compatibility with older versions of MySQL
  5. If you do not define a PRIMARY KEY for your table, InnoDB uses the first UNIQUE index that has only NOT NULL columns as the clustered index. If there is no such index in the table, InnoDB internally generates a clustered index where the rows are ordered by the row ID that InnoDB assigns to the rows in such a table. The row ID is a 6-byte field that increases monotonically as new rows are inserted. Thus, the rows ordered by the row ID are physically in insertion order. All InnoDB indexes are B-trees where the index records are stored in the leaf pages of the tree. The default size of an index page is 16KB. When new records are inserted, InnoDB tries to leave 1/16 of the page free for future insertions and updates of the index records.
  6. If you specify the autoextend option for the last data file, InnoDB extends the data file if it runs out of free space in the tablespace. The increment is 8MB at a time by default. It can be modified by changing the innodb_autoextend_increment system variable. You can use raw disk partitions as data files in the shared tablespace. Remember that only the last data file in the innodb_data_file_path can be specified as auto-extending.
  7. Note : InnoDB always needs the shared tablespace because it puts its internal data dictionary and undo logs there. To support recovery and unique features.
  8. Combination of physical (disk address) and logical (field content) logging -- Physiological Logging Logging is used for durability at crash recovery InnoDB keeps two logs, the redo log and the undo log. The redo log is for re-doing data changes that had not been written to disk when a crash occurred. The undo log is primarily for removing data changes that had been written to disk when a crash occurred, but should not have been written, because they were for uncommitted transactions. The undo log is inside the tablespace. The undo log is primarily for removing data changes that had been written to disk when a crash occurred, but should not have been written, because they were for uncommitted transactions . The undo log is inside the tablespace. The &amp;quot;insert&amp;quot; section of the undo log is needed only for transaction rollback and can be discarded at COMMIT time. The &amp;quot;update/delete&amp;quot; section of the undo log is also useful for consistent reads, and can be discarded when InnoDB has ended all transactions that might need the undo log records to reconstruct earlier versions of rows. An undo log record&apos;s contents are: Primary Key Value (not a page number or physical address), Old Transaction ID (of the transaction that updated the row), and the changes (only old values). COMMIT will write the contents of the log buffer to disk, and put undo log records in a history list. ROLLBACK will delete undo log records that are no longer needed. PURGE (an internal operation that occurs outside user control) will no-longer-necessary undo log records and, for data records that have been marked for deletion and are no longer necessary for consistent read, will remove the records.
  9. There is one redo log for the entire workspace, it contains multiple files, it is circular. The file header includes the last successful checkpoint. A redo log record&apos;s contents are: Space id, Page Number (4 bytes = page number within tablespace), Offset of change within page (2 bytes), Log Record Type (insert, update, delete, &amp;quot;fill space with blanks&amp;quot;, etc.), and the changes on that page ( only after images , not before images).
  10. Primary goal: make it faster! Run-time --- Faster! Foreground threads from MySQL server Memory: log buffer (REDO records), Buffer pool (data pages; index pages; undo records; adaptive hash indexes; table of lock info), and additional memory pool(cached data dictionary; open table handles)
  11. Typically two segments for each index (non-leaf index segment and leaf index segment) One segment for very small indexes only Each secondary index has its own pair of segments