SlideShare una empresa de Scribd logo
1 de 57
Descargar para leer sin conexión
Fractal Tree®
Technology Overview
The Art of Indexing
Martín Farach-Colton
Co-founder & Chief Technology Officer
The Art of Indexing
®
Not all indexing is the same
B-tree is the basis for almost all DB systems
• Data structure invented in 1972
• Has not kept up with hardware trends
‣ Works poorly on modern rotational disks
‣ Works poorly on SSD
Fractal Tree Indexes is the basis of TokuDB
• Scales with hardware
• Fast Indexing ➔ More Indexing ➔ Faster Queries
• Great Compression
• No Fragmentation
• Reduced wear on SSDs
How do Fractal Tree
Indexes outperform
B-trees?
How do Fractal Tree
Indexes outperform
B-trees?
First, some facts about storage systems
The Art of Indexing
®
Storage is quirky
Hard disks are slow for
random I/O but fast for
sequential I/O
Difference causes problems
like fragmentation, ...
The Art of Indexing
®
Storage is quirky
Hard disks are slow for
random I/O but fast for
sequential I/O
SSDs are fast for
random I/O but expensive for
sequential.
Garbage collection
causes artefacts: increased
wear, write cliffs...
Difference causes problems
like fragmentation, ...
The Art of Indexing
®
Storage ♡ Big Reads and Writes
Big I/Os
The Art of Indexing
®
Storage ♡ Big Reads and Writes
Big I/Os
Hard Disks: No
Fragmentation
The Art of Indexing
®
Storage ♡ Big Reads and Writes
Big I/Os
Hard Disks: No
Fragmentation SSDs: Less Garbage
Collection & Wear
The Art of Indexing
®
Storage ♡ Big Reads and Writes
Big I/Os
Hard Disks: No
Fragmentation SSDs: Less Garbage
Collection & Wear
But you only get the goodies if
each I/O has lots of new bytes
The Art of Indexing
®
Storage ♡ Big Reads and Writes
Big I/Os
Hard Disks: No
Fragmentation SSDs: Less Garbage
Collection & Wear
But you only get the goodies if
each I/O has lots of new bytes
If you read a big block, change one byte, then
write it, you get terrible wear problems
The Art of Indexing
®
Storage ♡ Big Reads and Writes
Big I/Os
Hard Disks: No
Fragmentation SSDs: Less Garbage
Collection & Wear
Both: Better
Compression
But you only get the goodies if
each I/O has lots of new bytes
If you read a big block, change one byte, then
write it, you get terrible wear problems
The Art of Indexing
®
Caching is great
• But you can’t cache all your data
• For stuff not in memory, you have to go to disk
Goal: Do the best we can for the stuff on disk
Data is big, RAM is Small
Data
RAM
Now, What’s a B-tree?
& a Fractal Tree Index
The Art of Indexing
®
What’s a B-tree?
B
B B B... ...
B B……………………………………………….………….
Search Tree
The Art of Indexing
®
What’s a B-tree?
B
B B B... ...
B B……………………………………………….………….
Each node has B pivots
Search Tree
The Art of Indexing
®
What’s a B-tree?
B
B B B... ...
B B……………………………………………….………….
Each node has B pivots
Search Tree
Fetching pivots as a group
saves I/Os
The Art of Indexing
®
What’s a B-tree?
B
B B B... ...
B B……………………………………………….………….
Each node has B pivots
Search Tree
Fetching pivots as a group
saves I/Os
Most leaves are on disk, not in RAM
The Art of Indexing
®
What’s a B-tree?
B
B B B... ...
B B……………………………………………….………….
Each node has B pivots
Search Tree
Fetching pivots as a group
saves I/Os
Most leaves are on disk, not in RAM Inserting into a leaf that’s not in memory
will require an I/O per insertion
The Art of Indexing
®
B-tree Delivery Service
If fast memory is like walking across a room
• Each update in a B-tree is a walking trip from
‣ New York
The Art of Indexing
®
B-tree Delivery Service
If fast memory is like walking across a room
• Each update in a B-tree is a walking trip from
‣ New York to St Louis
The Art of Indexing
®
B-tree Delivery Service
If fast memory is like walking across a room
• Each update in a B-tree is a walking trip from
‣ New York to St Louis
Each item gets its own round trip!
The Art of Indexing
®
Real-world delivery
Keep regional warehouses
• Only move stuff when you can move a lot
The Art of Indexing
®
Real-world delivery
Keep regional warehouses
• Only move stuff when you can move a lot
Each item gets moved several times
The Art of Indexing
®
Real-world delivery
Keep regional warehouses
• Only move stuff when you can move a lot
Each item gets moved several times
But each trip is vastly cheaper
The Art of Indexing
®
Fractal Tree Indexes
B
B B BB
Each node has
pivots & Buffers
The Art of Indexing
®
Fractal Tree Indexes
B
B B BB
Each node has
pivots & Buffers
The Art of Indexing
®
Fractal Tree Indexes
B
B B BB
Each node has
pivots & Buffers
The Art of Indexing
®
Fractal Tree Indexes
B
B B BB
Each node has
pivots & Buffers
Buffers fill as
updates arrive
The Art of Indexing
®
Fractal Tree Indexes
B
B B BB
Each node has
pivots & Buffers
Buffers fill as
updates arrive
The Art of Indexing
®
Fractal Tree Indexes
B
B B BB
Each node has
pivots & Buffers
Buffers fill as
updates arrive
The Art of Indexing
®
Fractal Tree Indexes
B
B B BB
Each node has
pivots & Buffers
Buffers fill as
updates arrive
The Art of Indexing
®
Fractal Tree Indexes
B
B B BB
Each node has
pivots & Buffers
Buffers fill as
updates arrive
The Art of Indexing
®
Fractal Tree Indexes
B
B B BB
Each node has
pivots & Buffers
Buffers fill as
updates arrive
The Art of Indexing
®
Fractal Tree Indexes
B
B B BB
Each node has
pivots & Buffers
Buffers fill as
updates arrive
The Art of Indexing
®
Fractal Tree Indexes
B
B B BB
Each node has
pivots & Buffers
Buffers fill as
updates arrive
The Art of Indexing
®
Fractal Tree Indexes
B
B B BB
Each node has
pivots & Buffers
Buffers fill as
updates arrive
The Art of Indexing
®
Fractal Tree Indexes
B
B B BB
Each node has
pivots & Buffers
Buffers fill as
updates arrive
The Art of Indexing
®
Fractal Tree Indexes
B
B B BB
Each node has
pivots & Buffers
Buffers fill as
updates arrive
The Art of Indexing
®
Fractal Tree Indexes
B
B B BB
Each node has
pivots & Buffers
Buffers fill as
updates arrive
The Art of Indexing
®
Fractal Tree Indexes
B
B B BB
Each node has
pivots & Buffers
Buffers fill as
updates arrive
The Art of Indexing
®
Fractal Tree Indexes
B
B B BB
Each node has
pivots & Buffers
Buffers fill as
updates arrive
The Art of Indexing
®
Fractal Tree Indexes
B
B B BB
Each node has
pivots & Buffers
Buffers fill as
updates arrive
The Art of Indexing
®
Fractal Tree Indexes
B
B B BB
Each node has
pivots & Buffers
Buffers fill as
updates arrive
The Art of Indexing
®
Fractal Tree Indexes
B
B B BB
Each node has
pivots & Buffers
Buffers fill as
updates arrive
Flush a buffer
when it fills
The Art of Indexing
®
Fractal Tree Indexes
B
B B BB
Each node has
pivots & Buffers
Buffers fill as
updates arrive
Flush a buffer
when it fills
A flush might take an I/O, but it
does lots of useful work
The Art of Indexing
®
Fractal Tree Indexes
B
B B BB
Each node has
pivots & Buffers
Buffers fill as
updates arrive
Flush a buffer
when it fills
A flush might take an I/O, but it
does lots of useful work
More changes per write ➔
fewer changes for same write
load ➔ less SSD wear
The Art of Indexing
®
Fractal Tree Indexes: Queries
B
B B BB
Lots of buffers have
messages
The Art of Indexing
®
Fractal Tree Indexes: Queries
B
B B BB
Lots of buffers have
messages
But query follows
root-leaf path
The Art of Indexing
®
Fractal Tree Indexes: Queries
B
B B BB
Lots of buffers have
messages
But query follows
root-leaf path
So every query has the
most up-to-date information
The Art of Indexing
®
Fractal Tree Indexes: Queries
B
B B BB
Lots of buffers have
messages
But query follows
root-leaf path
So every query has the
most up-to-date information
Messages can be
insert, update, delete
The Art of Indexing
®
Fractal Tree Indexes: Schema Changes
B
B B BB
Schema Changes are
broadcast messages
The Art of Indexing
®
Fractal Tree Indexes: Schema Changes
B
B B BB
Schema Changes are
broadcast messages
The Art of Indexing
®
Fractal Tree Indexes: Schema Changes
B
B B BB
Schema Changes are
broadcast messages
The Art of Indexing
®
Fractal Tree Indexes: Schema Changes
B
B B BB
Schema Changes are
broadcast messages
Insertion into root is fast,
changes are immediate
The Art of Indexing
®
Fractal Tree Indexes: Schema Changes
B
B B BB
Schema Changes are
broadcast messages
Leaves get rewritten
when broadcast message
makes it that far
Insertion into root is fast,
changes are immediate
Schema change
message still on root-
leaf path
The Art of Indexing
®
Analysis
Delivery system gives goodies:
• Messages get moved, but each I/O pays for a lot of
movement
• You get very fast inserts
• You get Hot Schema Changes
Each flush carries lots of useful information
• So it’s worth it to make nodes big
• No fragmentation, Better Compression
• Much better wear on SSDs

Más contenido relacionado

Similar a Fractal tree-technology-and-the-art-of-indexing

2011 03-31 Riak Stockholm Meetup
2011 03-31 Riak Stockholm Meetup2011 03-31 Riak Stockholm Meetup
2011 03-31 Riak Stockholm MeetupMårten Gustafson
 
Perl DBI Scripting with the ILS
Perl DBI Scripting with the ILSPerl DBI Scripting with the ILS
Perl DBI Scripting with the ILSRoy Zimmer
 
PASS Summit 2009 Keynote Dave DeWitt
PASS Summit 2009 Keynote Dave DeWittPASS Summit 2009 Keynote Dave DeWitt
PASS Summit 2009 Keynote Dave DeWittMark Ginnebaugh
 
Hive Bucketing in Apache Spark
Hive Bucketing in Apache SparkHive Bucketing in Apache Spark
Hive Bucketing in Apache SparkTejas Patil
 
MongoDB and Ecommerce : A perfect combination
MongoDB and Ecommerce : A perfect combinationMongoDB and Ecommerce : A perfect combination
MongoDB and Ecommerce : A perfect combinationSteven Francia
 
The right read optimization is actually write optimization
The right read optimization is actually write optimizationThe right read optimization is actually write optimization
The right read optimization is actually write optimizationjames tong
 
Modelling complex game economy with a graph database
Modelling complex game economy with a graph databaseModelling complex game economy with a graph database
Modelling complex game economy with a graph databaseC4Media
 
Bucketing 2.0: Improve Spark SQL Performance by Removing Shuffle
Bucketing 2.0: Improve Spark SQL Performance by Removing ShuffleBucketing 2.0: Improve Spark SQL Performance by Removing Shuffle
Bucketing 2.0: Improve Spark SQL Performance by Removing ShuffleDatabricks
 

Similar a Fractal tree-technology-and-the-art-of-indexing (10)

2011 03-31 Riak Stockholm Meetup
2011 03-31 Riak Stockholm Meetup2011 03-31 Riak Stockholm Meetup
2011 03-31 Riak Stockholm Meetup
 
Perl DBI Scripting with the ILS
Perl DBI Scripting with the ILSPerl DBI Scripting with the ILS
Perl DBI Scripting with the ILS
 
PASS Summit 2009 Keynote Dave DeWitt
PASS Summit 2009 Keynote Dave DeWittPASS Summit 2009 Keynote Dave DeWitt
PASS Summit 2009 Keynote Dave DeWitt
 
Forget The ORM!
Forget The ORM!Forget The ORM!
Forget The ORM!
 
Hive Bucketing in Apache Spark
Hive Bucketing in Apache SparkHive Bucketing in Apache Spark
Hive Bucketing in Apache Spark
 
MongoDB and Ecommerce : A perfect combination
MongoDB and Ecommerce : A perfect combinationMongoDB and Ecommerce : A perfect combination
MongoDB and Ecommerce : A perfect combination
 
The right read optimization is actually write optimization
The right read optimization is actually write optimizationThe right read optimization is actually write optimization
The right read optimization is actually write optimization
 
Media storage
Media storageMedia storage
Media storage
 
Modelling complex game economy with a graph database
Modelling complex game economy with a graph databaseModelling complex game economy with a graph database
Modelling complex game economy with a graph database
 
Bucketing 2.0: Improve Spark SQL Performance by Removing Shuffle
Bucketing 2.0: Improve Spark SQL Performance by Removing ShuffleBucketing 2.0: Improve Spark SQL Performance by Removing Shuffle
Bucketing 2.0: Improve Spark SQL Performance by Removing Shuffle
 

Más de Atner Yegorov

Здравоохранение Чувашии 2015
Здравоохранение Чувашии 2015Здравоохранение Чувашии 2015
Здравоохранение Чувашии 2015Atner Yegorov
 
Чувашия. Итоги за 5 лет. 2015
Чувашия. Итоги за 5 лет. 2015Чувашия. Итоги за 5 лет. 2015
Чувашия. Итоги за 5 лет. 2015Atner Yegorov
 
Инвестиционная политика Чувашской Республики 2015
Инвестиционная политика Чувашской Республики 2015Инвестиционная политика Чувашской Республики 2015
Инвестиционная политика Чувашской Республики 2015Atner Yegorov
 
Господдержка малого и среднего бизнеса в Чувашии 2015
Господдержка малого и среднего бизнеса в Чувашии 2015Господдержка малого и среднего бизнеса в Чувашии 2015
Господдержка малого и среднего бизнеса в Чувашии 2015Atner Yegorov
 
ИННОВАЦИИ В ПРОМЫШЛЕННОСТИ ЧУВАШИИ. 2015
ИННОВАЦИИ В ПРОМЫШЛЕННОСТИ ЧУВАШИИ. 2015ИННОВАЦИИ В ПРОМЫШЛЕННОСТИ ЧУВАШИИ. 2015
ИННОВАЦИИ В ПРОМЫШЛЕННОСТИ ЧУВАШИИ. 2015Atner Yegorov
 
Брендинг Чувашии
Брендинг ЧувашииБрендинг Чувашии
Брендинг ЧувашииAtner Yegorov
 
VIII Экономический форум Чебоксары
VIII Экономический форум ЧебоксарыVIII Экономический форум Чебоксары
VIII Экономический форум ЧебоксарыAtner Yegorov
 
Рейтин инвестклимата в регионах 2015
Рейтин инвестклимата в регионах 2015Рейтин инвестклимата в регионах 2015
Рейтин инвестклимата в регионах 2015Atner Yegorov
 
Аттракционы тематического парка, Татарстан
Аттракционы тематического парка, ТатарстанАттракционы тематического парка, Татарстан
Аттракционы тематического парка, ТатарстанAtner Yegorov
 
Тематический парк, Татарстан
Тематический парк, ТатарстанТематический парк, Татарстан
Тематический парк, ТатарстанAtner Yegorov
 
Будущее журналистики
Будущее журналистикиБудущее журналистики
Будущее журналистикиAtner Yegorov
 
Будущее нишевых СМИ
Будущее нишевых СМИБудущее нишевых СМИ
Будущее нишевых СМИAtner Yegorov
 
Будущее онлайн СМИ в регионах
Будущее онлайн СМИ в регионах Будущее онлайн СМИ в регионах
Будущее онлайн СМИ в регионах Atner Yegorov
 
Война форматов. Как люди читают новости
Война форматов. Как люди читают новости Война форматов. Как люди читают новости
Война форматов. Как люди читают новости Atner Yegorov
 
Национальная технологическая инициатива
Национальная технологическая инициативаНациональная технологическая инициатива
Национальная технологическая инициативаAtner Yegorov
 
О РАЗРАБОТКЕ И РЕАЛИЗАЦИИ НАЦИОНАЛЬНОЙ ТЕХНОЛОГИЧЕСКОЙ ИНИЦИАТИВЫ
О РАЗРАБОТКЕ И РЕАЛИЗАЦИИ НАЦИОНАЛЬНОЙ  ТЕХНОЛОГИЧЕСКОЙ ИНИЦИАТИВЫ О РАЗРАБОТКЕ И РЕАЛИЗАЦИИ НАЦИОНАЛЬНОЙ  ТЕХНОЛОГИЧЕСКОЙ ИНИЦИАТИВЫ
О РАЗРАБОТКЕ И РЕАЛИЗАЦИИ НАЦИОНАЛЬНОЙ ТЕХНОЛОГИЧЕСКОЙ ИНИЦИАТИВЫ Atner Yegorov
 
Участники заседания по модернизации
Участники заседания по модернизацииУчастники заседания по модернизации
Участники заседания по модернизацииAtner Yegorov
 
Город Innopolis
Город InnopolisГород Innopolis
Город InnopolisAtner Yegorov
 
презентация1
презентация1презентация1
презентация1Atner Yegorov
 

Más de Atner Yegorov (20)

Здравоохранение Чувашии 2015
Здравоохранение Чувашии 2015Здравоохранение Чувашии 2015
Здравоохранение Чувашии 2015
 
Чувашия. Итоги за 5 лет. 2015
Чувашия. Итоги за 5 лет. 2015Чувашия. Итоги за 5 лет. 2015
Чувашия. Итоги за 5 лет. 2015
 
Инвестиционная политика Чувашской Республики 2015
Инвестиционная политика Чувашской Республики 2015Инвестиционная политика Чувашской Республики 2015
Инвестиционная политика Чувашской Республики 2015
 
Господдержка малого и среднего бизнеса в Чувашии 2015
Господдержка малого и среднего бизнеса в Чувашии 2015Господдержка малого и среднего бизнеса в Чувашии 2015
Господдержка малого и среднего бизнеса в Чувашии 2015
 
ИННОВАЦИИ В ПРОМЫШЛЕННОСТИ ЧУВАШИИ. 2015
ИННОВАЦИИ В ПРОМЫШЛЕННОСТИ ЧУВАШИИ. 2015ИННОВАЦИИ В ПРОМЫШЛЕННОСТИ ЧУВАШИИ. 2015
ИННОВАЦИИ В ПРОМЫШЛЕННОСТИ ЧУВАШИИ. 2015
 
Брендинг Чувашии
Брендинг ЧувашииБрендинг Чувашии
Брендинг Чувашии
 
VIII Экономический форум Чебоксары
VIII Экономический форум ЧебоксарыVIII Экономический форум Чебоксары
VIII Экономический форум Чебоксары
 
Рейтин инвестклимата в регионах 2015
Рейтин инвестклимата в регионах 2015Рейтин инвестклимата в регионах 2015
Рейтин инвестклимата в регионах 2015
 
Аттракционы тематического парка, Татарстан
Аттракционы тематического парка, ТатарстанАттракционы тематического парка, Татарстан
Аттракционы тематического парка, Татарстан
 
Тематический парк, Татарстан
Тематический парк, ТатарстанТематический парк, Татарстан
Тематический парк, Татарстан
 
Будущее журналистики
Будущее журналистикиБудущее журналистики
Будущее журналистики
 
Будущее нишевых СМИ
Будущее нишевых СМИБудущее нишевых СМИ
Будущее нишевых СМИ
 
Relap
RelapRelap
Relap
 
Будущее онлайн СМИ в регионах
Будущее онлайн СМИ в регионах Будущее онлайн СМИ в регионах
Будущее онлайн СМИ в регионах
 
Война форматов. Как люди читают новости
Война форматов. Как люди читают новости Война форматов. Как люди читают новости
Война форматов. Как люди читают новости
 
Национальная технологическая инициатива
Национальная технологическая инициативаНациональная технологическая инициатива
Национальная технологическая инициатива
 
О РАЗРАБОТКЕ И РЕАЛИЗАЦИИ НАЦИОНАЛЬНОЙ ТЕХНОЛОГИЧЕСКОЙ ИНИЦИАТИВЫ
О РАЗРАБОТКЕ И РЕАЛИЗАЦИИ НАЦИОНАЛЬНОЙ  ТЕХНОЛОГИЧЕСКОЙ ИНИЦИАТИВЫ О РАЗРАБОТКЕ И РЕАЛИЗАЦИИ НАЦИОНАЛЬНОЙ  ТЕХНОЛОГИЧЕСКОЙ ИНИЦИАТИВЫ
О РАЗРАБОТКЕ И РЕАЛИЗАЦИИ НАЦИОНАЛЬНОЙ ТЕХНОЛОГИЧЕСКОЙ ИНИЦИАТИВЫ
 
Участники заседания по модернизации
Участники заседания по модернизацииУчастники заседания по модернизации
Участники заседания по модернизации
 
Город Innopolis
Город InnopolisГород Innopolis
Город Innopolis
 
презентация1
презентация1презентация1
презентация1
 

Último

Understanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitectureUnderstanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitecturePixlogix Infotech
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsEnterprise Knowledge
 
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | DelhiFULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhisoniya singh
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityPrincipled Technologies
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationRadu Cotescu
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)Gabriella Davis
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Igalia
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationRidwan Fadjar
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Servicegiselly40
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonetsnaman860154
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking MenDelhi Call girls
 
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024BookNet Canada
 
Maximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxMaximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxOnBoard
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxMalak Abu Hammad
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreternaman860154
 
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure serviceWhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure servicePooja Nehwal
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Miguel Araújo
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountPuma Security, LLC
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Drew Madelung
 

Último (20)

Understanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitectureUnderstanding the Laravel MVC Architecture
Understanding the Laravel MVC Architecture
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
 
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | DelhiFULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 Presentation
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Service
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonets
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
 
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
 
Maximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxMaximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptx
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptx
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreter
 
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure serviceWhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path Mount
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 

Fractal tree-technology-and-the-art-of-indexing

  • 1. Fractal Tree® Technology Overview The Art of Indexing Martín Farach-Colton Co-founder & Chief Technology Officer
  • 2. The Art of Indexing ® Not all indexing is the same B-tree is the basis for almost all DB systems • Data structure invented in 1972 • Has not kept up with hardware trends ‣ Works poorly on modern rotational disks ‣ Works poorly on SSD Fractal Tree Indexes is the basis of TokuDB • Scales with hardware • Fast Indexing ➔ More Indexing ➔ Faster Queries • Great Compression • No Fragmentation • Reduced wear on SSDs
  • 3. How do Fractal Tree Indexes outperform B-trees?
  • 4. How do Fractal Tree Indexes outperform B-trees? First, some facts about storage systems
  • 5. The Art of Indexing ® Storage is quirky Hard disks are slow for random I/O but fast for sequential I/O Difference causes problems like fragmentation, ...
  • 6. The Art of Indexing ® Storage is quirky Hard disks are slow for random I/O but fast for sequential I/O SSDs are fast for random I/O but expensive for sequential. Garbage collection causes artefacts: increased wear, write cliffs... Difference causes problems like fragmentation, ...
  • 7. The Art of Indexing ® Storage ♡ Big Reads and Writes Big I/Os
  • 8. The Art of Indexing ® Storage ♡ Big Reads and Writes Big I/Os Hard Disks: No Fragmentation
  • 9. The Art of Indexing ® Storage ♡ Big Reads and Writes Big I/Os Hard Disks: No Fragmentation SSDs: Less Garbage Collection & Wear
  • 10. The Art of Indexing ® Storage ♡ Big Reads and Writes Big I/Os Hard Disks: No Fragmentation SSDs: Less Garbage Collection & Wear But you only get the goodies if each I/O has lots of new bytes
  • 11. The Art of Indexing ® Storage ♡ Big Reads and Writes Big I/Os Hard Disks: No Fragmentation SSDs: Less Garbage Collection & Wear But you only get the goodies if each I/O has lots of new bytes If you read a big block, change one byte, then write it, you get terrible wear problems
  • 12. The Art of Indexing ® Storage ♡ Big Reads and Writes Big I/Os Hard Disks: No Fragmentation SSDs: Less Garbage Collection & Wear Both: Better Compression But you only get the goodies if each I/O has lots of new bytes If you read a big block, change one byte, then write it, you get terrible wear problems
  • 13. The Art of Indexing ® Caching is great • But you can’t cache all your data • For stuff not in memory, you have to go to disk Goal: Do the best we can for the stuff on disk Data is big, RAM is Small Data RAM
  • 14. Now, What’s a B-tree? & a Fractal Tree Index
  • 15. The Art of Indexing ® What’s a B-tree? B B B B... ... B B……………………………………………….…………. Search Tree
  • 16. The Art of Indexing ® What’s a B-tree? B B B B... ... B B……………………………………………….…………. Each node has B pivots Search Tree
  • 17. The Art of Indexing ® What’s a B-tree? B B B B... ... B B……………………………………………….…………. Each node has B pivots Search Tree Fetching pivots as a group saves I/Os
  • 18. The Art of Indexing ® What’s a B-tree? B B B B... ... B B……………………………………………….…………. Each node has B pivots Search Tree Fetching pivots as a group saves I/Os Most leaves are on disk, not in RAM
  • 19. The Art of Indexing ® What’s a B-tree? B B B B... ... B B……………………………………………….…………. Each node has B pivots Search Tree Fetching pivots as a group saves I/Os Most leaves are on disk, not in RAM Inserting into a leaf that’s not in memory will require an I/O per insertion
  • 20. The Art of Indexing ® B-tree Delivery Service If fast memory is like walking across a room • Each update in a B-tree is a walking trip from ‣ New York
  • 21. The Art of Indexing ® B-tree Delivery Service If fast memory is like walking across a room • Each update in a B-tree is a walking trip from ‣ New York to St Louis
  • 22. The Art of Indexing ® B-tree Delivery Service If fast memory is like walking across a room • Each update in a B-tree is a walking trip from ‣ New York to St Louis Each item gets its own round trip!
  • 23. The Art of Indexing ® Real-world delivery Keep regional warehouses • Only move stuff when you can move a lot
  • 24. The Art of Indexing ® Real-world delivery Keep regional warehouses • Only move stuff when you can move a lot Each item gets moved several times
  • 25. The Art of Indexing ® Real-world delivery Keep regional warehouses • Only move stuff when you can move a lot Each item gets moved several times But each trip is vastly cheaper
  • 26. The Art of Indexing ® Fractal Tree Indexes B B B BB Each node has pivots & Buffers
  • 27. The Art of Indexing ® Fractal Tree Indexes B B B BB Each node has pivots & Buffers
  • 28. The Art of Indexing ® Fractal Tree Indexes B B B BB Each node has pivots & Buffers
  • 29. The Art of Indexing ® Fractal Tree Indexes B B B BB Each node has pivots & Buffers Buffers fill as updates arrive
  • 30. The Art of Indexing ® Fractal Tree Indexes B B B BB Each node has pivots & Buffers Buffers fill as updates arrive
  • 31. The Art of Indexing ® Fractal Tree Indexes B B B BB Each node has pivots & Buffers Buffers fill as updates arrive
  • 32. The Art of Indexing ® Fractal Tree Indexes B B B BB Each node has pivots & Buffers Buffers fill as updates arrive
  • 33. The Art of Indexing ® Fractal Tree Indexes B B B BB Each node has pivots & Buffers Buffers fill as updates arrive
  • 34. The Art of Indexing ® Fractal Tree Indexes B B B BB Each node has pivots & Buffers Buffers fill as updates arrive
  • 35. The Art of Indexing ® Fractal Tree Indexes B B B BB Each node has pivots & Buffers Buffers fill as updates arrive
  • 36. The Art of Indexing ® Fractal Tree Indexes B B B BB Each node has pivots & Buffers Buffers fill as updates arrive
  • 37. The Art of Indexing ® Fractal Tree Indexes B B B BB Each node has pivots & Buffers Buffers fill as updates arrive
  • 38. The Art of Indexing ® Fractal Tree Indexes B B B BB Each node has pivots & Buffers Buffers fill as updates arrive
  • 39. The Art of Indexing ® Fractal Tree Indexes B B B BB Each node has pivots & Buffers Buffers fill as updates arrive
  • 40. The Art of Indexing ® Fractal Tree Indexes B B B BB Each node has pivots & Buffers Buffers fill as updates arrive
  • 41. The Art of Indexing ® Fractal Tree Indexes B B B BB Each node has pivots & Buffers Buffers fill as updates arrive
  • 42. The Art of Indexing ® Fractal Tree Indexes B B B BB Each node has pivots & Buffers Buffers fill as updates arrive
  • 43. The Art of Indexing ® Fractal Tree Indexes B B B BB Each node has pivots & Buffers Buffers fill as updates arrive
  • 44. The Art of Indexing ® Fractal Tree Indexes B B B BB Each node has pivots & Buffers Buffers fill as updates arrive
  • 45. The Art of Indexing ® Fractal Tree Indexes B B B BB Each node has pivots & Buffers Buffers fill as updates arrive Flush a buffer when it fills
  • 46. The Art of Indexing ® Fractal Tree Indexes B B B BB Each node has pivots & Buffers Buffers fill as updates arrive Flush a buffer when it fills A flush might take an I/O, but it does lots of useful work
  • 47. The Art of Indexing ® Fractal Tree Indexes B B B BB Each node has pivots & Buffers Buffers fill as updates arrive Flush a buffer when it fills A flush might take an I/O, but it does lots of useful work More changes per write ➔ fewer changes for same write load ➔ less SSD wear
  • 48. The Art of Indexing ® Fractal Tree Indexes: Queries B B B BB Lots of buffers have messages
  • 49. The Art of Indexing ® Fractal Tree Indexes: Queries B B B BB Lots of buffers have messages But query follows root-leaf path
  • 50. The Art of Indexing ® Fractal Tree Indexes: Queries B B B BB Lots of buffers have messages But query follows root-leaf path So every query has the most up-to-date information
  • 51. The Art of Indexing ® Fractal Tree Indexes: Queries B B B BB Lots of buffers have messages But query follows root-leaf path So every query has the most up-to-date information Messages can be insert, update, delete
  • 52. The Art of Indexing ® Fractal Tree Indexes: Schema Changes B B B BB Schema Changes are broadcast messages
  • 53. The Art of Indexing ® Fractal Tree Indexes: Schema Changes B B B BB Schema Changes are broadcast messages
  • 54. The Art of Indexing ® Fractal Tree Indexes: Schema Changes B B B BB Schema Changes are broadcast messages
  • 55. The Art of Indexing ® Fractal Tree Indexes: Schema Changes B B B BB Schema Changes are broadcast messages Insertion into root is fast, changes are immediate
  • 56. The Art of Indexing ® Fractal Tree Indexes: Schema Changes B B B BB Schema Changes are broadcast messages Leaves get rewritten when broadcast message makes it that far Insertion into root is fast, changes are immediate Schema change message still on root- leaf path
  • 57. The Art of Indexing ® Analysis Delivery system gives goodies: • Messages get moved, but each I/O pays for a lot of movement • You get very fast inserts • You get Hot Schema Changes Each flush carries lots of useful information • So it’s worth it to make nodes big • No fragmentation, Better Compression • Much better wear on SSDs