Castle-enhanced
   Cassandra
    Berlin Buzzwords
      June 4, 2012

        Eric Evans
    eric@acunu.com
    @jericevans, @acunu
1997
Before the Flood
                                   1990

                               Small databases
                                BTree indexes
                              BTree File systems
                                    RAID
                                Old hardware



Monday, 6 February 2012
2007
Are we there yet?
[Figures 6, 7, 8 and 3. Source: IDC, 2007]
Big Data
    distribution


                   A


M                      B


                   C
Big Data
    distribution


                   A


M    Key = Aaa         B


                   C
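A minimal sketch of how a key such as "Aaa" might be mapped onto the ring drawn above. It assumes that A, B, C and M are all nodes on a hash ring and that a key belongs to the first node at or after the key's hash position, wrapping around at the end. The `Ring` class, the `token` helper and the choice of MD5 as a stable hash are illustrative assumptions, not something stated on the slides.

```python
import bisect
import hashlib


def token(value):
    """Map a string to a position on the ring (hypothetical helper)."""
    return int.from_bytes(hashlib.md5(value.encode()).digest(), "big")


class Ring:
    """Toy hash ring: each node owns the arc that ends at its own token."""

    def __init__(self, nodes):
        self.entries = sorted((token(n), n) for n in nodes)
        self.tokens = [t for t, _ in self.entries]

    def owner(self, key):
        # First node whose token is >= the key's token, wrapping to the start.
        i = bisect.bisect_left(self.tokens, token(key))
        return self.entries[i % len(self.entries)][1]


ring = Ring(["A", "B", "C", "M"])
print("Key = Aaa is owned by", ring.owner("Aaa"))
```

With this placement rule, adding a node only takes over keys from the arc immediately before its token; every other key keeps its owner.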
Big Data
            write optimizing


• 7,500 - 10,000 RPM
• 5ms - 9ms seeks
• ~150MB/s (sequential)
• 75-150 random IOPS
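The figures above are roughly consistent with each other, which a little arithmetic can show. The sketch below assumes that one random I/O costs an average seek plus half a revolution of rotational delay, and it ignores transfer time. The function names and the 4 KB request size are assumptions made for illustration.

```python
def rotational_latency_ms(rpm):
    """Average rotational delay: half a revolution, in milliseconds."""
    return 0.5 * 60_000.0 / rpm


def random_iops(seek_ms, rpm):
    """Random I/Os per second when each one pays a seek plus rotational delay."""
    return 1000.0 / (seek_ms + rotational_latency_ms(rpm))


def random_throughput_mb_s(iops, io_size_kb=4):
    """Throughput of small random I/O in MB/s (the 4 KB size is an assumption)."""
    return iops * io_size_kb / 1000.0


fast = random_iops(seek_ms=5, rpm=10_000)  # 5 ms seek + 3 ms rotation
slow = random_iops(seek_ms=9, rpm=7_500)   # 9 ms seek + 4 ms rotation
print(f"random IOPS: {slow:.0f} to {fast:.0f}")
print(f"random 4 KB throughput at best: {random_throughput_mb_s(fast)} MB/s")
```

Under these assumptions a disk manages about half a megabyte per second of small random I/O, against roughly 150 MB/s sequentially, which is the gap that write-optimized structures try to exploit.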
Big Data
            write optimizing

                        A   G


        A       C                   G       K


A   B       D       E           G       H       K   L
Big Data
            write optimizing

                        A   G
                                                query(K)

        A       C                   G       K


A   B       D       E           G       H           K      L
Big Data
     write optimizing


           Memory




                         Disk
S1    S2     S3     S4   S5
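The diagram above shows an in-memory buffer sitting over sorted on-disk segments S1 to S5. A minimal sketch of that idea follows, assuming a log-structured merge style design: writes land in memory, a full buffer is written out as one sorted segment, and reads check the newest data first. The `TinyLSM` class and its `memtable_limit` parameter are hypothetical, and plain Python lists stand in for files on disk.

```python
from bisect import bisect_left


class TinyLSM:
    """Toy write-optimized store: a memory buffer plus sorted, immutable segments."""

    def __init__(self, memtable_limit=2):
        self.memtable = {}
        self.segments = []  # oldest first; each is a sorted list of (key, value)
        self.memtable_limit = memtable_limit

    def put(self, key, value):
        self.memtable[key] = value
        if len(self.memtable) >= self.memtable_limit:
            # One sequential write of a sorted run instead of many random writes.
            self.segments.append(sorted(self.memtable.items()))
            self.memtable = {}

    def get(self, key):
        if key in self.memtable:
            return self.memtable[key]
        # Newest segment wins, so scan from the most recent one backwards.
        for segment in reversed(self.segments):
            i = bisect_left(segment, (key,))
            if i < len(segment) and segment[i][0] == key:
                return segment[i][1]
        return None


db = TinyLSM(memtable_limit=2)
for k, v in [("a", 1), ("b", 2), ("a", 3), ("c", 4), ("d", 5)]:
    db.put(k, v)
print(len(db.segments), db.get("a"), db.get("d"))
```

The trade-off is visible in `get`: a read may have to consult every segment, which is why indexes and Bloom filters matter for this kind of layout.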
Two Revolutions
                                                    2010
                               Distributed, shared-nothing databases
                          Write-optimised indexes          Write-optimised indexes

                          BTree file systems                BTree file systems
                                 RAID                ...          RAID
                           New hardware                     New hardware




Bridging the Gap
                                           2011

                            Distributed, shared-nothing databases


                             Castle                      Castle
                                             ...
                          New hardware               New hardware



Castle
Castle is...
•   Filesystem (no, not really)
•   Key-value store for the Linux kernel
•   Write-optimized
    •   for rotational disks

    •   for SSDs

•   Versioned (clones, snapshots)
•   Disk aggregation
    •   for redundancy

    •   for performance

•   FLOSS!
Doubling Arrays

3


         Buffer values in memory
         until we have > B
Doubling Arrays

    3


9        Buffer values in memory
         until we have > B
Doubling Arrays

   3   9


           Then, promote them to
           disk.
Doubling Arrays

11   3   9
Doubling Arrays

    11   3   9


7
Doubling Arrays

   3   9


   7   11
Doubling Arrays

    5   3   9


1       7   11
Doubling Arrays

   1   5   3   7   9   11
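The slide sequence above can be approximated in code. This is a sketch of the textbook doubling-array scheme, in which level i holds either nothing or a sorted run of 2**i keys and a full level is merged into the next one, like carrying in binary addition. It is not Castle's implementation, and the exact moment at which arrays are merged may differ from what the slides show. The `DoublingArray` class is a hypothetical name.

```python
from bisect import bisect_left
from heapq import merge


class DoublingArray:
    """Toy doubling array: level i is empty or a sorted list of 2**i keys."""

    def __init__(self):
        self.levels = []

    def insert(self, key):
        carry = [key]
        i = 0
        while True:
            if i == len(self.levels):
                self.levels.append([])
            if not self.levels[i]:
                self.levels[i] = carry
                return
            # Level is full: merge the two sorted runs and push the result down.
            carry = list(merge(self.levels[i], carry))
            self.levels[i] = []
            i += 1

    def contains(self, key):
        # Every non-empty level is sorted, so each can be binary searched.
        for level in self.levels:
            j = bisect_left(level, key)
            if j < len(level) and level[j] == key:
                return True
        return False


da = DoublingArray()
for value in [3, 9, 11, 7, 5, 1]:  # insertion order taken from the slides
    da.insert(value)
print(da.levels)
```

Merges only ever read and write whole sorted runs, so the disk sees sequential I/O; the price is that a lookup may have to search one array per level.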
Indexes




query(k)
Bloom filters




query(k)
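A Bloom filter lets query(k) skip arrays that cannot contain k. The sketch below is a generic Bloom filter rather than the one used in Castle; the class name, the bit-array size, the number of hashes and the use of salted SHA-256 are all assumptions chosen for illustration.

```python
import hashlib


class BloomFilter:
    """Toy Bloom filter: may report false positives, never false negatives."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, key):
        # Derive several bit positions per key by salting one hash function.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key):
        for p in self._positions(key):
            self.bits[p // 8] |= 1 << (p % 8)

    def might_contain(self, key):
        return all(self.bits[p // 8] & (1 << (p % 8)) for p in self._positions(key))


bf = BloomFilter()
bf.add("k")
print(bf.might_contain("k"))
```

A `False` answer is definitive, so the on-disk array behind the filter does not need to be read; a `True` answer only means the array has to be checked.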
Disk Layout: RDA
   random duplicate allocation

  4    2      1    4    5    2    5    3    1    3

  7    10     7    6    8    9    9    10   6    8

  15   12     14   11   13   14   11   12   13   15

                                  16        16
Disk Layout: RDA
   random duplicate allocation

  4    2      1    4        5    3    1    3

  7    10     7    6        9    10   6    8

  15   12     14   11       11   12   13   15

                            16        16
Disk Layout: RDA
   random duplicate allocation

  4    2      1    4        5    3    1    3

  7    10     7    6        9    10   6    8

  15   12     14   11       11   12   13   15

                            16        16

  14   9      2    13       8         5
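The three grids above show blocks 1 to 16, each stored on two different disks, then the layout after disks are lost, then the missing copies re-created on the surviving disks. A minimal sketch of that behaviour follows, assuming copies are placed uniformly at random. The function names, the fixed random seed and the single-disk failure in the usage example are illustrative assumptions.

```python
import random


def allocate(num_blocks, num_disks, rng):
    """Place two copies of every block on two distinct, randomly chosen disks."""
    disks = [set() for _ in range(num_disks)]
    for block in range(1, num_blocks + 1):
        for d in rng.sample(range(num_disks), 2):
            disks[d].add(block)
    return disks


def rebuild(disks, failed, rng):
    """Re-create copies lost on the failed disks, spreading them over survivors."""
    survivors = [d for d in range(len(disks)) if d not in failed]
    lost = set().union(*(disks[d] for d in failed))
    for block in sorted(lost):
        holders = [d for d in survivors if block in disks[d]]
        if not holders:
            raise RuntimeError(f"block {block} lost both copies")
        candidates = [d for d in survivors if block not in disks[d]]
        for d in rng.sample(candidates, 2 - len(holders)):
            disks[d].add(block)
    for d in failed:
        disks[d] = set()
    return disks


rng = random.Random(2012)
disks = allocate(num_blocks=16, num_disks=10, rng=rng)
disks = rebuild(disks, failed={4}, rng=rng)
print([sorted(d) for d in disks])
```

Because the surviving copies and the rebuild targets are scattered across many disks, the rebuild work is shared by the whole set rather than falling on a single mirror partner.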
Disk Layout: RDA
   random duplicate allocation
   [Chart: Rebuild Times. Y-axis: Rebuild Time (Hours), 0 to 5. Bars: RAID10, 8 Disks; RAID5, 8 Disks; RDA, 8 Disks; RDA, 15 Disks]
Reflex
(Formerly Acunu Data Platform)
ColumnFamilyStore


                 Memory




                               Disk
      S1    S2      S3    S4   S5
Memtables

Memory        Memory     Memory




                         Disk

   S1    S2    S3   S4   S5
AcunuColumnFamilyStore
Small random inserts
   Inserting 3 billion rows
   [Chart legend: Acunu powered Cassandra, 'standard' Cassandra]
Insert latency
   (while inserting 3 billion rows)
   [Chart legend: Acunu powered Cassandra (x), 'standard' Cassandra (+)]
Questions?
                bitbucket.org/acunu
                 github.com/acunu
www.acunu.com/2/category/technical%20articles/1.html

Castle enhanced Cassandra

  • 1. Castle-enhanced Cassandra Berlin Buzzwords June 4, 2012 Eric Evans eric@acunu.com @jericevans, @acunu
  • 2.
  • 4.
  • 5.
  • 6.
  • 7.
  • 8.
  • 9.
  • 10.
  • 11.
  • 12. Before the Flood 1990 Small databases BTree indexes BTree File systems RAID Old hardware Monday, 6 February 2012
  • 13.
  • 14. 2007
  • 15.
  • 16.
  • 17.
  • 18.
  • 19.
  • 20. Are we there yet? [Figures 6, 7, 8 and 3; source: IDC, 2007]
  • 21. Big Data: distribution [ring diagram: nodes A, B, C, M]
  • 22. Big Data: distribution [ring diagram: nodes A, B, C, M; Key = Aaa]
  • 23. Big Data: distribution [ring diagram: nodes A, B, C, M; Key = Aaa]
  • 24. Big Data: write optimizing • 7,500 - 10,000 RPM • 5 ms - 9 ms seeks • ~150 MB/s (sequential) • 75-150 random IOPS
  • 25. Big Data: write optimizing [B-tree diagram: root (A, G); internal nodes (A, C) and (G, K); leaves (A, B), (D, E), (G, H), (K, L)]
  • 26. Big Data: write optimizing [B-tree diagram as on slide 25, with query(K)]
  • 27. Big Data: write optimizing [B-tree diagram as on slide 25]
  • 28. Big Data: write optimizing [diagram: memory above disk; S1 to S5 on disk]
  • 29. Big Data: write optimizing [diagram: memory above disk; S1 to S5 on disk]
  • 30. Two Revolutions (2010): distributed, shared-nothing databases; write-optimised indexes; BTree file systems; RAID; new hardware [stack shown once per node]
  • 31. Bridging the Gap (2011): distributed, shared-nothing databases; Castle; new hardware [stack shown once per node]
  • 33. Castle is... • Filesystem (no, not really) • Key-value store for the Linux kernel • Write-optimized (for rotational disks, for SSDs) • Versioned (clones, snapshots) • Disk aggregation (for redundancy, for performance) • FLOSS!
  • 34. Doubling Arrays [values shown: 3] Buffer values in memory until we have > B
  • 35. Doubling Arrays [values shown: 3 9] Buffer values in memory until we have > B
  • 36. Doubling Arrays [values shown: 3 9] Then, promote them to disk.
  • 38. Doubling Arrays [values shown: 11 3 9 7]
  • 39. Doubling Arrays [values shown: 3 9 7 11]
  • 40. Doubling Arrays [values shown: 5 3 9 1 7 11]
  • 41. Doubling Arrays [values shown: 1 5 3 7 9 11]
  • 44. Disk Layout: RDA (random duplicate allocation) [diagram: blocks 1-16, each stored twice]
  • 45. Disk Layout: RDA (random duplicate allocation) [diagram: blocks 2, 5, 8, 9, 13 and 14 are down to a single copy]
  • 46. Disk Layout: RDA (random duplicate allocation) [diagram: second copies of blocks 2, 5, 8, 9, 13 and 14 re-created]
  • 47. Disk Layout: RDA (random duplicate allocation). Rebuild Times [bar chart: rebuild time in hours, axis 0-5, for RAID10 (8 disks), RAID5 (8 disks), RDA (8 disks) and RDA (15 disks)]
  • 49. ColumnFamilyStore [diagram: memory above disk; S1 to S5 on disk]
  • 50. Memtables [diagram: three in-memory memtables above disk; S1 to S5 on disk]
  • 53. Small random inserts: inserting 3 billion rows [chart: Acunu-powered Cassandra vs. ‘standard’ Cassandra]
  • 54. Insert latency (while inserting 3 billion rows) [chart: Acunu-powered Cassandra (x) vs. ‘standard’ Cassandra (+)]
  • 55. Questions? bitbucket.org/acunu github.com/acunu www.acunu.com/2/category/technical%20articles/1.html
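
The doubling-array slides (34-41) describe buffering values in memory and then promoting them to disk. One plausible reading is sketched below as a toy, in-memory Python model. It is not Castle's implementation: the class and method names are hypothetical, the "disk" levels are plain lists, the buffer is flushed when it reaches `buffer_size` values (the slide says "> B"), and merging two equal-sized runs into the next level is an assumption based on the sequence of values shown on the slides.

```python
from bisect import bisect_left
from heapq import merge


class DoublingArray:
    """Toy model of a doubling array (hypothetical names, not Castle's API)."""

    def __init__(self, buffer_size=2):
        self.buffer_size = buffer_size
        self.buffer = []   # recent inserts, unsorted ("memory")
        self.levels = []   # levels[i] is None or a sorted run of buffer_size * 2**i keys ("disk")

    def insert(self, key):
        self.buffer.append(key)
        # Assumption: flush as soon as the buffer holds buffer_size keys.
        if len(self.buffer) >= self.buffer_size:
            self._promote(sorted(self.buffer))
            self.buffer = []

    def _promote(self, run):
        # Works like carrying in a binary counter: an occupied level is
        # merged with the incoming run and the result moves one level up.
        level = 0
        while True:
            if level == len(self.levels):
                self.levels.append(None)
            if self.levels[level] is None:
                self.levels[level] = run
                return
            # Both runs are sorted, so the merge is one sequential pass.
            run = list(merge(self.levels[level], run))
            self.levels[level] = None
            level += 1

    def contains(self, key):
        if key in self.buffer:
            return True
        for run in self.levels:
            if run:
                i = bisect_left(run, key)
                if i < len(run) and run[i] == key:
                    return True
        return False
```

With `buffer_size=2`, inserting 3, 9, 7, 11, 5, 1 leaves the runs `[1, 5]` and `[3, 7, 9, 11]`, which appears to match the values on slide 41. Every write to a level is a sequential merge, which is the property the write-optimizing slides emphasise. A lookup may have to search one run per level.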

Editor's notes

  1.
  2. 15 years ago ...
  3. 15 years ago ...
  4. What were they thinking?
  5. ... your phone
  6. ... your notebook computer
  7. ... your notebook computer
  8. ... the Internet
  9. ... databases: small databases indexed with B-trees; B-tree-based file systems; RAID; SCSI and IDE disks
  10. 10 years later ...
  11. 10 years later ...
  12. What are you thinking?
  13. ... your phone
  14. ... your computer
  15. ... the Internet (data-rich)
  16. 2006: 161 exabytes (0.16 zettabytes); 2010: 988 exabytes (0.98 zettabytes); 2011: 1.8 zettabytes; 2015: >8 zettabytes
  17. Times have changed: for Big Data, databases are distributed. Example: Cassandra’s content-addressable ring.
  18. Key-based partitioning ...
  19. ... replication. BASE replaces ACID.
  20. Write optimization: why it is important given disk limitations.
  21. B-trees: how they work, their properties, and how that relates to disk access.
  22. B-trees: how they work, their properties, and how that relates to disk access.
  23. B-trees: how they work, their properties, and how that relates to disk access.
  24. How Cassandra write-optimizes: LSM-tree.
  25. Sequential disk access wins.
  26. Where all that leaves the database stack.
  27. Enter the Castle-based stack.
  28.
  29.
  30. Write optimization à la Castle’s doubling arrays, etc.
  31.
  32.
  33.
  34.
  35.
  36.
  37.
  38.
  39. Indexing
  40. Bloom filters
  41. Castle’s management of block devices for performance and redundancy
  42.
  43.
  44.
  45. Castle’s shared memory interface
  46. libcastle (C lib), and bindings
  47. Castle’s Java interface
  48.
  49.
  50.
  51.
  52.
  53. Putting Castle to work for Cassandra
  54. Cassandra’s ColumnFamilyStore abstraction / in-Java LSM-tree
  55. When thresholds are hit, live memtables are “switched out”, queued for flushing, and then left to the JVM’s garbage collector for cleanup.
  56. Garbage collection ramifications.
  57. Replacing CFS with a Castle backend; memory is (de)allocated by Castle in kernel space.
  58. Cassandra holds all Bloom filters and indexes in memory; Castle does not have this requirement.
  59. Performance
  60.
  61.
  62.
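
Notes 17-19 mention key-based partitioning and replication on Cassandra's ring. The following is a minimal Python sketch of that idea, not Cassandra's code. The `Ring` class and `token` function are hypothetical names. Node positions are derived by hashing node names, which is an assumption made for brevity, and replicas are simply the next nodes clockwise from the owner. The node names and the key `Aaa` are taken from the slides.

```python
import hashlib
from bisect import bisect_left


def token(value: str) -> int:
    """Map a string to a position on the ring (MD5 used here as an example hash)."""
    return int.from_bytes(hashlib.md5(value.encode("utf-8")).digest(), "big")


class Ring:
    """Toy token ring (hypothetical names, not Cassandra's API)."""

    def __init__(self, nodes, replication_factor=3):
        self.replication_factor = min(replication_factor, len(nodes))
        # Assumption: each node's position is the hash of its name.
        self.ring = sorted((token(n), n) for n in nodes)
        self.tokens = [t for t, _ in self.ring]

    def replicas(self, key):
        # The owner is the first node whose token is >= the key's token,
        # wrapping around to the start of the ring if there is none.
        start = bisect_left(self.tokens, token(key)) % len(self.ring)
        # Further replicas are the next nodes clockwise from the owner.
        return [self.ring[(start + i) % len(self.ring)][1]
                for i in range(self.replication_factor)]
```

For example, `Ring(["A", "B", "C", "M"]).replicas("Aaa")` returns three distinct node names. Any node can compute this placement from the key alone, so no central lookup table is needed.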