11.4.14   - mycassandra -   1
NoSQL, Key-Value Store (KVS), Document-oriented DB, GraphDB:
     memcached, Google Bigtable, Amazon Dynamo, Amazon SimpleDB, Apache Cassandra, Voldemort,
     Ringo, Vpork, MongoDB, CouchDB, Tokyo Cabinet/Tokyo Tyrant, Flare, ROMA, kumofs, Kai, Redis,
     Hadoop HBase, Hypertable, Yahoo! PNUTS, Scalaris, Dynomite, ThruDB, Neo4j, IBM ObjectGrid,
     Oracle Coherence, Velocity, …

                                                             :“        ↔                             ”
• 
•                                                       (join, transaction)
•                             /




                                           - mycassandra -    
                                                                                                         2
• 
               •  key/value vs. multi-dimensional map vs. document vs. graph
          • 
               •              vs.
          •            vs.
          • 
               •  strong vs. weak
          • 
               •        vs.
          • 
               •  row vs. column
          • 
               •  master/slave vs. decentralized

11.4.14                                     - mycassandra -                        3
                                   vs.
write/read     Bigtable, Cassandra, HBase                    MySQL, Sherpa
               Log-Structured Merge Tree [P. O'Neil '96]     B+-Tree [R. Bayer '72]
               Bigtable                                      MySQL
11.4.14                            - mycassandra -                                    5
[Figure: Write-Heavy and Read-Heavy workloads, comparing write-optimized and read-optimized systems ("Better" marks the preferred direction on each axis). Source: Yahoo! Cloud Serving Benchmark, SOCC '10]

11.4.14 - mycassandra - 6
/
  1.
  2.

  1. MyCassandra
  2. MyCassandra Cluster

[Figure labels: read-optimized, write-optimized, read/write-optimized]

11.4.14                                         - mycassandra -                               7
Apache Cassandra
  •
  •
  •

[Figure: Consistent Hashing ring of node IDs A, F, N, Q, V, Z with N = 3. Legend: request proxy, primary node, secondary node. For hash(key) = Q the ring marks a primary, secondary 1 and secondary 2 for the key and its values.]
11.4.14                                         - mycassandra -                                         8
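The ring on the Apache Cassandra slide can be illustrated with a minimal sketch of consistent hashing with N = 3 replicas. This is not Cassandra's partitioner: the `Ring` class, the `ring_position` helper and the use of MD5 are assumptions made for illustration, and only the node IDs A, F, N, Q, V, Z and N = 3 come from the slide.

```python
import bisect
import hashlib


def ring_position(name: str) -> int:
    """Map a string to a position on the ring (MD5 is an illustrative choice)."""
    return int(hashlib.md5(name.encode("utf-8")).hexdigest(), 16)


class Ring:
    """Hypothetical consistent-hashing ring that picks N replicas per key."""

    def __init__(self, node_ids, replicas=3):
        self.replicas = replicas
        # Sort nodes by ring position so the ring can be walked clockwise.
        self.nodes = sorted((ring_position(n), n) for n in node_ids)
        self.positions = [pos for pos, _ in self.nodes]

    def replicas_for(self, key):
        """Return [primary, secondary 1, secondary 2, ...] for a key."""
        # The primary is the first node at or after the key's position;
        # the modulo wraps around the end of the ring.
        start = bisect.bisect_left(self.positions, ring_position(key))
        count = min(self.replicas, len(self.nodes))
        return [self.nodes[(start + i) % len(self.nodes)][1] for i in range(count)]


ring = Ring(["A", "F", "N", "Q", "V", "Z"], replicas=3)
owners = ring.replicas_for("some-key")
print("primary:", owners[0], "secondaries:", owners[1:])
```

Any node can act as the request proxy in this sketch, because every node can compute the same replica list from the key alone.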
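Google Bigtable
  • Bigtable: sequential write, I/O
  • always writable, write-lock

[Figure: Cassandra write path. A write <k1, cf1+cf2> goes to the Commit Log (Disk) and to the Memtable (Memory), a map: <key, ColumnFamily>. The Memtable is written asynchronously (async) to SSTables on Disk as <k1, cf1> and <k1, cf2>.]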
11.4.14                                       - mycassandra -                            9
Google Bigtable
  key
  • Memtable: value
  • SSTable: value, I/O

[Figure: Cassandra read path. A read consults the Memtable in Memory (Map <key, ColumnFamily>, holding <k1, CF4>) and the SSTables on Disk (<key, CF1>, <key, CF2>, <key, CF3>), which costs I/O. The Commit Log is also on Disk.]

11.4.14                                  - mycassandra -                                      10
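The write and read paths on the two preceding slides can be sketched as a toy log-structured store. This is a minimal illustration under stated assumptions, not Cassandra's implementation: `TinyLSM`, `memtable_limit` and the in-process lists standing in for on-disk files are all hypothetical, and the flush runs synchronously here although the slide marks it as async.

```python
class TinyLSM:
    """Toy commit log + Memtable + SSTable store (illustrative only)."""

    def __init__(self, memtable_limit=2):
        self.commit_log = []    # stands in for the append-only log on disk
        self.memtable = {}      # in-memory map: key -> {column: value}
        self.sstables = []      # immutable flushed tables, oldest first
        self.memtable_limit = memtable_limit

    def write(self, key, columns):
        # A write is a sequential log append plus an in-memory update,
        # so it never has to read or lock existing on-disk data.
        self.commit_log.append((key, dict(columns)))
        self.memtable.setdefault(key, {}).update(columns)
        if len(self.memtable) >= self.memtable_limit:
            self.flush()

    def flush(self):
        # Freeze the Memtable as a sorted, immutable SSTable.
        self.sstables.append(dict(sorted(self.memtable.items())))
        self.memtable = {}
        self.commit_log.clear()  # logged writes are now held by an SSTable

    def read(self, key):
        # A read must merge every SSTable that holds the key (I/O per
        # table) with the Memtable; newer data overrides older data.
        merged = {}
        for table in self.sstables:
            merged.update(table.get(key, {}))
        merged.update(self.memtable.get(key, {}))
        return merged or None


db = TinyLSM(memtable_limit=2)
db.write("k1", {"cf1": "a"})
db.write("k2", {"cf1": "b"})   # second key reaches the limit and triggers a flush
db.write("k1", {"cf2": "c"})   # k1 now lives in an SSTable and in the Memtable
print(db.read("k1"))           # {'cf1': 'a', 'cf2': 'c'}
```

The sketch shows why this layout favors writes: `write` touches only the log and memory, while `read` may visit several tables for a single key.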
1.
MyCassandra

[Figure labels: read-optimized, write-optimized]

11.4.14                           - mycassandra -                  11
Cassandra
  • Cassandra /
  •

[Figure: MySQL with the storage engines InnoDB, MyISAM, Memory, …; Cassandra (Consistent Hashing, Gossip Protocol) with Bigtable, MySQL, Redis, … underneath.]
11.4.14      MyCassandra:                                                                      12
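The idea suggested by the preceding figure, one distribution layer on top of interchangeable storage engines, can be sketched as a small interface. Everything here is hypothetical: `StorageEngine`, `InMemoryEngine`, `SqliteEngine` and `Node` are names invented for illustration, and SQLite merely stands in for a B-tree-based engine because it ships with Python. MyCassandra's real interface is not shown on the slides.

```python
import sqlite3
from abc import ABC, abstractmethod


class StorageEngine(ABC):
    """Hypothetical minimal contract every pluggable engine must meet."""

    @abstractmethod
    def put(self, key: str, value: str) -> None: ...

    @abstractmethod
    def get(self, key: str): ...


class InMemoryEngine(StorageEngine):
    """Dictionary-backed engine (no persistence)."""

    def __init__(self):
        self._data = {}

    def put(self, key, value):
        self._data[key] = value

    def get(self, key):
        return self._data.get(key)


class SqliteEngine(StorageEngine):
    """B-tree-backed engine, using SQLite as a stand-in for an RDBMS."""

    def __init__(self, path=":memory:"):
        self._conn = sqlite3.connect(path)
        self._conn.execute(
            "CREATE TABLE IF NOT EXISTS kv (k TEXT PRIMARY KEY, v TEXT)"
        )

    def put(self, key, value):
        with self._conn:  # commits on success
            self._conn.execute(
                "INSERT OR REPLACE INTO kv (k, v) VALUES (?, ?)", (key, value)
            )

    def get(self, key):
        row = self._conn.execute(
            "SELECT v FROM kv WHERE k = ?", (key,)
        ).fetchone()
        return row[0] if row else None


class Node:
    """A node delegates storage to whichever engine it was built with."""

    def __init__(self, engine: StorageEngine):
        self.engine = engine

    def write(self, key, value):
        self.engine.put(key, value)

    def read(self, key):
        return self.engine.get(key)


for engine in (InMemoryEngine(), SqliteEngine()):
    node = Node(engine)
    node.write("k1", "v1")
    print(type(engine).__name__, node.read("k1"))
```

Because `Node` depends only on the interface, the engine can be chosen per node without touching the routing or membership code above it.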
MyCassandra
  :  Cassandra
  :  . JDBC API / stored procedure
  :  key-value store

MyCassandra node × 6

11.4.14                                                                       13
2.
MyCassandra Cluster

[Figure label: read/write-optimized]

11.4.14        - mycassandra -           14
•
    •   sync / async
        =>
    •   Quorum Protocol: (write replicas) + (read replicas) > (replicas)
        =>
                                                                    mem
11.4.14                          - mycassandra -                                15
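The quorum condition on the preceding slide can be checked by brute force. In this sketch `n`, `w` and `r` are names chosen here for the number of replicas, the number of replicas a write waits for, and the number a read waits for; they are not the W / R / RW node types used elsewhere in the deck. The code enumerates every possible write set and read set and confirms that they always share a replica exactly when w + r > n.

```python
from itertools import combinations


def quorums_overlap(n: int, w: int, r: int) -> bool:
    """Quorum condition: a read set and a write set must intersect."""
    return w + r > n


def always_intersect(n: int, w: int, r: int) -> bool:
    """Check by enumeration that every w-subset meets every r-subset."""
    replicas = range(n)
    return all(
        set(ws) & set(rs)
        for ws in combinations(replicas, w)
        for rs in combinations(replicas, r)
    )


# With 3 replicas, waiting for 2 on both writes and reads is safe ...
print(quorums_overlap(3, 2, 2), always_intersect(3, 2, 2))   # True True
# ... but waiting for 1 write and 2 reads can miss the latest write.
print(quorums_overlap(3, 1, 2), always_intersect(3, 1, 2))   # False False
```

This is why replicas that are updated only asynchronously do not break reads: as long as the synchronous write set and the read set overlap, at least one replica in every read holds the latest value.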
•  W:
•  R:
•  RW:

MyCassandra
  •  (W) / (R) / (RW)
  •  gossip protocol
  •
      1.  (key )
      2.  × N-1

[Figure: Consistent Hashing ring (N=3) whose node IDs are labeled W, R or RW; the nodes are connected by gossip.]

11.4.14                                                                                                 16
host / node / storage

(1) 1 /1 →
      ☓
      ☓
(2) 1 /k →
      ID [Amazon Dynamo, SOSP '07]
      ☓
(3) 1 →
      FT space

[Figure: Fault Tolerance (FT) space. One axis (virtual node) runs from 1 node / host to k nodes / host; the other runs from 1 storage / node to k storages / node. Point (1) is 1 storage / 1 node / 1 host; points (2) and (3) are also marked.]
11.4.14                                                                                           17
•  W:
•  R:
•  RW:

=3, =2
W:RW:R = 1:1:1

[Figure: write path from the Client through the Proxy to the W, RW and R nodes.]
  1)
  2)  W, RW (ACK, ACK)
  3a)
  3b) R (ACK)

: max (W, RW)

11.4.14                                     - mycassandra -                                    18
•  W:
•  R:
•  RW:

=3, =2
W:RW:R = 1:1:1

[Figure: read path from the Client through the Proxy to the R, RW and W nodes.]
  1)
  2)  R, RW
  3a)
  3b) or W
  4)  Proxy (Cassandra read repair)

: max (R, RW)

11.4.14                            - mycassandra -                                      19
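The two request flows above can be summarized in a small routing sketch. It assumes, based on the step labels and the max (W, RW) and max (R, RW) annotations, that the proxy waits only for the two node types named in step 2 and contacts the third type without waiting. The names `SYNC_TARGETS`, `route` and `client_latency` are hypothetical, and the latency figures are made-up inputs, not measurements from the deck.

```python
# Node types the proxy waits for, per operation (taken from step 2 of each flow).
SYNC_TARGETS = {"write": ("W", "RW"), "read": ("R", "RW")}
ALL_TYPES = ("W", "RW", "R")


def route(op: str):
    """Split the three replica types into (waited-for, not-waited-for)."""
    sync = SYNC_TARGETS[op]
    rest = tuple(t for t in ALL_TYPES if t not in sync)
    return sync, rest


def client_latency(op: str, node_latency_ms: dict) -> float:
    """Latency seen by the client: the slowest node the proxy waits for."""
    sync, _ = route(op)
    return max(node_latency_ms[t] for t in sync)


# Hypothetical per-node latencies in milliseconds (assumed values).
write_ms = {"W": 0.3, "RW": 0.1, "R": 5.0}
read_ms = {"W": 8.0, "RW": 0.2, "R": 1.0}

print(route("write"))                      # (('W', 'RW'), ('R',))
print(client_latency("write", write_ms))   # 0.3, i.e. max(W, RW)
print(client_latency("read", read_ms))     # 1.0, i.e. max(R, RW)
```

In both cases the slowest node type for the operation stays off the critical path, which is the point of mixing node types within one replica set.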
/
  •  MyCassandra Cluster: 6×3 = 18 /6 (W:R:RW = 6 : 6 : 6)
  •  Cassandra: 6 /6
  •  : = 3,  : = =2
     : Bigtable (W), MySQL / InnoDB (R), Redis (RW)
     : YCSB (Yahoo! Cloud Serving Benchmark) [SOCC '10]

  1.  MyCassandra/Cassandra × 6, YCSB Client × 1
  2.  1 KB values (100 [Bytes] × 10 [columns]) + key, 1,000
  3.
  4.  YCSB
  5.  YCSB Stat

11.4.14                                               - mycassandra -                          20
YCSB
  •  4

                 Workload      Application Example   Operation Ratio
  Write Heavy    Write-Only    Log                   Read: 0%,   Write: 100%
                 Write-Heavy   Session Store         Read: 50%,  Write: 50%
  Read Heavy     Read-Heavy    Photo tagging         Read: 95%,  Write: 5%
                 Read-Only     Cache                 Read: 100%, Write: 0%

  Record Selection: Zipfian ( )
  ( ) Zipfian:  ,  /
11.4.14                              - mycassandra -                                            21
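The four workloads in the table above can be approximated with a short generator. This is a sketch, not the YCSB client: `WORKLOADS`, `zipfian_weights` and `generate_ops` are hypothetical names, the `user<N>` key format is illustrative, and the exponent 0.99 is an assumed skew parameter. Only the read/write ratios come from the slide.

```python
import random

# Read fraction per workload, taken from the Operation Ratio column.
WORKLOADS = {
    "Write-Only": 0.0,
    "Write-Heavy": 0.5,
    "Read-Heavy": 0.95,
    "Read-Only": 1.0,
}


def zipfian_weights(n_records: int, s: float = 0.99):
    """Weight of the record at each popularity rank: 1 / rank**s."""
    return [1.0 / (rank ** s) for rank in range(1, n_records + 1)]


def generate_ops(workload: str, n_ops: int, n_records: int, seed: int = 0):
    """Return a list of (operation, key) pairs for the given workload."""
    rng = random.Random(seed)
    read_ratio = WORKLOADS[workload]
    # Skewed record selection: a few records receive most of the requests.
    keys = rng.choices(range(n_records), weights=zipfian_weights(n_records), k=n_ops)
    return [
        ("read" if rng.random() < read_ratio else "write", f"user{k}")
        for k in keys
    ]


ops = generate_ops("Read-Heavy", n_ops=1000, n_records=100)
reads = sum(1 for op, _ in ops if op == "read")
print(f"reads: {reads}, writes: {len(ops) - reads}")
```

With a skewed selection like this, a small set of hot records dominates the traffic, which is what makes caching and in-memory replicas effective for the read-heavy cases.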
/
[Figure: avg. write-latency (ms) and avg. read-latency (ms), lower is better, for the Write-Only (write:100%, read:0%), Write-Heavy (write:50%, read:50%), Read-Heavy (write:5%, read:95%) and Read-Only (write:0%, read:100%) workloads. Legend: Cassandra; MyCassandra Cluster; MySQL + Redis. Annotations: 11.5~23.5% on the write-latency panel; 88.5%, 85.2% and 49.7% on the read-latency panel.]
11.4.14                                             - mycassandra -                                      22
[Figure: max. qps for 40 clients (query/sec), higher is better, for Cassandra and MyCassandra Cluster by [write:read] ratio: Write-Only [100:0], Write-Heavy [50:50], Read-Heavy [5:95], Read-Only [0:100]. Annotations: 0.99, 1.49, 0.62, 6.53.]

Write Heavy
Read Heavy
  •  6.53
  •
11.4.14                                    - mycassandra -                                              23
(1)
: HDD vs. SSD

[Figure: throughput (qps), higher is better, of Cassandra on HDD and SSD and of MyCassandra Cluster on SSD and HDD; the marker (3) appears in both panels.]

  (1)  HDD/SSD, IOZone benchmark (HDD: Western Digital, SSD: Crucial)
  (2)
  (3)

                       HDD            SSD
  sequential write     86,277 qps     96,401 qps
  sequential read      108,914 qps    216,099 qps
  random write         2,485 qps      29,045 qps
  random read          926 qps        21,751 qps

11.4.14 - mycassandra - 24
Read-Heavy
  •  88.5%
  •  6.53
     =>  /

Write-Heavy
  •  Cassandra

11.4.14                                  - mycassandra -           25
(1/2)
Write-Heavy
  •  MySQL
  •  :
  •  :
      •
      •
      ) write-optimized, write-heavy

[Figure: write latency, read latency and throughput of Cassandra and MyCassandra cluster.]
11.4.14                                                                                             26
(2/2)
Amazon EC2
  •  1 /N
     /
  •  /
  •
  •

11.4.14                            - mycassandra -            27
FD-Tree: Tree Indexing on Flash Disks, VLDB '10
  •
  •  B+tree + LSM-tree
  •  SSD

  •  MySQL: RDBMS
  •  Anvil, SOSP '09: 1
  •  Cloudy, VLDB '10:
  •  Dynamo, SOSP '07: vs.
  •  MyCassandra ( ): vs.

11.4.14                                       - mycassandra -         28
: MyCassandra/MyCassandra Cluster

                       Cassandra                  1. MyCassandra         2. MyCassandra Cluster
  data model           multi-dimensional map (Column Family)
  throughput           write                      write or read          write and read
  latency              low                        lower in case          lower
  persistence          yes                        yes or no (memory)     yes
  consistency          weak (eventual, quorum)
  replication          sync / async
  data partition       row
  node organization    decentralized

  throughput, latency
11.4.14                                      - mycassandra -                     29
:
  1)
  2) MySQL + memcached
     : MyCassandra Cluster
  -
  -

Table
  movie-id     name     thumb-name    tag                       count
  704122313    movieA   EY37lHk5bgU   sport, soccer, FIFA, …    169,374
  704122314    movieB   Zk3BSYMWjzQ   music, jazz, …            472,803

Read-Heavy
Write-Heavy
11.4.14 - mycassandra - 30
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path Mount
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Service
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen Frames
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonets
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
 
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
 

Cloud Storage That Achieves Both Read Performance and Write Performance (OS-117-24)

  • 1. 11.4.14 - mycassandra - 1
  • 2. NoSQL, Key-Value Store (KVS), Document-oriented DB, GraphDB : memcached, Google Bigtable, Amazon Dynamo, Amazon SimpleDB, Apache Cassandra, Voldemort, Ringo, Vpork, MongoDB, CouchDB, Tokyo Cabinet/Tokyo Tyrant, Flare, ROMA, kumofs, Kai, Redis, Hadoop Hbase, Hypertable, Yahoo! PNUTS, Scalaris, Dynomite, ThruDB, Neo4j, IBM ObjectGrid, Oracle Coherence, Velocity, … :“ ↔ ” •  •  (join, transaction) •  / - mycassandra - 2
  • 3. •  •  key/value vs. multi-dimensional map vs. document vs. graph •  •  vs. •  vs. •  •  strong vs. weak •  •  vs. •  •  row vs. column •  •  master/slave vs. decentralized 11.4.14 - mycassandra - 3
  • 4. •  •  key/value vs. multi-dimensional map vs. document vs. graph •  •  vs. •  vs. •  •  strong vs. weak •  •  vs. •  •  row vs. column •  •  master/slave vs. decentralized 11.4.14 - mycassandra - 4
  • 5. write/read: Log-Structured Merge Tree [P. O’Neil ‘96] (Bigtable, Cassandra, HBase) vs. B+-Tree [R. Bayer ‘72] (MySQL, Sherpa). Diagram labels: Bigtable, MySQL.
  • 6. Write-Heavy Read-Heavy write-optimized Better Better read-optimized write-optimized read-optimized Yahoo! Cloud Serving Benchmark, SOCC ’10 11.4.14 - mycassandra - 6
  • 7. / 1.  2.  1.MyCassandra 2.MyCassandra Cluster read-optimized read/write-optimized write-optimized 11.4.14 - mycassandra - 7
  • 8. Apache Cassandra •  •  •  N = 3 ID Consistent Hashing( ) A F Z secondary 1 Q V N •  request proxy primary secondary 2 •  primary node •  secondary node hash(key) = Q key values 11.4.14 - mycassandra - 8
  • 9. Google Bigtable - : - •  Bigtable: sequential write I/O •  always writable write-lock <k1, cf1+cf2> Cassandra map: <key,ColumnFamily> async Memtable Memory Disk <k1, cf1> <k1, cf2> write Commit Log SSTable 11.4.14 - mycassandra - 9
  • 10. Google Bigtable - : - key •  Memtable value •  SSTable value I/O Map Cassandra <key,ColumnFamily> read Memtable Memory <k1, CF4> Disk <key, CF1> Commit Log I/O <key, CF2> SSTable <key, CF3> 11.4.14 - mycassandra - 10
  • 11. 1. MyCassandra read-optimized write-optimized 11.4.14 - mycassandra - 11
  • 12. Cassandra •  Cassandra / •  Consistent Hashing InnoDB MyISAM Memory … Gossip Protocol Bigtable MySQL Redis … 11.4.14 MyCassandra: 12
  • 13. MyCassandra : Cassandra : . JDBC API / stored procedure : key-value store MyCassandra node × 6 11.4.14 13
  • 14. 2. MyCassandra Cluster read/write-optimized 11.4.14 - mycassandra - 14
  • 15. •  •  sync async => •  Quorum Protocol: ( )+ ( )> ( ) => mem 11.4.14 - mycassandra - 15
  • 16. •  W: •  R: •  RW: MyCassandra •  (W) / (R) / (RW) •  gossip protocol •  1.  (key ) 2.  × N-1 N=3 Consistent Hashing ID R RW RW W W R gossip R RW W RW R W 11.4.14 16
  • 17. host node (1) 1 /1 → ☓ storage ☓ (2) 1 /k → ID [Amazon Dynamo, SOSP ’07] ☓ (3) 1 → FT space Fault Tolerance (FT) space FT space (3) 1 storage / 1 node / 1 host (2) (1) virtual node 1 node / host k nodes / host k storages / node 1 storage / node
  • 18. •  : •  R: •  RW: =3, =2 W:RW:R = 1:1:1 Client 1)  Proxy 2)  W, RW ACK ACK 3a) W 3b) R RW R ACK : max (W, RW) 11.4.14 - mycassandra - 18
  • 19. •  : •  R: =3, =2 •  RW: W:RW:R = 1:1:1 Client Proxy 1)  2)  R, RW 3a) 3b) or W W RW R 4)  Proxy (Cassandra read repair ) : max (R, RW) 11.4.14 - mycassandra - 19
  • 20. /   •  MyCassandra Cluster: 6×3 = 18 /6 (W:R:RW = 6 : 6 : 6) •  Cassandra: 6 /6   •  : = 3, : = =2   : Bigtable (W), MySQL / InnoDB (R), Redis (RW) : YCSB (Yahoo! Cloud Serving Benchmark) [SOCC ’10]   1.  MyCassandra/Cassandra×6 YCSB Client×1 2.  1KB values(100[Bytes]×10[columns])+key 1,000 3.  4.  YCSB 5.  YCSB Stat 11.4.14 - mycassandra - 20
  • 21. YCSB: 4 workloads (record selection: Zipfian). Write-Only (application example: Log; Read: 0%, Write: 100%); Write-Heavy (application example: Session Store; Read: 50%, Write: 50%); Read-Heavy (application example: Photo tagging; Read: 95%, Write: 5%); Read-Only (application example: Cache; Read: 100%, Write: 0%).
  • 22. [Figure: avg. write-latency (ms) and avg. read-latency (ms) for Cassandra, MyCassandra Cluster, and MySQL + Redis on the Write-Only (write:100%, read:0%), Write-Heavy (write:50%, read:50%), Read-Heavy (write:5%, read:95%), and Read-Only (write:0%, read:100%) workloads. Annotated values: 11.5~23.5% on the write-latency panel; 88.5%, 85.2%, 88.5%, 49.7% on the read-latency panel.]
  • 23. [Figure: max. qps for 40 clients (query/sec), Cassandra vs. MyCassandra Cluster, for [write:read] ratios [100:0] (Write-Only), [50:50] (Write-Heavy), [5:95] (Read-Heavy), and [0:100] (Read-Only). Annotated values: 0.99, 6.53, 0.62, 1.49.]
  • 24. (1) HDD vs. SSD. [Figure: throughput (qps) of Cassandra and MyCassandra Cluster on HDD and on SSD.] IOZone benchmark, HDD (Western Digital) / SSD (Crucial): sequential write 86,277 qps / 96,401 qps; sequential read 108,914 qps / 216,099 qps; random write 2,485 qps / 29,045 qps; random read 926 qps / 21,751 qps.
  • 25.  Read-Heavy •  88.5% •  6.53 => /   Write-Heavy •  Cassandra 11.4.14 - mycassandra - 25
  • 26. (1/2)  Write-Heavy •  MySQL •  : •  : •  •  ) write-optimized write-heavy 4 15000 Cassandra MyCassandra cluster 3 10000 2 1 5000 0 0 11.4.14 26 write latency read latency throughput
  • 27. (2/2)  Amazon EC2 •  1 /N   / •  / •  •  11.4.14 - mycassandra - 27
  • 28.   FD-Tree: Tree Indexing on Flash Disks, VLDB ’10 •  •  B+tree + LSM-tree •  SSD   •  MySQL: RDBMS •  Anvil, SOSP ’09: 1 •  Cloudy, VLDB ’10: •  Dynamo, SOSP ‘07: vs. •  MyCassandra ( ): vs. 11.4.14 - mycassandra - 28
  • 29. MyCassandra / MyCassandra Cluster (values listed as Cassandra | 1. MyCassandra | 2. MyCassandra Cluster; a single value applies to all three). data model: multi-dimensional map (Column Family); throughput: write | write or read | write and read; latency: low | lower in case | lower; persistence: yes | yes or no (memory) | yes; consistency: weak (eventual, quorum); replication: sync / async; data partition: row; node organization: decentralized. throughput, latency
  • 30. : 1) 2) MySQL + memcached : MyCassandra Cluster. Table (movie-id | name | thumb-name | tag | count): 704122313 | movieA | EY37lHk5bgU | sport, soccer, FIFA, … | 169,374; 704122314 | movieB | Zk3BSYMWjzQ | music, jazz, … | 472,803. Read-Heavy / Write-Heavy
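
Slides 8 and 15 of the transcript mention consistent hashing with N = 3 replicas (a primary node plus two secondary nodes) and a quorum protocol. The following is a minimal Python sketch of those two ideas, not the MyCassandra implementation. The class name `Ring`, the helper names, the use of MD5 for ring positions, and the rule "primary is the first node clockwise from hash(key)" are all assumptions made for illustration. The quorum check uses the standard overlap condition W + R > N, which is assumed to be what the garbled formula on slide 15 stated.

```python
import bisect
import hashlib


def ring_position(name: str) -> int:
    """Map a string onto the ring. MD5 is an arbitrary choice for this sketch."""
    return int(hashlib.md5(name.encode("utf-8")).hexdigest(), 16)


class Ring:
    """Hypothetical consistent-hashing ring with clockwise replica placement."""

    def __init__(self, nodes, replicas=3):
        self.replicas = replicas
        # Sort nodes by their position on the ring.
        self._entries = sorted((ring_position(n), n) for n in nodes)
        self._positions = [pos for pos, _ in self._entries]

    def replica_nodes(self, key):
        """Return the primary node followed by the next replicas-1 nodes clockwise."""
        total = len(self._entries)
        # First node at or after hash(key); wrap around past the last node.
        start = bisect.bisect_left(self._positions, ring_position(key)) % total
        count = min(self.replicas, total)
        return [self._entries[(start + i) % total][1] for i in range(count)]


def quorum_overlaps(n: int, w: int, r: int) -> bool:
    """True when every read quorum shares at least one replica with every write quorum."""
    return w + r > n


if __name__ == "__main__":
    ring = Ring(["A", "F", "N", "Q", "V", "Z"], replicas=3)
    primary, *secondaries = ring.replica_nodes("k1")
    print("primary:", primary, "secondaries:", secondaries)
    print("N=3, W=2, R=2 overlaps:", quorum_overlaps(3, 2, 2))
```

With N = 3 and W = R = 2, any two replicas that acknowledged a write must share at least one member with any two replicas answering a read, which is why a read can still observe the latest write even if one replica is stale.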