# hibaridb




               Hibari/NOSQL
                for BIGDATA
                   November 2010
              Gemini Mobile Technologies
   Hibari Open Source project: http://sourceforge.net/projects/hibari/
Introduction
• Founded: July, 2001
• Offices: San Francisco, Tokyo, Beijing
• Investors:
   – Goldman Sachs, Mitsubishi-UFJ, Mizuho, Nomura, Ignite, Access, Aplix
• Accomplishments:
   – Messaging Products
       • Provide MMSC to 3 out of 4 Carriers in Japan (DoCoMo, Softbank,
         eMobile)
       • Largest MMSC in the world (Softbank Japan)
       • OEM to Alcatel-Lucent and ByteMobile
   – NOSQL / Big Data
       • 2006: First Mobile 3D SNS (Softbank, China Unicom, iPhone App)
       • 4/2010: WebMail, Japanese Mobile Carrier & Internet Provider
       • 7/2010: Hibari Open Source
Customers




Hibari (= Cloud Birds)




What is Hibari?

• Hibari is a production-ready, distributed, key-value, big
  data store.
   – China Mobile and China Unicom - SNS
   – Japanese internet provider - GB mailbox webmail
   – Japanese mobile carrier - GB mailbox webmail

• Hibari uses chain replication for strong consistency, high-
  availability, and durability.
• Hibari performs especially well for read and large-value
  operations.
• Hibari is open-source software under the Apache 2.0
  license.
Environments

• Hibari runs on commodity, heterogeneous servers.
• Hibari supports Red Hat, CentOS, and Fedora Linux
  distributions.
   – Debian, Ubuntu, Gentoo, Mac OS X, and FreeBSD are coming
     soon.

• Hibari supports Erlang/OTP R13B04.
   – R14B is coming soon.

• Hibari supports Amazon S3, JSON-RPC-RFC4627,
  UBF/EBF/JSF and native Erlang client APIs.
   – Thrift is coming soon.


Why NOSQL?
We needed to build a scalable, high performance
web mail system
 • Big Data
     – Several million end users from the start
     – Several billion messages in a few months
     – Hundreds of TB of data
 • Low Cost requirements
     – Customer’s business model (Freemium)
     – Distributed across more than 50 PC servers
     – No need for the rich but expensive features of SQL
 • Continuous growth of data in the storage
     – Elasticity to expand capacity as data grows
What were the customer’s needs?
• Durability
    – Data loss (e.g., messages, metadata) is not acceptable
• Strong Consistency
    – Because of interactive sessions, consistency is required
• Low Latency
    – <1 sec response time for end user transactions
• High Availability
    – As a branded service to the end user, service must always be
      available.
• Read Heavy
    – Many more read than write operations
• Big data with highly variable data size
    – Large, multi-GB mailboxes were required as a service differentiator
    – Mail messages range from a few bytes to many MB with
      attachments
How does Hibari address these needs?
• Durable updates
   Every update is written and flushed to stable storage (fsync() system call)
   before sending acknowledgments to the client.
• Consistent updates
   After an update is acknowledged, no client can see an older
   version. “Chain Replication” is used to maintain consistency across all
   replicas.
• High Availability
   Each key can be replicated multiple times. As long as one copy of the key
   survives, all operations on that key are permitted.
• Lockless API
   No client operation requires a lock. Optionally, Hibari supports
   “test-and-set” of each key-value pair via a timestamp value that the
   server forces to increase.
• Micro-transactions
   Under limited circumstances, operations on multiple keys can be given
   transactional commit/abort semantics.
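
The durable-update and test-and-set bullets above can be illustrated with a minimal single-node sketch. This is not Hibari's API: the class `TinyStore`, the `expected_ts` parameter, the `StaleTimestamp` exception, and the log record layout are all hypothetical names chosen for illustration. The sketch only shows the ordering the slide describes: flush to stable storage first, acknowledge second, and reject a conditional write whose expected timestamp is out of date.

```python
import os


class StaleTimestamp(Exception):
    """Raised when a test-and-set names a timestamp that is no longer current."""


class TinyStore:
    """Toy store (hypothetical, not Hibari): fsync before ack, timestamp test-and-set."""

    def __init__(self, log_path):
        self._log = open(log_path, "ab")
        self._data = {}    # key -> (timestamp, value)
        self._clock = 0    # server-enforced, strictly increasing timestamp

    def set(self, key, value, expected_ts=None):
        # Optional test-and-set: proceed only if the caller saw the latest version.
        if expected_ts is not None:
            current = self._data.get(key)
            current_ts = current[0] if current else None
            if current_ts != expected_ts:
                raise StaleTimestamp(key)
        self._clock += 1
        record = str(self._clock).encode() + b"\t" + key + b"\t" + value + b"\n"
        self._log.write(record)
        self._log.flush()               # push Python's buffer to the OS
        os.fsync(self._log.fileno())    # ask the OS to write to stable storage
        self._data[key] = (self._clock, value)
        return self._clock              # the acknowledgment, sent only after fsync

    def get(self, key):
        return self._data[key]

    def close(self):
        self._log.close()
```

A client that reads a timestamp, then writes with `expected_ts` set to it, gets optimistic concurrency control without holding any lock: a concurrent writer simply causes the conditional write to fail.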
Chain Replication for Strong Consistency
        Data Model:       Key (0…n) → Value (BLOB, Binary Large OBject)
        Key Value Table:  Key 1, Key 2, Key 3, …, Key n, each with a Value
        Consistent Hashing sends each key to one chain ("Go to Chain A" or "Go to Chain B")

        Chain A:  PC 1 = Head,  PC 2 = Middle,  PC 3 = Tail
        Chain B:  PC 1 = Tail,  PC 2 = Head,    PC 3 = Middle

        [Figure: two chains across PC 1 to PC 3. In each chain, the "Write Request"
        arrow enters at the Head, and the "Read Request" and "All Replies" arrows
        are drawn at the Tail.]
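
The routing in this diagram can be sketched as follows. This is an illustrative toy, not Hibari's implementation: the `Chain` and `HashRing` classes, the MD5-based ring, and the `points_per_chain` parameter are assumptions made for the example. It shows a key being mapped to a chain by consistent hashing, a write flowing from head to tail, and the read and the reply both coming from the tail.

```python
import bisect
import hashlib


class Chain:
    """One replication chain; bricks are listed in head-to-tail order."""

    def __init__(self, name, bricks):
        self.name = name
        self.bricks = list(bricks)
        self.stores = {brick: {} for brick in self.bricks}  # one dict per replica

    def write(self, key, value):
        # The update is applied at the head first and the tail last.
        for brick in self.bricks:
            self.stores[brick][key] = value
        return "ok", self.bricks[-1]      # the reply is sent by the tail

    def read(self, key):
        tail = self.bricks[-1]            # reads are served by the tail
        return self.stores[tail].get(key), tail


class HashRing:
    """Small consistent-hash ring mapping keys to chains (hypothetical layout)."""

    def __init__(self, chains, points_per_chain=64):
        self._chains = {c.name: c for c in chains}
        ring = sorted(
            (self._hash(f"{c.name}#{i}"), c.name)
            for c in chains
            for i in range(points_per_chain)
        )
        self._points = [point for point, _ in ring]
        self._owners = [owner for _, owner in ring]

    @staticmethod
    def _hash(text):
        return int.from_bytes(hashlib.md5(text.encode()).digest()[:8], "big")

    def chain_for(self, key):
        # First ring point clockwise from the key's hash, wrapping at the end.
        i = bisect.bisect_right(self._points, self._hash(key)) % len(self._points)
        return self._chains[self._owners[i]]


# Chains as drawn on the slide, in head -> middle -> tail order.
chain_a = Chain("A", ["PC1", "PC2", "PC3"])
chain_b = Chain("B", ["PC2", "PC3", "PC1"])
ring = HashRing([chain_a, chain_b])
```

Because every acknowledged write has already reached the tail, a read served by the tail cannot return an older version than one a client has been told was committed.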
Concept of “Chain”

               Load is evenly distributed across multiple nodes
                     (example: 3 replicas, 6 chains)

            PC 1     PC 2     PC 3     PC 4      PC 5     PC 6


  Chain A   Head     Middle    Tail     Head     Middle    Tail    Chain D



  Chain B    Tail    Head     Middle     Tail    Head     Middle   Chain E



  Chain C   Middle    Tail     Head     Middle    Tail     Head    Chain F
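
The layout in this table follows a simple rotation, which can be sketched as below. The function name `rotated_layout` and the rule of grouping PCs in threes are assumptions read off the table, not a description of how Hibari's admin tooling assigns bricks.

```python
from collections import Counter

ROLES = ("Head", "Middle", "Tail")


def rotated_layout(pcs):
    """Place one chain per PC; within each group of three PCs, rotate the roles."""
    group_size = len(ROLES)
    if len(pcs) % group_size != 0:
        raise ValueError("this sketch needs a multiple of 3 PCs")
    names = iter("ABCDEFGHIJKLMNOPQRSTUVWXYZ")
    layout = {}
    for start in range(0, len(pcs), group_size):
        group = pcs[start:start + group_size]
        for shift in range(group_size):
            # Each successive chain starts its head one PC further along.
            order = group[shift:] + group[:shift]
            layout[next(names)] = dict(zip(ROLES, order))
    return layout


layout = rotated_layout([f"PC{i}" for i in range(1, 7)])

# Count how many times each PC plays each role across all six chains.
roles_per_pc = Counter(
    (pc, role) for chain in layout.values() for role, pc in chain.items()
)
```

With this rotation every PC is head of one chain, middle of another, and tail of a third, so write entry points and read traffic are spread over all six machines.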




Chain Replication for High Availability
and Fault Tolerance
                               Failover mechanism

  PC 1       PC 2      PC 3                         PC 1     PC 2      PC 3




   Head      Middle     Tail                         Head                Tail



   Tail       Head     Middle                        Tail                Head



  Middle      Tail      Head                         Tail                Head




           Node down                                  Service continuation
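
The role changes in this diagram can be reproduced with a small sketch, assuming PC 2 is the failed node (the blank middle column suggests this). The helpers `drop_failed` and `roles` are hypothetical; Hibari's actual failure detection and chain repair involve more steps than shortening a list.

```python
def drop_failed(order, failed):
    """Return the head-to-tail order with the failed brick removed."""
    return [brick for brick in order if brick != failed]


def roles(order):
    """Derive each brick's role from its position in the chain."""
    if not order:
        return {}
    if len(order) == 1:
        return {order[0]: "Standalone"}   # a single brick is both head and tail
    assigned = {brick: "Middle" for brick in order}
    assigned[order[0]] = "Head"
    assigned[order[-1]] = "Tail"
    return assigned


# The three chains on the slide, in head -> middle -> tail order.
chains = {
    "row 1": ["PC1", "PC2", "PC3"],
    "row 2": ["PC2", "PC3", "PC1"],
    "row 3": ["PC3", "PC1", "PC2"],
}

after_failure = {
    name: roles(drop_failed(order, "PC2")) for name, order in chains.items()
}
```

Each chain keeps a head and a tail after the failure, which is why service continues: the surviving neighbor of the lost brick takes over its position.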



Hibari Benchmarking
Hibari’s benchmarking has primarily used three approaches:

• In-house tools
   – Micro benchmarks, standalone load client, and end-to-end as part
     of larger systems
   – C/C++ and Erlang based implementations
   – Good for Gemini’s internal use but not ideal for the open source
     community

• Yahoo! Cloud Serving Benchmark (YCSB)
   – New, easy to use Java-based load tool
   – Java implementation for Cassandra db driver and new Hibari db
     driver
   – However, latency issues with Hibari’s YCSB db driver are under
     investigation

• Basho Bench (BB)
   – Simple, easy to use Erlang-based load tool
   – Erlang implementation for Cassandra db driver and new Hibari db
     driver
   – However, traffic scenarios are not as full featured as YCSB, and
     stability issues with Cassandra’s BB db driver are under investigation
Test conditions
  Tool: Basho Bench - Hibari db driver and Cassandra db driver
  Keys: 1 … 1,000,000 Truncated Pareto (20% of keys, 80% of time)
  Values: 10K, 50K, 100K, and 150K
  Operations:
      20 minutes Random Put
      40 minutes Random Get/Put (50%/50%)
  Replication factor: 2
  Disk sync of commit log: enabled
  Storage: Keys RAM+DISK, Values DISK only
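
The "Truncated Pareto (20% of keys, 80% of time)" key distribution can be approximated with the sketch below. It is not Basho Bench's generator: the function `truncated_pareto_key`, the shape `ALPHA`, and the `max_key / 15` scale are assumptions chosen so that about 80% of draws land in the lowest-numbered 20% of the key range, with draws beyond the range clamped to `max_key`.

```python
import math
import random

# Shape chosen so that P(x < 4) = 1 - 4**(-ALPHA) = 0.8 for a Pareto variate x >= 1.
ALPHA = math.log(5) / math.log(4)


def truncated_pareto_key(rng, max_key):
    """Draw an integer key in [1, max_key], skewed toward low-numbered keys."""
    scale = max_key / 15.0            # maps x = 4 to 20% of the key range
    x = rng.paretovariate(ALPHA)      # heavy-tailed, always >= 1
    return min(max_key, 1 + int(scale * (x - 1)))   # truncate at max_key


rng = random.Random(2010)             # fixed seed so the run is repeatable
MAX_KEY = 1_000_000
keys = [truncated_pareto_key(rng, MAX_KEY) for _ in range(20_000)]
hot_fraction = sum(k <= MAX_KEY // 5 for k in keys) / len(keys)
```

A skewed generator like this matters for a store that keeps keys in RAM and values on disk, since a small hot set is requested far more often than the rest.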
Hardware




Test scenario
1.




2.




“Hibari Stub” results

[Figure: two line charts, Hibari Stub “get” and Hibari Stub “put”.
 Y axis: average latency (msec), 0 to 100. X axis: elapsed time (sec), 500 to 2000.
 Series: 10K, 50K, 100K, and 150K stub values.]
“Hibari and Cassandra” results by Basho Bench tool
Apache-Cassandra 0.6.5

[Figure: four line charts, Cassandra “get”, Hibari “get”, Cassandra “put”, and Hibari “put”.
 Y axis: average latency (msec), 0 to 100. X axis: elapsed time (sec), 500 to 2000.
 Series: 10K, 50K, 100K, and 150K values.]
Language (& Priority) Barriers

 • Latency issues with Java load tool speaking with an
   Erlang NOSQL server

 • Stability issues with Erlang load tool speaking with a
   Java NOSQL server




 Further investigation is required to identify issues and
 areas of improvement. Our primary targets are
 Hibari, Cassandra, and the YCSB tool.

“Applications / Customer first” approach
  leads to mutual complements
FEATURE                 (e.g.) Cassandra                                   Hibari
Data model              Column-oriented                                    Key-value
Data partitioning       Consistent hashing                                 Consistent hashing
Data consistency        Configurable                                       Yes
Data replication        Preference lists                                   Chain replication
Elasticity              Admin operations, gossip, data redistribution      Chain migration
Node health detection   Peer-to-peer monitor, gossip                       Admin server monitors
O&M                     nodetool, JMX                                      Admin UI with brick, chain health, statistics
Performance             write-optimized                                    read-optimized
Implementation          Java                                               Erlang
API                     get/put/delete, scan, map/reduce, atomic row ops   get/put/delete, micro-transactions
“Not Only Hibari” Roadmap
Timeline: 2010 to 2011~

• Hibari: Open Source; Erlang 14A, then Erlang 14B; Enterprise (2011~)
• OS: Red Hat, CentOS, Fedora; then Ubuntu, Gentoo, Debian, Mac OS X, FreeBSD
• Client API: Erlang, Amazon S3, JSON-RPC-RFC4627, UBF; then Thrift;
  then C++, Java, Python, Ruby, Perl, Google Protocol Buffers
• Integration: Cassandra; Hadoop Map/Reduce and Pig (2011~)
• Application tools: YCSB, Basho Bench; Monitoring tool, Configuration tool,
  Management tool (2011~)

Notes on the slide:
• Based on customer needs
• Expect open source contribution
A series of “NOSQL Hands-on Training” sessions will be
announced soon! Please follow @geminimobile




    Thank you

   Hibari Open Source project: http://sourceforge.net/projects/hibari/
   Hibari Twitter: @hibaridb Hashtag: #hibaridb
   Gemini Twitter: @geminimobile
   Big Data blog: http://hibari-gemini.blogspot.com/
   Slideshare: http://www.slideshare.net/geminimobile

More Related Content

Viewers also liked

QR Codes in Education
QR Codes in EducationQR Codes in Education
QR Codes in EducationAl Tucker
 
США в последней трети 19 века
США в последней трети 19 векаСША в последней трети 19 века
США в последней трети 19 векаProznanie.ru
 
внешняя политика во второй половине Xviiiв.
внешняя политика во второй половине Xviiiв.внешняя политика во второй половине Xviiiв.
внешняя политика во второй половине Xviiiв.Proznanie.ru
 
The current panama brochure
The current panama brochureThe current panama brochure
The current panama brochureTiff412
 
Ibm pure flex at cloudian seminar 2013
Ibm pure flex at cloudian seminar 2013Ibm pure flex at cloudian seminar 2013
Ibm pure flex at cloudian seminar 2013CLOUDIAN KK
 
QR Codes in the Classroom - PETE & C
QR Codes in the Classroom - PETE & CQR Codes in the Classroom - PETE & C
QR Codes in the Classroom - PETE & CAl Tucker
 
смятение умов
смятение умовсмятение умов
смятение умовProznanie.ru
 
Gdb principle
Gdb principleGdb principle
Gdb principlelibfetion
 
Agriculture renaissance 2010
Agriculture renaissance 2010Agriculture renaissance 2010
Agriculture renaissance 2010Prabhu Pingali
 
Lima David Niggli Presenting To Retail
Lima David Niggli Presenting To RetailLima David Niggli Presenting To Retail
Lima David Niggli Presenting To RetailJun Kapunan
 

Viewers also liked (17)

QR Codes in Education
QR Codes in EducationQR Codes in Education
QR Codes in Education
 
Lermontov
LermontovLermontov
Lermontov
 
Egipet
EgipetEgipet
Egipet
 
Germania
GermaniaGermania
Germania
 
США в последней трети 19 века
США в последней трети 19 векаСША в последней трети 19 века
США в последней трети 19 века
 
внешняя политика во второй половине Xviiiв.
внешняя политика во второй половине Xviiiв.внешняя политика во второй половине Xviiiв.
внешняя политика во второй половине Xviiiв.
 
Portofolio 100707
Portofolio 100707Portofolio 100707
Portofolio 100707
 
The current panama brochure
The current panama brochureThe current panama brochure
The current panama brochure
 
TheRecruitery
TheRecruiteryTheRecruitery
TheRecruitery
 
Ibm pure flex at cloudian seminar 2013
Ibm pure flex at cloudian seminar 2013Ibm pure flex at cloudian seminar 2013
Ibm pure flex at cloudian seminar 2013
 
QR Codes in the Classroom - PETE & C
QR Codes in the Classroom - PETE & CQR Codes in the Classroom - PETE & C
QR Codes in the Classroom - PETE & C
 
смятение умов
смятение умовсмятение умов
смятение умов
 
Bena brochure
Bena brochureBena brochure
Bena brochure
 
Nature Walk
Nature WalkNature Walk
Nature Walk
 
Gdb principle
Gdb principleGdb principle
Gdb principle
 
Agriculture renaissance 2010
Agriculture renaissance 2010Agriculture renaissance 2010
Agriculture renaissance 2010
 
Lima David Niggli Presenting To Retail
Lima David Niggli Presenting To RetailLima David Niggli Presenting To Retail
Lima David Niggli Presenting To Retail
 

Similar to Hibaridb: High Performance NoSQL Database

HIbari/NOSQL/Erlang for Big Data at Erlang User Conference 2010
HIbari/NOSQL/Erlang for Big Data at Erlang User Conference 2010HIbari/NOSQL/Erlang for Big Data at Erlang User Conference 2010
HIbari/NOSQL/Erlang for Big Data at Erlang User Conference 2010CLOUDIAN KK
 
2011 03-31 Riak Stockholm Meetup
2011 03-31 Riak Stockholm Meetup2011 03-31 Riak Stockholm Meetup
2011 03-31 Riak Stockholm MeetupMårten Gustafson
 
Message Queues : A Primer - International PHP Conference Fall 2012
Message Queues : A Primer - International PHP Conference Fall 2012Message Queues : A Primer - International PHP Conference Fall 2012
Message Queues : A Primer - International PHP Conference Fall 2012Mike Willbanks
 
Scaling the Container Dataplane
Scaling the Container Dataplane Scaling the Container Dataplane
Scaling the Container Dataplane Michelle Holley
 
Utah PHP Users Group - 2012
Utah PHP Users Group - 2012Utah PHP Users Group - 2012
Utah PHP Users Group - 2012Randy Secrist
 
OpenStack and OpenFlow Demos
OpenStack and OpenFlow DemosOpenStack and OpenFlow Demos
OpenStack and OpenFlow DemosBrent Salisbury
 
Architecting data center networks in the era of big data and cloud
Architecting data center networks in the era of big data and cloudArchitecting data center networks in the era of big data and cloud
Architecting data center networks in the era of big data and cloudbradhedlund
 
Lindsay distributed geventzmq
Lindsay distributed geventzmqLindsay distributed geventzmq
Lindsay distributed geventzmqRobin Xiao
 
Thoughts on consistency models
Thoughts on consistency modelsThoughts on consistency models
Thoughts on consistency modelsrogerbodamer
 
Unit 4 memory system
Unit 4   memory systemUnit 4   memory system
Unit 4 memory systemchidabdu
 
Wrapper induction construct wrappers automatically to extract information f...
Wrapper induction   construct wrappers automatically to extract information f...Wrapper induction   construct wrappers automatically to extract information f...
Wrapper induction construct wrappers automatically to extract information f...George Ang
 
Openflow勉強会 「OpenFlowコントローラを取り巻く状況とその実装」
Openflow勉強会 「OpenFlowコントローラを取り巻く状況とその実装」Openflow勉強会 「OpenFlowコントローラを取り巻く状況とその実装」
Openflow勉強会 「OpenFlowコントローラを取り巻く状況とその実装」Sho Shimizu
 
Approaches for Power Management Verification of SOC
Approaches for Power Management Verification of SOC Approaches for Power Management Verification of SOC
Approaches for Power Management Verification of SOC DVClub
 
Webinar: General Technical Overview of MongoDB
Webinar: General Technical Overview of MongoDBWebinar: General Technical Overview of MongoDB
Webinar: General Technical Overview of MongoDBMongoDB
 
Mysql(2)
Mysql(2)Mysql(2)
Mysql(2)tomcoh
 
How to Build a SaaS App With Twitter-like Throughput on Just 9 Servers
How to Build a SaaS App With Twitter-like Throughput on Just 9 ServersHow to Build a SaaS App With Twitter-like Throughput on Just 9 Servers
How to Build a SaaS App With Twitter-like Throughput on Just 9 ServersNew Relic
 

Similar to Hibaridb: High Performance NoSQL Database (20)

HIbari/NOSQL/Erlang for Big Data at Erlang User Conference 2010
HIbari/NOSQL/Erlang for Big Data at Erlang User Conference 2010HIbari/NOSQL/Erlang for Big Data at Erlang User Conference 2010
HIbari/NOSQL/Erlang for Big Data at Erlang User Conference 2010
 
Khan and morrison_dq207
Khan and morrison_dq207Khan and morrison_dq207
Khan and morrison_dq207
 
2011 03-31 Riak Stockholm Meetup
2011 03-31 Riak Stockholm Meetup2011 03-31 Riak Stockholm Meetup
2011 03-31 Riak Stockholm Meetup
 
Message Queues : A Primer - International PHP Conference Fall 2012
Message Queues : A Primer - International PHP Conference Fall 2012Message Queues : A Primer - International PHP Conference Fall 2012
Message Queues : A Primer - International PHP Conference Fall 2012
 
Scaling the Container Dataplane
Scaling the Container Dataplane Scaling the Container Dataplane
Scaling the Container Dataplane
 
Utah PHP Users Group - 2012
Utah PHP Users Group - 2012Utah PHP Users Group - 2012
Utah PHP Users Group - 2012
 
OpenStack and OpenFlow Demos
OpenStack and OpenFlow DemosOpenStack and OpenFlow Demos
OpenStack and OpenFlow Demos
 
Elastic Search
Elastic SearchElastic Search
Elastic Search
 
Architecting data center networks in the era of big data and cloud
Architecting data center networks in the era of big data and cloudArchitecting data center networks in the era of big data and cloud
Architecting data center networks in the era of big data and cloud
 
Networking concepts
Networking conceptsNetworking concepts
Networking concepts
 
Lindsay distributed geventzmq
Lindsay distributed geventzmqLindsay distributed geventzmq
Lindsay distributed geventzmq
 
Thoughts on consistency models
Thoughts on consistency modelsThoughts on consistency models
Thoughts on consistency models
 
Unit 4 memory system
Unit 4   memory systemUnit 4   memory system
Unit 4 memory system
 
Wrapper induction construct wrappers automatically to extract information f...
Wrapper induction   construct wrappers automatically to extract information f...Wrapper induction   construct wrappers automatically to extract information f...
Wrapper induction construct wrappers automatically to extract information f...
 
Openflow勉強会 「OpenFlowコントローラを取り巻く状況とその実装」
Openflow勉強会 「OpenFlowコントローラを取り巻く状況とその実装」Openflow勉強会 「OpenFlowコントローラを取り巻く状況とその実装」
Openflow勉強会 「OpenFlowコントローラを取り巻く状況とその実装」
 
Approaches for Power Management Verification of SOC
Approaches for Power Management Verification of SOC Approaches for Power Management Verification of SOC
Approaches for Power Management Verification of SOC
 
Webinar: General Technical Overview of MongoDB
Webinar: General Technical Overview of MongoDBWebinar: General Technical Overview of MongoDB
Webinar: General Technical Overview of MongoDB
 
Network & Internet Working Devices
Network & Internet Working DevicesNetwork & Internet Working Devices
Network & Internet Working Devices
 
Mysql(2)
Mysql(2)Mysql(2)
Mysql(2)
 
How to Build a SaaS App With Twitter-like Throughput on Just 9 Servers
How to Build a SaaS App With Twitter-like Throughput on Just 9 ServersHow to Build a SaaS App With Twitter-like Throughput on Just 9 Servers
How to Build a SaaS App With Twitter-like Throughput on Just 9 Servers
 

More from CLOUDIAN KK

CLOUDIAN HYPERSTORE - 風林火山ストレージ
CLOUDIAN HYPERSTORE - 風林火山ストレージCLOUDIAN HYPERSTORE - 風林火山ストレージ
CLOUDIAN HYPERSTORE - 風林火山ストレージCLOUDIAN KK
 
クラウディアンのご紹介
クラウディアンのご紹介クラウディアンのご紹介
クラウディアンのご紹介CLOUDIAN KK
 
IoT/ビッグデータ/AI連携により次世代ストレージが促進するビジネス変革
IoT/ビッグデータ/AI連携により次世代ストレージが促進するビジネス変革IoT/ビッグデータ/AI連携により次世代ストレージが促進するビジネス変革
IoT/ビッグデータ/AI連携により次世代ストレージが促進するビジネス変革CLOUDIAN KK
 
CLOUDIAN Presentation at VERITAS VISION in Tokyo
CLOUDIAN Presentation at VERITAS VISION in TokyoCLOUDIAN Presentation at VERITAS VISION in Tokyo
CLOUDIAN Presentation at VERITAS VISION in TokyoCLOUDIAN KK
 
S3 API接続検証プログラムのご紹介
S3 API接続検証プログラムのご紹介S3 API接続検証プログラムのご紹介
S3 API接続検証プログラムのご紹介CLOUDIAN KK
 
Auto tiering and Versioning of CLOUDIAN HyperStore
Auto tiering and Versioning of CLOUDIAN HyperStoreAuto tiering and Versioning of CLOUDIAN HyperStore
Auto tiering and Versioning of CLOUDIAN HyperStoreCLOUDIAN KK
 
AWS SDK for Python and CLOUDIAN HyperStore
AWS SDK for Python and CLOUDIAN HyperStoreAWS SDK for Python and CLOUDIAN HyperStore
AWS SDK for Python and CLOUDIAN HyperStoreCLOUDIAN KK
 
AWS CLI and CLOUDIAN HyperStore
AWS CLI and CLOUDIAN HyperStoreAWS CLI and CLOUDIAN HyperStore
AWS CLI and CLOUDIAN HyperStoreCLOUDIAN KK
 
ZiDOMA data and CLOUDIAN HyperStore
ZiDOMA data and CLOUDIAN HyperStoreZiDOMA data and CLOUDIAN HyperStore
ZiDOMA data and CLOUDIAN HyperStoreCLOUDIAN KK
 
FOBAS CSC and CLOUDIAN HyperStore
FOBAS CSC and CLOUDIAN HyperStoreFOBAS CSC and CLOUDIAN HyperStore
FOBAS CSC and CLOUDIAN HyperStoreCLOUDIAN KK
 
ARCserve backup and CLOUDIAN HyperStore
ARCserve backup and CLOUDIAN HyperStoreARCserve backup and CLOUDIAN HyperStore
ARCserve backup and CLOUDIAN HyperStoreCLOUDIAN KK
 
Cloudian presentation at idc japan sv2016
Cloudian presentation at idc japan sv2016Cloudian presentation at idc japan sv2016
Cloudian presentation at idc japan sv2016CLOUDIAN KK
 
ITコアを刷新するハイブリッドクラウド型ITシステム
ITコアを刷新するハイブリッドクラウド型ITシステムITコアを刷新するハイブリッドクラウド型ITシステム
ITコアを刷新するハイブリッドクラウド型ITシステムCLOUDIAN KK
 
【FOBAS】Data is money. ストレージ分散投資のススメ
【FOBAS】Data is money. ストレージ分散投資のススメ【FOBAS】Data is money. ストレージ分散投資のススメ
【FOBAS】Data is money. ストレージ分散投資のススメCLOUDIAN KK
 
【ARI】ストレージのコスト・利便性・非機能要求項目を徹底比較
【ARI】ストレージのコスト・利便性・非機能要求項目を徹底比較【ARI】ストレージのコスト・利便性・非機能要求項目を徹底比較
【ARI】ストレージのコスト・利便性・非機能要求項目を徹底比較CLOUDIAN KK
 
【SIS】オブジェクトストレージを活用した増え続ける長期保管データの運用の効率化
【SIS】オブジェクトストレージを活用した増え続ける長期保管データの運用の効率化【SIS】オブジェクトストレージを活用した増え続ける長期保管データの運用の効率化
【SIS】オブジェクトストレージを活用した増え続ける長期保管データの運用の効率化CLOUDIAN KK
 
【CLOUDIAN】コード化されたインフラの実装
【CLOUDIAN】コード化されたインフラの実装【CLOUDIAN】コード化されたインフラの実装
【CLOUDIAN】コード化されたインフラの実装CLOUDIAN KK
 
【CLOUDIAN】自動階層化による現有ストレージ活用術
【CLOUDIAN】自動階層化による現有ストレージ活用術【CLOUDIAN】自動階層化による現有ストレージ活用術
【CLOUDIAN】自動階層化による現有ストレージ活用術CLOUDIAN KK
 
【CLOUDIAN】秒間隔RPO(目標復旧時点)の実現
【CLOUDIAN】秒間隔RPO(目標復旧時点)の実現【CLOUDIAN】秒間隔RPO(目標復旧時点)の実現
【CLOUDIAN】秒間隔RPO(目標復旧時点)の実現CLOUDIAN KK
 
【Cloudian】FIT2015における会社製品紹介
【Cloudian】FIT2015における会社製品紹介【Cloudian】FIT2015における会社製品紹介
【Cloudian】FIT2015における会社製品紹介CLOUDIAN KK
 

More from CLOUDIAN KK (20)

CLOUDIAN HYPERSTORE - 風林火山ストレージ
CLOUDIAN HYPERSTORE - 風林火山ストレージCLOUDIAN HYPERSTORE - 風林火山ストレージ
CLOUDIAN HYPERSTORE - 風林火山ストレージ
 
クラウディアンのご紹介
クラウディアンのご紹介クラウディアンのご紹介
クラウディアンのご紹介
 
IoT/ビッグデータ/AI連携により次世代ストレージが促進するビジネス変革
IoT/ビッグデータ/AI連携により次世代ストレージが促進するビジネス変革IoT/ビッグデータ/AI連携により次世代ストレージが促進するビジネス変革
IoT/ビッグデータ/AI連携により次世代ストレージが促進するビジネス変革
 
CLOUDIAN Presentation at VERITAS VISION in Tokyo
CLOUDIAN Presentation at VERITAS VISION in TokyoCLOUDIAN Presentation at VERITAS VISION in Tokyo
CLOUDIAN Presentation at VERITAS VISION in Tokyo
 
S3 API接続検証プログラムのご紹介
S3 API接続検証プログラムのご紹介S3 API接続検証プログラムのご紹介
S3 API接続検証プログラムのご紹介
 
Auto tiering and Versioning of CLOUDIAN HyperStore
Auto tiering and Versioning of CLOUDIAN HyperStoreAuto tiering and Versioning of CLOUDIAN HyperStore
Auto tiering and Versioning of CLOUDIAN HyperStore
 
AWS SDK for Python and CLOUDIAN HyperStore
AWS SDK for Python and CLOUDIAN HyperStoreAWS SDK for Python and CLOUDIAN HyperStore
AWS SDK for Python and CLOUDIAN HyperStore
 
AWS CLI and CLOUDIAN HyperStore
AWS CLI and CLOUDIAN HyperStoreAWS CLI and CLOUDIAN HyperStore
AWS CLI and CLOUDIAN HyperStore
 
ZiDOMA data and CLOUDIAN HyperStore
ZiDOMA data and CLOUDIAN HyperStoreZiDOMA data and CLOUDIAN HyperStore
ZiDOMA data and CLOUDIAN HyperStore
 
FOBAS CSC and CLOUDIAN HyperStore
FOBAS CSC and CLOUDIAN HyperStoreFOBAS CSC and CLOUDIAN HyperStore
FOBAS CSC and CLOUDIAN HyperStore
 
ARCserve backup and CLOUDIAN HyperStore
ARCserve backup and CLOUDIAN HyperStoreARCserve backup and CLOUDIAN HyperStore
ARCserve backup and CLOUDIAN HyperStore
 
Cloudian presentation at idc japan sv2016
Cloudian presentation at idc japan sv2016Cloudian presentation at idc japan sv2016
Cloudian presentation at idc japan sv2016
 
ITコアを刷新するハイブリッドクラウド型ITシステム
ITコアを刷新するハイブリッドクラウド型ITシステムITコアを刷新するハイブリッドクラウド型ITシステム
ITコアを刷新するハイブリッドクラウド型ITシステム
 
【FOBAS】Data is money. ストレージ分散投資のススメ
【FOBAS】Data is money. ストレージ分散投資のススメ【FOBAS】Data is money. ストレージ分散投資のススメ
【FOBAS】Data is money. ストレージ分散投資のススメ
 
【ARI】ストレージのコスト・利便性・非機能要求項目を徹底比較
【ARI】ストレージのコスト・利便性・非機能要求項目を徹底比較【ARI】ストレージのコスト・利便性・非機能要求項目を徹底比較
【ARI】ストレージのコスト・利便性・非機能要求項目を徹底比較
 
【SIS】オブジェクトストレージを活用した増え続ける長期保管データの運用の効率化
【SIS】オブジェクトストレージを活用した増え続ける長期保管データの運用の効率化【SIS】オブジェクトストレージを活用した増え続ける長期保管データの運用の効率化
【SIS】オブジェクトストレージを活用した増え続ける長期保管データの運用の効率化
 
【CLOUDIAN】コード化されたインフラの実装
【CLOUDIAN】コード化されたインフラの実装【CLOUDIAN】コード化されたインフラの実装
【CLOUDIAN】コード化されたインフラの実装
 
【CLOUDIAN】自動階層化による現有ストレージ活用術
【CLOUDIAN】自動階層化による現有ストレージ活用術【CLOUDIAN】自動階層化による現有ストレージ活用術
【CLOUDIAN】自動階層化による現有ストレージ活用術
 
【CLOUDIAN】秒間隔RPO(目標復旧時点)の実現
【CLOUDIAN】秒間隔RPO(目標復旧時点)の実現【CLOUDIAN】秒間隔RPO(目標復旧時点)の実現
【CLOUDIAN】秒間隔RPO(目標復旧時点)の実現
 
【Cloudian】FIT2015における会社製品紹介
【Cloudian】FIT2015における会社製品紹介【Cloudian】FIT2015における会社製品紹介
【Cloudian】FIT2015における会社製品紹介
 

Hibaridb: High Performance NoSQL Database

  • 1. # hibaridb
    Hibari/NOSQL for BIGDATA
    November 2010
    Gemini Mobile Technologies
    Hibari Open Source project: http://sourceforge.net/projects/hibari/
  • 2. Introduction
    • Founded: July, 2001
    • Offices: San Francisco, Tokyo, Beijing
    • Investors:
       – Goldman Sachs, Mitsubishi-UFJ, Mizuho, Nomura, Ignite, Access, Aplix
    • Accomplishments:
       – Messaging Products
           • Provide MMSC to 3 out of 4 Carriers in Japan (DoCoMo, Softbank, eMobile)
           • Largest MMSC in the world (Softbank Japan)
           • OEM to Alcatel-Lucent and ByteMobile
       – NOSQL / Big Data
           • 2006: First Mobile 3D SNS (Softbank, China Unicom, iPhone App)
           • 4/2010: WebMail, Japanese Mobile Carrier & Internet Provider
           • 7/2010: Hibari Open Source
  • 4. Hibari (= Cloud Birds)
  • 5. What is Hibari?
    • Hibari is a production-ready, distributed, key-value, big data store.
       – China Mobile and China Unicom - SNS
       – Japanese internet provider - GB mailbox webmail
       – Japanese mobile carrier - GB mailbox webmail
    • Hibari uses chain replication for strong consistency, high availability, and durability.
    • Hibari has excellent performance, especially for read and large value operations.
    • Hibari is open-source software under the Apache 2.0 license.
  • 6. Environments
    • Hibari runs on commodity, heterogeneous servers.
    • Hibari supports Red Hat, CentOS, and Fedora Linux distributions.
       – Debian, Ubuntu, Gentoo, Mac OS X, and FreeBSD are coming soon.
    • Hibari supports Erlang/OTP R13B04.
       – R14B is coming soon.
    • Hibari supports Amazon S3, JSON-RPC-RFC4627, UBF/EBF/JSF and native Erlang client APIs.
       – Thrift is coming soon.
  • 7. Why NOSQL?
    We needed to build a scalable, high-performance web mail system.
    • Big Data
       – Several million end users from the start
       – Several billion messages in a few months
       – Hundreds of TB of data
    • Low Cost requirements
       – Customer’s business model (Freemium)
       – Distributed across >50 PC servers
       – No need for the rich and expensive functions of SQL
    • Continuous growth of data in the storage
       – Elasticity to expand capacity as data increases
  • 8. What were the customer’s needs?
    • Durability
       – Data loss (e.g., messages, metadata) is not acceptable
    • Strong Consistency
       – Because of interactive sessions, consistency is required
    • Low Latency
       – <1 sec response time for end user transactions
    • High Availability
       – As a branded service to the end user, service must always be available
    • Read Heavy
       – Many more read than write operations
    • Big Data and highly variable data size
       – Large GB mailbox as service differentiator was required
       – Mail messages range from a few bytes to many MB with attachments
  • 9. How does Hibari address these needs?
    • Durable updates
       – Every update is written and flushed to stable storage (fsync() system call) before an acknowledgment is sent to the client.
    • Consistent updates
       – After an update is acknowledged, no client can see an older version. “Chain Replication” is used to maintain consistency across all replicas.
    • High Availability
       – Each key can be replicated multiple times. As long as one copy of the key survives, all operations on that key are permitted.
    • Lockless API
       – No client operation requires a lock. Optionally, Hibari supports “test-and-set” of each key-value pair via a timestamp value that the server forces to increase.
    • Micro-transactions
       – Under limited circumstances, operations on multiple keys can be given transactional commit/abort semantics.
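The durable-update and test-and-set ideas on this slide can be illustrated with a small single-node sketch in Python. This is not Hibari's API or implementation: the class `TinyStore`, the exception `TimestampMismatch`, the log format, and the `expected_ts` parameter are all hypothetical names chosen for illustration. The sketch assumes a put is acknowledged only after `os.fsync` returns, and that a conditional put succeeds only when the caller names the timestamp currently stored for the key.

```python
import os
import tempfile
import time


class TimestampMismatch(Exception):
    """Raised when a conditional put names a timestamp that is no longer current."""


class TinyStore:
    """Hypothetical single-node store used only to illustrate the slide."""

    def __init__(self, log_path):
        self._log = open(log_path, "ab")
        self._data = {}      # key -> (timestamp, value)
        self._last_ts = 0

    def _next_ts(self):
        # The server, not the client, chooses timestamps and keeps them strictly increasing.
        self._last_ts = max(self._last_ts + 1, time.time_ns() // 1000)
        return self._last_ts

    def get(self, key):
        return self._data[key]

    def put(self, key, value, expected_ts=None):
        # Test-and-set: reject the update if the caller's view of the key is stale.
        if expected_ts is not None:
            current = self._data.get(key)
            if current is None or current[0] != expected_ts:
                raise TimestampMismatch(key)
        ts = self._next_ts()
        self._log.write(f"{ts}\t{key}\t{value.hex()}\n".encode())
        self._log.flush()
        os.fsync(self._log.fileno())   # on stable storage before we acknowledge
        self._data[key] = (ts, value)
        return ts                      # the acknowledgment carries the new timestamp

    def close(self):
        self._log.close()


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        store = TinyStore(os.path.join(tmp, "commit.log"))
        ts1 = store.put("inbox/42", b"hello")
        ts2 = store.put("inbox/42", b"hello again", expected_ts=ts1)
        try:
            store.put("inbox/42", b"lost update", expected_ts=ts1)
            stale_rejected = False
        except TimestampMismatch:
            stale_rejected = True
        final = store.get("inbox/42")
        store.close()
        return ts1, ts2, stale_rejected, final
```

In `demo()`, the third put reuses the timestamp from the first write, so it is rejected instead of silently overwriting the second write. That is how a timestamp check can stand in for a lock.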
  • 10. Chain Replication for Strong Consistency
    [Figure: Key-value data model. A key-value table maps Key 1 … Key n to values; each value is a BLOB (Binary Large OBject). Consistent hashing directs each key to Chain A or Chain B. Both chains span PC 1, PC 2, and PC 3: Chain A is laid out as Head, Middle, Tail and Chain B as Tail, Head, Middle. Arrow labels: Write Requests, Read Requests, All Replies.]
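A minimal Python sketch of the two mechanisms named on this slide, consistent hashing to pick a chain and chain replication inside it. It follows the textbook form of chain replication (writes enter at the head, reads are answered by the tail) and is not taken from Hibari's source. The class names `Chain` and `Ring`, the use of MD5, and the `points_per_chain` parameter are assumptions made for illustration, and the replicas are plain in-process dictionaries rather than separate servers.

```python
import bisect
import hashlib


class Chain:
    """One replication chain; replicas[0] is the head, replicas[-1] is the tail."""

    def __init__(self, name, replica_count=3):
        self.name = name
        self.replicas = [{} for _ in range(replica_count)]

    def put(self, key, value):
        # A write enters at the head and is applied in order down the chain.
        # The caller is acknowledged only after the tail has applied it.
        for replica in self.replicas:
            replica[key] = value
        return "ok"

    def get(self, key):
        # Reads go to the tail, which only ever holds fully replicated updates.
        return self.replicas[-1].get(key)


class Ring:
    """Consistent-hash ring that maps each key to one chain."""

    def __init__(self, chains, points_per_chain=64):
        points = []
        for chain in chains:
            for i in range(points_per_chain):
                points.append((self._hash(f"{chain.name}#{i}"), chain))
        points.sort(key=lambda point: point[0])
        self._points = points
        self._hashes = [h for h, _ in points]

    @staticmethod
    def _hash(text):
        return int.from_bytes(hashlib.md5(text.encode()).digest()[:8], "big")

    def chain_for(self, key):
        # First ring point clockwise from the key's hash, wrapping at the end.
        idx = bisect.bisect_right(self._hashes, self._hash(key)) % len(self._points)
        return self._points[idx][1]
```

Usage: `ring = Ring([Chain("A"), Chain("B")])`, then `ring.chain_for(key).put(key, value)` for writes and `ring.chain_for(key).get(key)` for reads. Because an acknowledged write has already reached the tail, a later read from the tail cannot return an older version.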
  • 11. Concept of “Chain”
    Evenly distributed load in multiple nodes (case of 3 replications / 6 chains)

               PC 1    PC 2    PC 3    PC 4    PC 5    PC 6
    Chain A    Head    Middle  Tail
    Chain B    Tail    Head    Middle
    Chain C    Middle  Tail    Head
    Chain D                            Head    Middle  Tail
    Chain E                            Tail    Head    Middle
    Chain F                            Middle  Tail    Head
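The layout on this slide can be generated by rotating the Head, Middle, and Tail roles within each group of three machines. The Python sketch below reproduces that pattern. The function names `build_layout` and `roles_per_node` are hypothetical, and the rotation rule is inferred from the slide's diagram rather than from Hibari's configuration tools.

```python
import string
from collections import Counter

ROLES = ("Head", "Middle", "Tail")
REPLICAS = len(ROLES)


def build_layout(nodes):
    """Return {chain name: {node: role}} for groups of three nodes."""
    if len(nodes) % REPLICAS:
        raise ValueError("node count must be a multiple of 3")
    layout = {}
    for group_index in range(len(nodes) // REPLICAS):
        group = nodes[group_index * REPLICAS:(group_index + 1) * REPLICAS]
        for rotation in range(REPLICAS):
            name = "Chain " + string.ascii_uppercase[group_index * REPLICAS + rotation]
            # Shift the roles one machine to the right for each successive chain.
            layout[name] = {
                node: ROLES[(position - rotation) % REPLICAS]
                for position, node in enumerate(group)
            }
    return layout


def roles_per_node(layout):
    """Count how many times each node plays each role across all chains."""
    counts = {}
    for placement in layout.values():
        for node, role in placement.items():
            counts.setdefault(node, Counter())[role] += 1
    return counts
```

With `nodes = ["PC 1", ..., "PC 6"]`, every machine ends up as head of one chain, middle of another, and tail of a third, so no single machine carries all the write-entry or read traffic.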
  • 12. Chain Replication for High Availability and Fault Tolerance
    Failover mechanism

    Before (PC 1, PC 2, PC 3):
       Head    Middle  Tail
       Tail    Head    Middle
       Middle  Tail    Head

    After node down (service continuation), surviving nodes of each chain:
       Head    Tail
       Tail    Head
       Tail    Head
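The failover shown here amounts to dropping the failed node from each chain's ordered member list and relabelling the survivors. A minimal Python sketch under that assumption follows. The helper names `remove_failed` and `assign_roles` are hypothetical, and the "Head and Tail" label for a single survivor is an illustrative choice rather than Hibari terminology. A real system would also need to detect the failure and later re-add the repaired node, which is not shown.

```python
def remove_failed(chain, failed_node):
    """Return the chain's members in order, without the failed node."""
    return [node for node in chain if node != failed_node]


def assign_roles(live_nodes):
    """Label an ordered list of live nodes with chain roles."""
    if not live_nodes:
        raise ValueError("chain has no surviving replica")
    if len(live_nodes) == 1:
        # A lone survivor accepts writes and serves reads by itself.
        return {live_nodes[0]: "Head and Tail"}
    roles = {node: "Middle" for node in live_nodes}
    roles[live_nodes[0]] = "Head"
    roles[live_nodes[-1]] = "Tail"
    return roles
```

For a chain ordered `["PC 1", "PC 2", "PC 3"]`, losing "PC 2" leaves "PC 1" as Head and "PC 3" as Tail, and the chain keeps serving requests with two replicas.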
  • 13. Hibari Benchmarking
    Hibari’s benchmarking has been done primarily with 3 approaches: micro benchmarks, standalone load client, and end-to-end as part of larger systems.
    • In-house tools
       – C/C++ and Erlang based implementations
       – Good for Gemini’s internal use but not ideal for the open source community
    • Yahoo! Cloud Serving Benchmark (YCSB)
       – New, easy to use Java-based load tool
       – Java implementation for Cassandra db driver and new Hibari db driver
       – But … investigating latency issues with Hibari’s YCSB db driver
    • Basho Bench (BB)
       – Simple, easy to use Erlang-based load tool
       – Erlang implementation for Cassandra db driver and new Hibari db driver
       – But … traffic scenarios are not as full featured as YCSB, and investigating stability issues with Cassandra’s BB db driver
  • 14. Test conditions
    • Tool: Basho Bench - Hibari db driver and Cassandra db driver
    • Keys: 1 … 1,000,000, Truncated Pareto (20% of keys, 80% of time)
    • Values: 10K, 50K, 100K, and 150K
    • Operations:
       – 20 minutes Random Put
       – 40 minutes Random Get/Put (50%/50%)
    • Replication factor: 2
    • Disk sync of commit log: enabled
    • Storage: Keys RAM+DISK, Values DISK only
    • Hardware
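To make the key skew and operation mix concrete, here is a small Python sketch of a workload generator with the same headline numbers. It is deliberately simpler than the generator the slide names: instead of a truncated Pareto distribution, it uses an explicit hot set, sending 80% of requests to the first 20% of the key range. The names `next_key` and `next_operation` and the phase labels `"put_only"` and `"mixed"` are hypothetical, and nothing here reproduces Basho Bench's own configuration.

```python
import random

KEY_COUNT = 1_000_000
HOT_KEYS = KEY_COUNT // 5      # 20% of the keys
HOT_PROBABILITY = 0.8          # receive 80% of the requests


def next_key(rng):
    """Pick a key in 1..KEY_COUNT with an 80/20 hot-set skew."""
    if rng.random() < HOT_PROBABILITY:
        return rng.randint(1, HOT_KEYS)
    return rng.randint(HOT_KEYS + 1, KEY_COUNT)


def next_operation(rng, phase):
    """Pick the operation for one request in the given test phase."""
    if phase == "put_only":
        return "put"
    if phase == "mixed":
        # 50% get, 50% put
        return "get" if rng.random() < 0.5 else "put"
    raise ValueError(f"unknown phase: {phase}")
```

Passing a seeded `random.Random` instance as `rng` makes a run repeatable, which helps when comparing two stores under the same request sequence.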
  • 16. “Hibari Stub” results
    [Figure: two panels, Hibari Stub “get” and Hibari Stub “put”. Y axis: average latency (msec), 0 to 100. X axis: seconds elapsed, 0 to 2000. Series per panel: 10K, 50K, 100K, and 150K values.]
  • 17. “Hibari and Cassandra” results by Basho Bench tool (Apache Cassandra 0.6.5)
    [Figure: four panels, Cassandra “get”, Hibari “get”, Cassandra “put”, and Hibari “put”. Y axis: average latency (msec), 0 to 100. X axis: seconds elapsed, 0 to 2000. Series per panel: 10K, 50K, 100K, and 150K values.]
  • 18. Language (& Priority) Barriers
    • Latency issues with Java load tool speaking with an Erlang NOSQL server
    • Stability issues with Erlang load tool speaking with a Java NOSQL server
    Further investigation is required to identify issues and areas of improvement. Our primary targets are Hibari, Cassandra, and the YCSB tool.
  • 19. “Applications / Customer first” approach leads to mutual complements

    FEATURE (e.g.)                     Cassandra                                              Hibari
    Data model                         Column-oriented                                        Key-value
    Data partitioning                  Consistent hashing                                     Consistent hashing
    Data consistency                   Configurable                                           Yes
    Data replication                   Preference lists                                       Chain replication
    Elasticity / Data redistribution   Admin operations, Gossip                               Chain migration
    Node health detection              Peer-to-peer monitor, gossip                           Admin server monitors
    O&M                                nodetool, JMX                                          Admin UI with brick, chain health, statistics
    Performance                        write-optimized                                        read-optimized
    Implementation                     Java                                                   Erlang
    API                                get/put/delete, scan, map/reduce, atomic row ops       get/put/delete, micro-transactions
  • 20. “Not Only Hibari” Roadmap
    [Roadmap chart, 2010 ~ 2011~, for Hibari Open Source. Items shown, grouped by row:]
    • Erlang: 14A, 14B
    • OS: Red Hat Enterprise, Cent OS, Fedora, Ubuntu, Gentoo, Debian, MacOS X, Free BSD
    • Integration: Cassandra, Thrift, Hadoop Map/Reduce, Pig
    • Client API: UBF, Erlang, Amazon S3, JSON-RPC-RFC4627, C++, Java, Python, Ruby, Perl, Google Protocol Buffers
    • Application tools: YCSB, Basho Bench, Monitoring tool, Configuration tool, Management tool
    • Based on customer needs
    • Expect Open source contribution
  • 21. Series of “NOSQL Hands-on Training” will be announced soon! Please follow @geminimobile
    Thank you
    • Hibari Open Source project: http://sourceforge.net/projects/hibari/
    • Hibari Twitter: @hibaridb
    • Hashtag: #hibaridb
    • Gemini Twitter: @geminimobile
    • Big Data blog: http://hibari-gemini.blogspot.com/
    • Slideshare: http://www.slideshare.net/geminimobile