SlideShare una empresa de Scribd logo
1 de 26
Zing Database – Distributed Key-Value Database Nguyễn Quang Nam Zing Web-Technical Team
Content Why Introduction Overview architecture 1 3 2 Single Server/Storage 4 Distribution 5
Introduction
Some statistics: - Feeds: 1.6 B, 700 GB hard drive in 4 DB instances, 8 caching servers, 136 GB memory cache in used. - User Profiles: 44.5 M registered accounts, 2 database instances, 30 GB memory cache. - Comments: 350 M, 50 GB hard drive in 2 DB instances, 20 GB memory cache
Why
Access time L1 cache reference 0.5 ns Branch mispredict 5 ns L2 cache reference 7 ns Mutex lock/unlock 100 ns Main memory reference 100 ns Compress 1K bytes with Zippy 10,000 ns Send 2K bytes over 1 Gbps network 20,000 ns Read 1 MB sequentially from memory 250,000 ns Round trip within same datacenter 500,000 ns Disk seek 10,000,000 ns Read 1 MB sequentially from network 10,000,000 ns Read 1 MB sequentially from disk 30,000,000 ns Send packet CA->Netherlands->CA 150,000,000 ns by Jeff Dean (http://labs.google.com/people/jeff)
Standard & Real Requirement - Time to load a page < 200 ms - Read data rate ~12K ops/sec - Write data rate ~8K ops/sec - Caching service/Database recovery time < 5 mins
Existent thing - RDBMS (MySQL, MSSQL): Write: too slow; Read: so so with a small DB, too bad with a huge DB - Cassandra (by Facebook): difficult to do operation/maintain, and performance is not so good - HBase/Hadoop: We use this for log system - MongoDB, Membase, Tokyo Tyrant, .. : OK! we use these in several cases, but not suitable for all
Overview architecture
 
Server/Storage
ZNonblockingServer - Based on TNonblockingServer (Apache Thrift) - 185K reqs/sec (original TNonblockingServer is just 45K reqs/sec) - Serialize/Deserialize data - Prevent overload server - Data is not secured while transferring - Protect service from invalid requests
ICache - Least Recently Used/Time based expiration strategy - zlru_table<key_type, value_type>: hash table data structure - Re-write malloc/free functions instead of using standard malloc/free in glibc to reduce memory fragment - Support dirty-items marking => for lazy DB flush
ZiDB - Separate into DataFile & IndexFile - 1 seek for a read, 1-2 seeks for a write - IndexFile (hash structure) is loaded onto memory as a mapping file (shared memory) to reduce system call - Write-ahead log to avoid data loss - Data magic-padding - Checksum & checkpoint for repair data - Partitioning DB for easier maintenance
Distribution
Key requirements: - Scalability - Load balance - Availability - Consistency
2 Models: - Centralized: 1 addressing server & multiple storage servers => bottleneck & single-point-of-failure - Peer-peer: Each server includes addressing module & storage 2 Types of routing: - Client routing: Each client itself does the addressing and query data  - Server routing: The addressing is done at server
Operation Flows * Addressing module is moved into each storage node in Peer-peer model  Business Logic Server Addressing Server (DHT) Storage Layer Storage Node 1 ICache ZiDB Storage Module Storage Node N ICache ZiDB Storage Module … (1)  Request key locations (2) Key locations (3) Get & Set  operations (4) Operation  returns
Addressing: - Provide key locations of resources - Basically a Distributed Hash Table, using consistent hashing - Hashing: Jenkins, Murmur, or any algorithm that satisfies two conditions:   - Uniform distribution of generated keys in the key space   - Consistency (MD5, SHA are bad choice since performance)
Addressing - Node location: Each node is assigned a continuous range of IDs (hashed key)
Addressing - Node location: Golden ratio principle (a/b = 2b/a) - Init ratio = 1.618 - Max ratio ~ 2.6 - Easy to implement - Easy for routing from client 2 3 4 5 1
Server 1: 1,2,3 Server 2: 4,5,6,7 Server 3: 8,9 1 4 7 3 6 2 5 8 9 Addressing - Node location: Virtual nodes - Each real server has multiple virtual nodes on ring - More virtual nodes, more balance of load - Hard to maintain table of nodes
A A A B B C Addressing – Multi-layer rings - Store the change history of system  - Provide availability/reconfigurability - Able to put a node on ring manually * Write: data is located on the highest ring * Read: data is located on the highest ring, then lower rings if not found
Replication & Backup  - Each node has one primary range of IDs, and Some secondary range of IDs - Each real node need a backup instance to replace in case  it’s down * Data is queried from primary node, then secondary nodes
Configuration: to find the best parameters to configure DB or to choose the suitable DB type.  - How many read/write per second? - Length Deviation of data: data length is same same or much different each others,  - Has updation/deletion data?  - How important of data: acceptable loss or not - The old data can be recycled?
Q & A Contact: Nguyễn Quang Nam [email_address] http://me.zing.vn/nam.nq

Más contenido relacionado

La actualidad más candente

How Facebook actually works????
How Facebook actually works????How Facebook actually works????
How Facebook actually works????Dhruv Patel
 
Redis : Database, cache, pub/sub and more at Jelly button games
Redis : Database, cache, pub/sub and more at Jelly button gamesRedis : Database, cache, pub/sub and more at Jelly button games
Redis : Database, cache, pub/sub and more at Jelly button gamesRedis Labs
 
OGDC Datastorage Solution_Mr.Dung, Dinh Nguyen Anh
OGDC Datastorage Solution_Mr.Dung, Dinh Nguyen AnhOGDC Datastorage Solution_Mr.Dung, Dinh Nguyen Anh
OGDC Datastorage Solution_Mr.Dung, Dinh Nguyen AnhBuff Nguyen
 
10 domino integration
10   domino integration10   domino integration
10 domino integrationdarwinodb
 
Newsql 2015-150213024325-conversion-gate01
Newsql 2015-150213024325-conversion-gate01Newsql 2015-150213024325-conversion-gate01
Newsql 2015-150213024325-conversion-gate01Jagadeesha DG
 
High Performance - Joomla!Days NL 2009 #jd09nl
High Performance - Joomla!Days NL 2009 #jd09nlHigh Performance - Joomla!Days NL 2009 #jd09nl
High Performance - Joomla!Days NL 2009 #jd09nlJoomla!Days Netherlands
 
Zarafa SummerCamp 2012 - Steve Hardy Friday Keynote
Zarafa SummerCamp 2012 - Steve Hardy Friday KeynoteZarafa SummerCamp 2012 - Steve Hardy Friday Keynote
Zarafa SummerCamp 2012 - Steve Hardy Friday KeynoteZarafa
 
Operationalizing MongoDB at AOL
Operationalizing MongoDB at AOLOperationalizing MongoDB at AOL
Operationalizing MongoDB at AOLradiocats
 
WordCamp RVA 2011 - Performance & Tuning
WordCamp RVA 2011 - Performance & TuningWordCamp RVA 2011 - Performance & Tuning
WordCamp RVA 2011 - Performance & TuningTimothy Wood
 
Modern Distributed Messaging and RPC
Modern Distributed Messaging and RPCModern Distributed Messaging and RPC
Modern Distributed Messaging and RPCMax Alexejev
 
[WSO2Con EU 2017] Ballerina: Exploring Data Integration
[WSO2Con EU 2017] Ballerina: Exploring Data Integration[WSO2Con EU 2017] Ballerina: Exploring Data Integration
[WSO2Con EU 2017] Ballerina: Exploring Data IntegrationWSO2
 
Introduction to Apache BookKeeper Distributed Storage
Introduction to Apache BookKeeper Distributed StorageIntroduction to Apache BookKeeper Distributed Storage
Introduction to Apache BookKeeper Distributed StorageStreamlio
 
When to Use MongoDB...and When You Should Not...
When to Use MongoDB...and When You Should Not...When to Use MongoDB...and When You Should Not...
When to Use MongoDB...and When You Should Not...MongoDB
 
Zarafa SummerCamp 2012 - Exchange Web Services, technical information
Zarafa SummerCamp 2012 - Exchange Web Services, technical informationZarafa SummerCamp 2012 - Exchange Web Services, technical information
Zarafa SummerCamp 2012 - Exchange Web Services, technical informationZarafa
 
MongoDB vs Mysql. A devops point of view
MongoDB vs Mysql. A devops point of viewMongoDB vs Mysql. A devops point of view
MongoDB vs Mysql. A devops point of viewPierre Baillet
 
Optimising for Performance
Optimising for PerformanceOptimising for Performance
Optimising for Performancethomas_mb
 

La actualidad más candente (19)

How Facebook actually works????
How Facebook actually works????How Facebook actually works????
How Facebook actually works????
 
Redis : Database, cache, pub/sub and more at Jelly button games
Redis : Database, cache, pub/sub and more at Jelly button gamesRedis : Database, cache, pub/sub and more at Jelly button games
Redis : Database, cache, pub/sub and more at Jelly button games
 
OGDC Datastorage Solution_Mr.Dung, Dinh Nguyen Anh
OGDC Datastorage Solution_Mr.Dung, Dinh Nguyen AnhOGDC Datastorage Solution_Mr.Dung, Dinh Nguyen Anh
OGDC Datastorage Solution_Mr.Dung, Dinh Nguyen Anh
 
10 domino integration
10   domino integration10   domino integration
10 domino integration
 
Microsoft Web Technology Stack
Microsoft Web Technology StackMicrosoft Web Technology Stack
Microsoft Web Technology Stack
 
Newsql 2015-150213024325-conversion-gate01
Newsql 2015-150213024325-conversion-gate01Newsql 2015-150213024325-conversion-gate01
Newsql 2015-150213024325-conversion-gate01
 
High Performance - Joomla!Days NL 2009 #jd09nl
High Performance - Joomla!Days NL 2009 #jd09nlHigh Performance - Joomla!Days NL 2009 #jd09nl
High Performance - Joomla!Days NL 2009 #jd09nl
 
Zarafa SummerCamp 2012 - Steve Hardy Friday Keynote
Zarafa SummerCamp 2012 - Steve Hardy Friday KeynoteZarafa SummerCamp 2012 - Steve Hardy Friday Keynote
Zarafa SummerCamp 2012 - Steve Hardy Friday Keynote
 
Operationalizing MongoDB at AOL
Operationalizing MongoDB at AOLOperationalizing MongoDB at AOL
Operationalizing MongoDB at AOL
 
WordCamp RVA 2011 - Performance & Tuning
WordCamp RVA 2011 - Performance & TuningWordCamp RVA 2011 - Performance & Tuning
WordCamp RVA 2011 - Performance & Tuning
 
Modern Distributed Messaging and RPC
Modern Distributed Messaging and RPCModern Distributed Messaging and RPC
Modern Distributed Messaging and RPC
 
[WSO2Con EU 2017] Ballerina: Exploring Data Integration
[WSO2Con EU 2017] Ballerina: Exploring Data Integration[WSO2Con EU 2017] Ballerina: Exploring Data Integration
[WSO2Con EU 2017] Ballerina: Exploring Data Integration
 
Introduction to Apache BookKeeper Distributed Storage
Introduction to Apache BookKeeper Distributed StorageIntroduction to Apache BookKeeper Distributed Storage
Introduction to Apache BookKeeper Distributed Storage
 
Ui perf
Ui perfUi perf
Ui perf
 
When to Use MongoDB...and When You Should Not...
When to Use MongoDB...and When You Should Not...When to Use MongoDB...and When You Should Not...
When to Use MongoDB...and When You Should Not...
 
Load balancing at tuenti
Load balancing at tuentiLoad balancing at tuenti
Load balancing at tuenti
 
Zarafa SummerCamp 2012 - Exchange Web Services, technical information
Zarafa SummerCamp 2012 - Exchange Web Services, technical informationZarafa SummerCamp 2012 - Exchange Web Services, technical information
Zarafa SummerCamp 2012 - Exchange Web Services, technical information
 
MongoDB vs Mysql. A devops point of view
MongoDB vs Mysql. A devops point of viewMongoDB vs Mysql. A devops point of view
MongoDB vs Mysql. A devops point of view
 
Optimising for Performance
Optimising for PerformanceOptimising for Performance
Optimising for Performance
 

Destacado

Design a scalable social network: Problems and Solutions
Design a scalable social network: Problems and SolutionsDesign a scalable social network: Problems and Solutions
Design a scalable social network: Problems and SolutionsChau Thanh
 
IoT and developer chances
IoT and developer chancesIoT and developer chances
IoT and developer chancesChau Thanh
 
Buiding and Deploying SaaS with WSO2 as as-a-Service
Buiding and Deploying SaaS with WSO2 as as-a-ServiceBuiding and Deploying SaaS with WSO2 as as-a-Service
Buiding and Deploying SaaS with WSO2 as as-a-ServiceWSO2
 
Memcached vs redis
Memcached vs redisMemcached vs redis
Memcached vs redisqianshi
 
Design a scalable site: Problem and solutions
Design a scalable site: Problem and solutionsDesign a scalable site: Problem and solutions
Design a scalable site: Problem and solutionsChau Thanh
 
Sơ lược kiến trúc hệ thống Zing Me
Sơ lược kiến trúc hệ thống Zing MeSơ lược kiến trúc hệ thống Zing Me
Sơ lược kiến trúc hệ thống Zing Mezingopen
 
Building ZingMe News Feed System
Building ZingMe News Feed SystemBuilding ZingMe News Feed System
Building ZingMe News Feed SystemChau Thanh
 
Design a scalable social network: Problems and solutions
Design a scalable social network: Problems and solutionsDesign a scalable social network: Problems and solutions
Design a scalable social network: Problems and solutionsChau Thanh
 
Architecture Patterns - Open Discussion
Architecture Patterns - Open DiscussionArchitecture Patterns - Open Discussion
Architecture Patterns - Open DiscussionNguyen Tung
 
Zingme practice for building scalable website with PHP
Zingme practice for building scalable website with PHPZingme practice for building scalable website with PHP
Zingme practice for building scalable website with PHPChau Thanh
 
SaaS Introduction-May2014
SaaS Introduction-May2014SaaS Introduction-May2014
SaaS Introduction-May2014Nguyen Tung
 
Microservice Architecture
Microservice ArchitectureMicroservice Architecture
Microservice ArchitectureNguyen Tung
 
7 Stages of Scaling Web Applications
7 Stages of Scaling Web Applications7 Stages of Scaling Web Applications
7 Stages of Scaling Web ApplicationsDavid Mitzenmacher
 
facebook architecture for 600M users
facebook architecture for 600M usersfacebook architecture for 600M users
facebook architecture for 600M usersJongyoon Choi
 

Destacado (15)

Big data
Big dataBig data
Big data
 
Design a scalable social network: Problems and Solutions
Design a scalable social network: Problems and SolutionsDesign a scalable social network: Problems and Solutions
Design a scalable social network: Problems and Solutions
 
IoT and developer chances
IoT and developer chancesIoT and developer chances
IoT and developer chances
 
Buiding and Deploying SaaS with WSO2 as as-a-Service
Buiding and Deploying SaaS with WSO2 as as-a-ServiceBuiding and Deploying SaaS with WSO2 as as-a-Service
Buiding and Deploying SaaS with WSO2 as as-a-Service
 
Memcached vs redis
Memcached vs redisMemcached vs redis
Memcached vs redis
 
Design a scalable site: Problem and solutions
Design a scalable site: Problem and solutionsDesign a scalable site: Problem and solutions
Design a scalable site: Problem and solutions
 
Sơ lược kiến trúc hệ thống Zing Me
Sơ lược kiến trúc hệ thống Zing MeSơ lược kiến trúc hệ thống Zing Me
Sơ lược kiến trúc hệ thống Zing Me
 
Building ZingMe News Feed System
Building ZingMe News Feed SystemBuilding ZingMe News Feed System
Building ZingMe News Feed System
 
Design a scalable social network: Problems and solutions
Design a scalable social network: Problems and solutionsDesign a scalable social network: Problems and solutions
Design a scalable social network: Problems and solutions
 
Architecture Patterns - Open Discussion
Architecture Patterns - Open DiscussionArchitecture Patterns - Open Discussion
Architecture Patterns - Open Discussion
 
Zingme practice for building scalable website with PHP
Zingme practice for building scalable website with PHPZingme practice for building scalable website with PHP
Zingme practice for building scalable website with PHP
 
SaaS Introduction-May2014
SaaS Introduction-May2014SaaS Introduction-May2014
SaaS Introduction-May2014
 
Microservice Architecture
Microservice ArchitectureMicroservice Architecture
Microservice Architecture
 
7 Stages of Scaling Web Applications
7 Stages of Scaling Web Applications7 Stages of Scaling Web Applications
7 Stages of Scaling Web Applications
 
facebook architecture for 600M users
facebook architecture for 600M usersfacebook architecture for 600M users
facebook architecture for 600M users
 

Similar a Zing Database – Distributed Key-Value Database

Understanding and building big data Architectures - NoSQL
Understanding and building big data Architectures - NoSQLUnderstanding and building big data Architectures - NoSQL
Understanding and building big data Architectures - NoSQLHyderabad Scalability Meetup
 
In-Memory Data Grids - Ampool (1)
In-Memory Data Grids - Ampool (1)In-Memory Data Grids - Ampool (1)
In-Memory Data Grids - Ampool (1)Chinmay Kulkarni
 
Elasticsearch for Logs & Metrics - a deep dive
Elasticsearch for Logs & Metrics - a deep diveElasticsearch for Logs & Metrics - a deep dive
Elasticsearch for Logs & Metrics - a deep diveSematext Group, Inc.
 
Hadoop Architecture_Cluster_Cap_Plan
Hadoop Architecture_Cluster_Cap_PlanHadoop Architecture_Cluster_Cap_Plan
Hadoop Architecture_Cluster_Cap_PlanNarayana B
 
Hardware Provisioning
Hardware ProvisioningHardware Provisioning
Hardware ProvisioningMongoDB
 
Managing Security At 1M Events a Second using Elasticsearch
Managing Security At 1M Events a Second using ElasticsearchManaging Security At 1M Events a Second using Elasticsearch
Managing Security At 1M Events a Second using ElasticsearchJoe Alex
 
Application Caching: The Hidden Microservice
Application Caching: The Hidden MicroserviceApplication Caching: The Hidden Microservice
Application Caching: The Hidden MicroserviceScott Mansfield
 
Caching Methodology & Strategies
Caching Methodology & StrategiesCaching Methodology & Strategies
Caching Methodology & StrategiesTiệp Vũ
 
Caching methodology and strategies
Caching methodology and strategiesCaching methodology and strategies
Caching methodology and strategiesTiep Vu
 
EVCache: Lowering Costs for a Low Latency Cache with RocksDB
EVCache: Lowering Costs for a Low Latency Cache with RocksDBEVCache: Lowering Costs for a Low Latency Cache with RocksDB
EVCache: Lowering Costs for a Low Latency Cache with RocksDBScott Mansfield
 
(ATS6-PLAT06) Maximizing AEP Performance
(ATS6-PLAT06) Maximizing AEP Performance(ATS6-PLAT06) Maximizing AEP Performance
(ATS6-PLAT06) Maximizing AEP PerformanceBIOVIA
 
Data has a better idea the in-memory data grid
Data has a better idea   the in-memory data gridData has a better idea   the in-memory data grid
Data has a better idea the in-memory data gridBogdan Dina
 
Using galera replication to create geo distributed clusters on the wan
Using galera replication to create geo distributed clusters on the wanUsing galera replication to create geo distributed clusters on the wan
Using galera replication to create geo distributed clusters on the wanSakari Keskitalo
 
Using galera replication to create geo distributed clusters on the wan
Using galera replication to create geo distributed clusters on the wanUsing galera replication to create geo distributed clusters on the wan
Using galera replication to create geo distributed clusters on the wanSakari Keskitalo
 
Centralized log-management-with-elastic-stack
Centralized log-management-with-elastic-stackCentralized log-management-with-elastic-stack
Centralized log-management-with-elastic-stackRich Lee
 
Taking Splunk to the Next Level - Architecture Breakout Session
Taking Splunk to the Next Level - Architecture Breakout SessionTaking Splunk to the Next Level - Architecture Breakout Session
Taking Splunk to the Next Level - Architecture Breakout SessionSplunk
 

Similar a Zing Database – Distributed Key-Value Database (20)

Understanding and building big data Architectures - NoSQL
Understanding and building big data Architectures - NoSQLUnderstanding and building big data Architectures - NoSQL
Understanding and building big data Architectures - NoSQL
 
In-Memory Data Grids - Ampool (1)
In-Memory Data Grids - Ampool (1)In-Memory Data Grids - Ampool (1)
In-Memory Data Grids - Ampool (1)
 
Elasticsearch for Logs & Metrics - a deep dive
Elasticsearch for Logs & Metrics - a deep diveElasticsearch for Logs & Metrics - a deep dive
Elasticsearch for Logs & Metrics - a deep dive
 
Hadoop Architecture_Cluster_Cap_Plan
Hadoop Architecture_Cluster_Cap_PlanHadoop Architecture_Cluster_Cap_Plan
Hadoop Architecture_Cluster_Cap_Plan
 
Hardware Provisioning
Hardware ProvisioningHardware Provisioning
Hardware Provisioning
 
Managing Security At 1M Events a Second using Elasticsearch
Managing Security At 1M Events a Second using ElasticsearchManaging Security At 1M Events a Second using Elasticsearch
Managing Security At 1M Events a Second using Elasticsearch
 
Application Caching: The Hidden Microservice
Application Caching: The Hidden MicroserviceApplication Caching: The Hidden Microservice
Application Caching: The Hidden Microservice
 
Caching Methodology & Strategies
Caching Methodology & StrategiesCaching Methodology & Strategies
Caching Methodology & Strategies
 
Caching methodology and strategies
Caching methodology and strategiesCaching methodology and strategies
Caching methodology and strategies
 
EVCache: Lowering Costs for a Low Latency Cache with RocksDB
EVCache: Lowering Costs for a Low Latency Cache with RocksDBEVCache: Lowering Costs for a Low Latency Cache with RocksDB
EVCache: Lowering Costs for a Low Latency Cache with RocksDB
 
(ATS6-PLAT06) Maximizing AEP Performance
(ATS6-PLAT06) Maximizing AEP Performance(ATS6-PLAT06) Maximizing AEP Performance
(ATS6-PLAT06) Maximizing AEP Performance
 
Data has a better idea the in-memory data grid
Data has a better idea   the in-memory data gridData has a better idea   the in-memory data grid
Data has a better idea the in-memory data grid
 
FAQ
FAQFAQ
FAQ
 
Using galera replication to create geo distributed clusters on the wan
Using galera replication to create geo distributed clusters on the wanUsing galera replication to create geo distributed clusters on the wan
Using galera replication to create geo distributed clusters on the wan
 
Using galera replication to create geo distributed clusters on the wan
Using galera replication to create geo distributed clusters on the wanUsing galera replication to create geo distributed clusters on the wan
Using galera replication to create geo distributed clusters on the wan
 
Centralized log-management-with-elastic-stack
Centralized log-management-with-elastic-stackCentralized log-management-with-elastic-stack
Centralized log-management-with-elastic-stack
 
Using galera replication to create geo distributed clusters on the wan
Using galera replication to create geo distributed clusters on the wanUsing galera replication to create geo distributed clusters on the wan
Using galera replication to create geo distributed clusters on the wan
 
Exchange Server 2013 Database and Store Changes
Exchange Server 2013 Database and Store ChangesExchange Server 2013 Database and Store Changes
Exchange Server 2013 Database and Store Changes
 
Taking Splunk to the Next Level - Architecture Breakout Session
Taking Splunk to the Next Level - Architecture Breakout SessionTaking Splunk to the Next Level - Architecture Breakout Session
Taking Splunk to the Next Level - Architecture Breakout Session
 
Gcp data engineer
Gcp data engineerGcp data engineer
Gcp data engineer
 

Más de zingopen

Zing Me cung cấp gói hỗ trợ miễn phí cho Doanh nghiệp
Zing Me cung cấp gói hỗ trợ miễn phí cho Doanh nghiệpZing Me cung cấp gói hỗ trợ miễn phí cho Doanh nghiệp
Zing Me cung cấp gói hỗ trợ miễn phí cho Doanh nghiệpzingopen
 
Zing Me Platform Policy
Zing Me Platform PolicyZing Me Platform Policy
Zing Me Platform Policyzingopen
 
Zing Me Workshop 11082012
Zing Me Workshop 11082012Zing Me Workshop 11082012
Zing Me Workshop 11082012zingopen
 
Quản lý Zing Me fanpage một cách hiệu quả
Quản lý Zing Me fanpage một cách hiệu quảQuản lý Zing Me fanpage một cách hiệu quả
Quản lý Zing Me fanpage một cách hiệu quảzingopen
 
The social shop- proposal
The social shop- proposalThe social shop- proposal
The social shop- proposalzingopen
 
Tích hợp kỹ thuật của Ứng dụng trên Zing Me
Tích hợp kỹ thuật của Ứng dụng trên Zing MeTích hợp kỹ thuật của Ứng dụng trên Zing Me
Tích hợp kỹ thuật của Ứng dụng trên Zing Mezingopen
 
Zing Open Platform APIs
Zing Open Platform APIsZing Open Platform APIs
Zing Open Platform APIszingopen
 
Fanpage Management
Fanpage ManagementFanpage Management
Fanpage Managementzingopen
 
Partnership Proposal
Partnership ProposalPartnership Proposal
Partnership Proposalzingopen
 
Cơ hội và thách thức cho DN Vừa và Nhỏ trên MXH
Cơ hội và thách thức cho DN Vừa và Nhỏ trên MXHCơ hội và thách thức cho DN Vừa và Nhỏ trên MXH
Cơ hội và thách thức cho DN Vừa và Nhỏ trên MXHzingopen
 
Checklist Zing Me Fanpage
Checklist Zing Me FanpageChecklist Zing Me Fanpage
Checklist Zing Me Fanpagezingopen
 
Check List Zing Me Fan page
Check List Zing Me Fan pageCheck List Zing Me Fan page
Check List Zing Me Fan pagezingopen
 
Check List Zing Me Fan page
Check List Zing Me Fan pageCheck List Zing Me Fan page
Check List Zing Me Fan pagezingopen
 
Check list Zing Me Fan page
Check list Zing Me Fan pageCheck list Zing Me Fan page
Check list Zing Me Fan pagezingopen
 
Behavior of Zing Me users
 Behavior of Zing Me users Behavior of Zing Me users
Behavior of Zing Me userszingopen
 
Zing Me Users Proflie
Zing Me Users Proflie Zing Me Users Proflie
Zing Me Users Proflie zingopen
 
Build fame and make money with social media
Build fame and make money with social mediaBuild fame and make money with social media
Build fame and make money with social mediazingopen
 
Google cooperate with VNG_Presentation
Google cooperate with VNG_PresentationGoogle cooperate with VNG_Presentation
Google cooperate with VNG_Presentationzingopen
 
Branding in Farm 2
Branding in Farm 2Branding in Farm 2
Branding in Farm 2zingopen
 
Zing me credential
Zing me credentialZing me credential
Zing me credentialzingopen
 

Más de zingopen (20)

Zing Me cung cấp gói hỗ trợ miễn phí cho Doanh nghiệp
Zing Me cung cấp gói hỗ trợ miễn phí cho Doanh nghiệpZing Me cung cấp gói hỗ trợ miễn phí cho Doanh nghiệp
Zing Me cung cấp gói hỗ trợ miễn phí cho Doanh nghiệp
 
Zing Me Platform Policy
Zing Me Platform PolicyZing Me Platform Policy
Zing Me Platform Policy
 
Zing Me Workshop 11082012
Zing Me Workshop 11082012Zing Me Workshop 11082012
Zing Me Workshop 11082012
 
Quản lý Zing Me fanpage một cách hiệu quả
Quản lý Zing Me fanpage một cách hiệu quảQuản lý Zing Me fanpage một cách hiệu quả
Quản lý Zing Me fanpage một cách hiệu quả
 
The social shop- proposal
The social shop- proposalThe social shop- proposal
The social shop- proposal
 
Tích hợp kỹ thuật của Ứng dụng trên Zing Me
Tích hợp kỹ thuật của Ứng dụng trên Zing MeTích hợp kỹ thuật của Ứng dụng trên Zing Me
Tích hợp kỹ thuật của Ứng dụng trên Zing Me
 
Zing Open Platform APIs
Zing Open Platform APIsZing Open Platform APIs
Zing Open Platform APIs
 
Fanpage Management
Fanpage ManagementFanpage Management
Fanpage Management
 
Partnership Proposal
Partnership ProposalPartnership Proposal
Partnership Proposal
 
Cơ hội và thách thức cho DN Vừa và Nhỏ trên MXH
Cơ hội và thách thức cho DN Vừa và Nhỏ trên MXHCơ hội và thách thức cho DN Vừa và Nhỏ trên MXH
Cơ hội và thách thức cho DN Vừa và Nhỏ trên MXH
 
Checklist Zing Me Fanpage
Checklist Zing Me FanpageChecklist Zing Me Fanpage
Checklist Zing Me Fanpage
 
Check List Zing Me Fan page
Check List Zing Me Fan pageCheck List Zing Me Fan page
Check List Zing Me Fan page
 
Check List Zing Me Fan page
Check List Zing Me Fan pageCheck List Zing Me Fan page
Check List Zing Me Fan page
 
Check list Zing Me Fan page
Check list Zing Me Fan pageCheck list Zing Me Fan page
Check list Zing Me Fan page
 
Behavior of Zing Me users
 Behavior of Zing Me users Behavior of Zing Me users
Behavior of Zing Me users
 
Zing Me Users Proflie
Zing Me Users Proflie Zing Me Users Proflie
Zing Me Users Proflie
 
Build fame and make money with social media
Build fame and make money with social mediaBuild fame and make money with social media
Build fame and make money with social media
 
Google cooperate with VNG_Presentation
Google cooperate with VNG_PresentationGoogle cooperate with VNG_Presentation
Google cooperate with VNG_Presentation
 
Branding in Farm 2
Branding in Farm 2Branding in Farm 2
Branding in Farm 2
 
Zing me credential
Zing me credentialZing me credential
Zing me credential
 

Último

08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking MenDelhi Call girls
 
Understanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitectureUnderstanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitecturePixlogix Infotech
 
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024BookNet Canada
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Drew Madelung
 
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...shyamraj55
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsEnterprise Knowledge
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityPrincipled Technologies
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationRadu Cotescu
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Miguel Araújo
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024Rafal Los
 
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 3652toLead Limited
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreternaman860154
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slidevu2urc
 
Maximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxMaximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxOnBoard
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonetsnaman860154
 
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Alan Dix
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024The Digital Insurer
 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024Results
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfEnterprise Knowledge
 
Google AI Hackathon: LLM based Evaluator for RAG
Google AI Hackathon: LLM based Evaluator for RAGGoogle AI Hackathon: LLM based Evaluator for RAG
Google AI Hackathon: LLM based Evaluator for RAGSujit Pal
 

Último (20)

08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men
 
Understanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitectureUnderstanding the Laravel MVC Architecture
Understanding the Laravel MVC Architecture
 
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreter
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
 
Maximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxMaximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptx
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonets
 
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024
 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
 
Google AI Hackathon: LLM based Evaluator for RAG
Google AI Hackathon: LLM based Evaluator for RAGGoogle AI Hackathon: LLM based Evaluator for RAG
Google AI Hackathon: LLM based Evaluator for RAG
 

Zing Database – Distributed Key-Value Database

  • 1. Zing Database – Distributed Key-Value Database Nguyễn Quang Nam Zing Web-Technical Team
  • 2. Content Why Introduction Overview architecture 1 3 2 Single Server/Storage 4 Distribution 5
  • 4. Some statistics: - Feeds: 1.6 B, 700 GB hard drive in 4 DB instances, 8 caching servers, 136 GB memory cache in used. - User Profiles: 44.5 M registered accounts, 2 database instances, 30 GB memory cache. - Comments: 350 M, 50 GB hard drive in 2 DB instances, 20 GB memory cache
  • 5. Why
  • 6. Access time L1 cache reference 0.5 ns Branch mispredict 5 ns L2 cache reference 7 ns Mutex lock/unlock 100 ns Main memory reference 100 ns Compress 1K bytes with Zippy 10,000 ns Send 2K bytes over 1 Gbps network 20,000 ns Read 1 MB sequentially from memory 250,000 ns Round trip within same datacenter 500,000 ns Disk seek 10,000,000 ns Read 1 MB sequentially from network 10,000,000 ns Read 1 MB sequentially from disk 30,000,000 ns Send packet CA->Netherlands->CA 150,000,000 ns by Jeff Dean (http://labs.google.com/people/jeff)
  • 7. Standard & Real Requirement - Time to load a page < 200 ms - Read data rate ~12K ops/sec - Write data rate ~8K ops/sec - Caching service/Database recovery time < 5 mins
  • 8. Existent thing - RDBMS (MySQL, MSSQL): Write: too slow; Read: so so with a small DB, too bad with a huge DB - Cassandra (by Facebook): difficult to do operation/maintain, and performance is not so good - HBase/Hadoop: We use this for log system - MongoDB, Membase, Tokyo Tyrant, .. : OK! we use these in several cases, but not suitable for all
  • 10.  
  • 12. ZNonblockingServer - Based on TNonblockingServer (Apache Thrift) - 185K reqs/sec (original TNonblockingServer is just 45K reqs/sec) - Serialize/Deserialize data - Prevent overload server - Data is not secured while transferring - Protect service from invalid requests
  • 13. ICache - Least Recently Used/Time based expiration strategy - zlru_table<key_type, value_type>: hash table data structure - Re-write malloc/free functions instead of using standard malloc/free in glibc to reduce memory fragment - Support dirty-items marking => for lazy DB flush
  • 14. ZiDB - Separate into DataFile & IndexFile - 1 seek for a read, 1-2 seeks for a write - IndexFile (hash structure) is loaded onto memory as a mapping file (shared memory) to reduce system call - Write-ahead log to avoid data loss - Data magic-padding - Checksum & checkpoint for repair data - Partitioning DB for easier maintenance
  • 16. Key requirements: - Scalability - Load balance - Availability - Consistency
  • 17. 2 Models: - Centralized: 1 addressing server & multiple storage servers => bottleneck & single-point-of-failure - Peer-peer: Each server includes addressing module & storage 2 Types of routing: - Client routing: Each client itself does the addressing and query data - Server routing: The addressing is done at server
  • 18. Operation Flows * Addressing module is moved into each storage node in Peer-peer model Business Logic Server Addressing Server (DHT) Storage Layer Storage Node 1 ICache ZiDB Storage Module Storage Node N ICache ZiDB Storage Module … (1) Request key locations (2) Key locations (3) Get & Set operations (4) Operation returns
  • 19. Addressing: - Provide key locations of resources - Basically a Distributed Hash Table, using consistent hashing - Hashing: Jenkins, Murmur, or any algorithm that satisfies two conditions: - Uniform distribution of generated keys in the key space - Consistency (MD5, SHA are bad choice since performance)
  • 20. Addressing - Node location: Each node is assigned a continuous range of IDs (hashed key)
  • 21. Addressing - Node location: Golden ratio principle (a/b = 2b/a) - Init ratio = 1.618 - Max ratio ~ 2.6 - Easy to implement - Easy for routing from client 2 3 4 5 1
  • 22. Server 1: 1,2,3 Server 2: 4,5,6,7 Server 3: 8,9 1 4 7 3 6 2 5 8 9 Addressing - Node location: Virtual nodes - Each real server has multiple virtual nodes on ring - More virtual nodes, more balance of load - Hard to maintain table of nodes
  • 23. A A A B B C Addressing – Multi-layer rings - Store the change history of system - Provide availability/reconfigurability - Able to put a node on ring manually * Write: data is located on the highest ring * Read: data is located on the highest ring, then lower rings if not found
  • 24. Replication & Backup - Each node has one primary range of IDs, and Some secondary range of IDs - Each real node need a backup instance to replace in case it’s down * Data is queried from primary node, then secondary nodes
  • 25. Configuration: to find the best parameters to configure DB or to choose the suitable DB type. - How many read/write per second? - Length Deviation of data: data length is same same or much different each others, - Has updation/deletion data? - How important of data: acceptable loss or not - The old data can be recycled?
  • 26. Q & A Contact: Nguyễn Quang Nam [email_address] http://me.zing.vn/nam.nq