Distributed Hash Table and Consistent Hashing
CloudFundoo
An Introduction to Consistent Hashing and its uses
CloudFundoo 2012
Copyright © CloudFundoo | cloudfundoo@gmail.com, http://cloudfundoo.wordpress.com/

Distributed Hash Tables and Consistent Hashing

A DHT (Distributed Hash Table) is one of the fundamental techniques used in scalable distributed systems; it is used in web caching, P2P systems, distributed file systems and so on.

The first step in understanding a DHT is the hash table. A hash table needs a key, a value and a hash function, where the hash function maps the key to the location where the value is stored.

[Figure: keys Key1 to Key4 pass through a hash function, which maps each key to the slot holding its stored value (Value1 to Value4).]

location = hashfunc(key)

Python's dictionary data type is implemented using hashing; see the example below.

    #!/usr/bin/python3

    # A dictionary maps keys to values through hashing.
    student = {'Name': 'Zara', 'Age': 11, 'Class': 'First'}

    student['Age'] = 12                  # update an existing entry
    student['School'] = "State School"   # add a new entry

    print("student['Age']:", student['Age'])
    print("student['School']:", student['School'])

With a perfect hash function, looking up a (key, value) pair takes O(1), i.e. constant, time, because the hash function distributes the keys evenly across the table.

One problem with hashing is that it requires a lot of memory (or space) to accommodate the entire table: even if most of the table is empty, memory must be allocated for all of it, so some memory is wasted most of the time. This is called the time-space tradeoff: hashing gives the best search time at the expense of memory.

When we want to accommodate a large number of keys (millions and millions, say for a cloud storage system), we have to divide the keys into subsets and map each subset to a bucket; each bucket can reside on a separate machine/node. You can think of a bucket as a separate hash table.

Distributed Hash Table

Using buckets to distribute the (key, value) pairs is called a DHT. A simple way to implement a DHT is to apply the modulus operation to the key, i.e. the hash function is key mod n, where n is the number of buckets.

[Figure: the key space, marked K1, Kn/3, K2n/3 and Kn, divided among Bucket 1, Bucket 2 and Bucket 3.]
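The modulo scheme just described can be sketched in a few lines. This is only an illustration under assumptions: keys are plain integers, all buckets live in one process, and the class and method names (ModuloDHT, bucket_for, put, get) are made up for this example.

```python
class ModuloDHT:
    """Toy DHT: a list of per-bucket dictionaries (hypothetical names)."""

    def __init__(self, num_buckets):
        self.num_buckets = num_buckets
        # Each bucket is its own hash table; in a real system each one
        # would live on a separate machine/node.
        self.buckets = [{} for _ in range(num_buckets)]

    def bucket_for(self, key):
        # First-level hash: key mod n selects the bucket.
        return key % self.num_buckets

    def put(self, key, value):
        # The dictionary inside the bucket does the second-level hashing.
        self.buckets[self.bucket_for(key)][key] = value

    def get(self, key):
        return self.buckets[self.bucket_for(key)][key]


dht = ModuloDHT(6)
dht.put(1, "a")
dht.put(8, "b")
print(dht.bucket_for(1), dht.bucket_for(8))  # 1 2
```

The outer list plays the role of the client/proxy table and each inner dictionary plays the role of one bucket.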
If you have 6 buckets, then key = 1 goes to bucket 1 since key % 6 = 1, key = 2 goes to bucket 2 since key % 6 = 2, and so on. A second hashing step is needed to find the actual (key, value) pair inside a particular bucket. We can use two dictionaries to visualize a DHT; here each row in the Client/Proxy dictionary corresponds to a bucket in the DHT.

[Figure: a Client/Proxy dictionary whose rows point to Bucket 0, Bucket 1 and Bucket 3.]

This scheme works perfectly well as long as the number of buckets does not change. It starts to fail when we add buckets to, or remove buckets from, the system. Let's add one more bucket, so the number of buckets is now seven, i.e. n = 7. The key 7, which was previously mapped to bucket 1 (7 % 6 = 1), now maps to bucket 0 since 7 % 7 = 0. To keep lookups working we need to move data between buckets, which is expensive in this hashing scheme.

Let's do some calculation. Consider the modulo hash function

h(key) = key mod n

where n is the number of buckets. When we increase the number of buckets by one, the hash function becomes

h(key) = key mod (n+1)

Because of the new bucket, most keys hash to a different bucket. Let's calculate the ratio of keys that move. If the keys are the integers 0 to K-1, a key stays in the same bucket only when key mod n = key mod (n+1). That holds for the first n keys and then again for n keys out of every further n(n+1). So for K no larger than n(n+1), exactly K - n keys move and the ratio is

(K - n)/K = 1 - n/K

For a larger key range the ratio settles at about n/(n+1). If there are 10 buckets and 1000 keys, then 900 keys (90%) move to a different bucket when we add another bucket. If we are using Python's hash() or hashlib.md5 as the hash function, the fraction of keys moving to another bucket is likewise about

n/(n+1)
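The cost of adding a bucket under modulo hashing can be checked by brute force. The sketch below counts how many of the keys 0 to 999 change bucket when going from 10 to 11 buckets, once with the raw integer key and once with an MD5 digest of the key. The helper names (moved_fraction, mod_bucket, md5_bucket) are illustrative assumptions, not part of any library.

```python
import hashlib


def mod_bucket(key, n):
    # Bucket chosen directly from the integer key.
    return key % n


def md5_bucket(key, n):
    # Bucket chosen from the MD5 digest of the key's decimal string.
    digest = hashlib.md5(str(key).encode("utf-8")).hexdigest()
    return int(digest, 16) % n


def moved_fraction(keys, n, bucket_of):
    # Fraction of keys whose bucket differs between n and n + 1 buckets.
    moved = sum(1 for k in keys if bucket_of(k, n) != bucket_of(k, n + 1))
    return moved / len(keys)


keys = range(1000)
print(moved_fraction(keys, 10, mod_bucket))  # 0.9
print(moved_fraction(keys, 10, md5_bucket))  # close to 10/11 for a good hash
```

Either way, roughly ten keys out of every eleven have to be relocated, which is the cost that consistent hashing is meant to avoid.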
So we need a scheme that reduces the number of keys moving to a different bucket; consistent hashing is such a scheme.

Consistent Hashing

A ring is the core of consistent hashing. First we hash the bucket IDs to points on the ring.

[Figure: ring with buckets B1, B2, B3 and B4 placed on it.]

Then we hash the keys onto the same ring; the resulting ring looks like the one below.

[Figure: the same ring with keys K1 to K4 placed between the buckets.]

To find the bucket that stores the value for a key, we first hash the key to a point on the ring and then search clockwise from that point; the first bucket we meet is the one storing the value for that key. For key K1 the value is stored in bucket B2, for key K2 in bucket B3, and so on.

Hashing works fine with this scheme, but we introduced it to handle the addition and removal of buckets, so let's see how it does that. This is shown in the picture below.

[Figure: the ring from above, illustrating the removal of bucket B3.]

If we remove bucket B3, key K2 seems to have a problem. Let's see how consistent hashing solves it. Key K2 still hashes to the same point on the circle. Searching clockwise, it no longer finds bucket B3, so the search continues past B3's old position and finds bucket B4, where the value for key K2 is now stored. For the other keys nothing changes: key K4 stays in bucket B1, key K1 in bucket B2, and so on. So we only need to move the contents of the removed bucket to the clockwise-adjacent bucket.

Let's see what happens if we add a bucket; see the slightly modified diagram below.
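A minimal sketch of the clockwise lookup and of bucket removal follows. It rests on assumptions that are not in the slides: ring positions come from the MD5 digest of the name, "clockwise" means increasing hash value with wrap-around, and HashRing, ring_hash and the generated key names are made up for illustration.

```python
import bisect
import hashlib


def ring_hash(name):
    # Assumption: the MD5 digest, read as an integer, is the ring position.
    return int(hashlib.md5(name.encode("utf-8")).hexdigest(), 16)


class HashRing:
    def __init__(self, buckets=()):
        self._points = []   # sorted ring positions of the buckets
        self._owner = {}    # ring position -> bucket ID
        for bucket in buckets:
            self.add(bucket)

    def add(self, bucket):
        point = ring_hash(bucket)
        bisect.insort(self._points, point)
        self._owner[point] = bucket

    def remove(self, bucket):
        point = ring_hash(bucket)
        self._points.remove(point)
        del self._owner[point]

    def lookup(self, key):
        # First bucket at or after the key's position, wrapping to the start.
        i = bisect.bisect_left(self._points, ring_hash(key))
        return self._owner[self._points[i % len(self._points)]]


ring = HashRing(["B1", "B2", "B3", "B4"])
keys = ["K%d" % i for i in range(1000)]
before = {k: ring.lookup(k) for k in keys}
ring.remove("B3")
after = {k: ring.lookup(k) for k in keys}
changed = [k for k in keys if before[k] != after[k]]
print(all(before[k] == "B3" for k in changed))  # True
```

Only keys that were stored in the removed bucket get a new owner, and they all land in the same clockwise neighbour.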
[Figure: the ring with an additional key K5 placed before bucket B1.]

The additional key K5 maps to B1, so both K4 and K5 now map to bucket B1, just as keys K2 and K3 both map to bucket B4 after the removal of B3.

[Figure: the ring with a new bucket B5 placed between keys K4 and K5.]

Let's add a new bucket B5. It lands between keys K4 and K5, so key K4, which was previously mapped to bucket B1, now goes to bucket B5, while key K5 still maps to bucket B1. So only the keys that lie between B4 and B5 need to be moved from B1 to B5. On average, the fraction of keys that have to move between buckets when one bucket is added to the system is

1/(n+1)

So by introducing consistent hashing we reduced the fraction of keys that must move from n/(n+1) to 1/(n+1), which is significant.

There are many more details to consistent hashing that are not covered here. Consistent hashing plays a major role in distributed systems such as DNS, P2P, distributed storage and web caching systems. OpenStack Swift Storage and Memcached are open source projects that use it to achieve scalability and availability.
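The effect of adding a bucket can be checked the same way. The sketch below is self-contained and again assumes MD5-derived ring positions with "clockwise" meaning increasing hash value; point, owner and moved_on_add are hypothetical helper names. It rebuilds the sorted ring on every lookup, which is slow but keeps the example short.

```python
import bisect
import hashlib


def point(name):
    # Assumption: the MD5 digest, read as an integer, is the ring position.
    return int(hashlib.md5(name.encode("utf-8")).hexdigest(), 16)


def owner(key, buckets):
    # First bucket at or after the key's position, wrapping around the ring.
    ring = sorted((point(b), b) for b in buckets)
    positions = [p for p, _ in ring]
    i = bisect.bisect_left(positions, point(key)) % len(ring)
    return ring[i][1]


def moved_on_add(keys, buckets, new_bucket):
    # Keys whose owning bucket changes once new_bucket joins the ring.
    grown = buckets + [new_bucket]
    return [k for k in keys if owner(k, buckets) != owner(k, grown)]


buckets = ["B1", "B2", "B3", "B4"]
keys = ["K%d" % i for i in range(1000)]
moved = moved_on_add(keys, buckets, "B5")
print(len(moved) / len(keys))  # fraction of keys that had to move
print(all(owner(k, buckets + ["B5"]) == "B5" for k in moved))  # True
```

Every key that moves ends up in the new bucket. The printed fraction for a single run depends on where the bucket names happen to land on the ring, so it can differ noticeably from 1/(n+1), which is an average.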