2. Dictionary
● Abstract data type (ADT) that maintains a set of items, each with a key.
– Insert(item): add the item, replacing any existing item with the same key
– Delete(item): remove the item
– Search(key): return the item with the given key, or report that it
doesn't exist (null).
Item → (key, value)
3. Simple Approach: Direct-Access Table (DAT)
Store items directly in a giant array,
using each item's key as its array index.
0 NULL
1 ITEM 1
2 NULL
3 NULL
4 ITEM 2
...
...
m-1 ...
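A minimal sketch of the direct-access table above, assuming keys are integers in 0..m-1. The class name DirectAccessTable and its method names are illustrative, not from the slides.

```python
class DirectAccessTable:
    """Dictionary backed by one array slot per possible key (keys 0..m-1)."""

    def __init__(self, m):
        self.m = m
        self.slots = [None] * m  # slot k holds the item whose key is k

    def _check(self, key):
        # Only non-negative integer keys below m can be used as an index.
        if not (isinstance(key, int) and 0 <= key < self.m):
            raise KeyError(key)

    def insert(self, key, value):
        self._check(key)
        self.slots[key] = (key, value)  # replaces any existing item

    def delete(self, key):
        self._check(key)
        self.slots[key] = None

    def search(self, key):
        self._check(key)
        return self.slots[key]  # None plays the role of NULL


table = DirectAccessTable(m=8)
table.insert(1, "ITEM 1")
table.insert(4, "ITEM 2")
```

Every operation is a single array access, but the array needs one slot for every possible key, whether or not that key is ever stored.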
5. Two Big Problems
(1) Keys may not be non-negative integers (e.g. negative integers or strings)
● Map keys to non-negative integers (any key, such as a string, is
ultimately a string of bits)
(2) Gigantic memory use
● Hashing (cut into pieces)
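6. Big idea
Reduce the universe U of all keys (integers)
down to a reasonable size m for the table.
The hash function h(k) maps each key k to a slot in 0 .. m-1.
Keyspace: k1, k2, k3, k4
h(k2) = 1
h(k3) = 4
0 NULL
1 ITEM 1
2 NULL
3 NULL
4 ITEM 2
...
...
m-1 ...
Ideal: m = Θ(n), where n is the number of keys stored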
8. Chaining
● Use a linked list in each slot to store the items whose keys collide.
Keyspace: k1, k2, k3, k4
h(k1) = 4
h(k4) = 4
0 NULL
1 ITEM 1
2 NULL
3 NULL
4 ITEM 2 → ITEM 3
...
...
m-1 ...
Worst case:
- many keys are mapped to the same index,
so one chain holds most of the items
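A minimal sketch of a hash table with chaining, assuming integer keys and using k mod m as a stand-in hash function. The class name ChainedHashTable is illustrative, and Python lists stand in for the linked lists.

```python
class ChainedHashTable:
    """Hash table that resolves collisions by keeping a chain per slot."""

    def __init__(self, m=8):
        self.m = m
        self.slots = [[] for _ in range(m)]  # one (initially empty) chain per slot

    def _h(self, key):
        return key % self.m  # assumed hash function for this sketch

    def insert(self, key, value):
        chain = self.slots[self._h(key)]
        for i, (k, _) in enumerate(chain):
            if k == key:
                chain[i] = (key, value)  # replace the existing item
                return
        chain.append((key, value))

    def search(self, key):
        for k, v in self.slots[self._h(key)]:
            if k == key:
                return (k, v)
        return None  # key not present

    def delete(self, key):
        idx = self._h(key)
        self.slots[idx] = [(k, v) for k, v in self.slots[idx] if k != key]


chained = ChainedHashTable(m=8)
chained.insert(4, "ITEM 2")
chained.insert(12, "ITEM 3")  # 12 mod 8 = 4, so it collides with key 4
```

Search and delete only scan the chain of one slot, so their cost is proportional to that chain's length.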
9. Length of chain
Expected length of a chain for n keys in m slots
(assuming each key is equally likely to hash to any slot):
α = n / m
α = load factor of the table
α = O(1) if m = Ω(n)
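A small simulation sketch of the load factor, assuming an idealized hash that sends each key to a uniformly random slot. The function name chain_lengths and the values of n and m are made up for illustration.

```python
import random


def chain_lengths(n, m, seed=0):
    """Throw n keys into m slots uniformly at random; return each chain's length."""
    rng = random.Random(seed)
    lengths = [0] * m
    for _ in range(n):
        lengths[rng.randrange(m)] += 1  # idealized uniform hash
    return lengths


lengths = chain_lengths(n=1000, m=500)
alpha = sum(lengths) / len(lengths)  # average chain length = n / m
longest = max(lengths)               # individual chains can be longer than alpha
```

The average over all slots is always exactly n / m; the load factor says nothing about how long the single longest chain is.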
10. Hash functions
(1) Division method:
h(k) = k mod m
(2) Multiplication method:
h(k) = [(a · k) mod 2^w] ≫ (w − r)
a = integer multiplier
w = number of bits in a machine word
r = number of bits kept (the shift is by w − r bits)
m = 2^r
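A minimal sketch of both methods, assuming non-negative integer keys. The function names and the example values (w = 8, r = 3, a = 181) are chosen only for illustration.

```python
def hash_division(k, m):
    """Division method: the remainder of k divided by the table size m."""
    return k % m


def hash_multiplication(k, a, w, r):
    """Multiplication method: keep the low w bits of a*k, then take the top r of them."""
    low_w_bits = (a * k) % (1 << w)  # (a * k) mod 2^w
    return low_w_bits >> (w - r)     # result lies in 0 .. 2^r - 1


# Example parameters (assumptions): 8-bit word, table of size m = 2^3 = 8.
d = hash_division(13, 8)
s = hash_multiplication(13, a=181, w=8, r=3)
```

With m = 2^r, the multiplication method needs only a multiply, a mask and a shift, which is why it is often preferred when division is expensive.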
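11. Hash functions (cont.)
(3) Universal hashing
h(k) = [(a · k + b) mod p] mod m
a, b = random integers ∈ {0, 1, ..., p−1}
p = prime > |U| (U = universe of keys)
For worst-case keys k1 ≠ k2:
Pr[h(k1) = h(k2)] = 1/m
(probability over the random choice of a and b)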
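A minimal sketch of drawing a random hash function from this family and estimating the collision probability empirically. It assumes all keys are smaller than the prime p = 2^31 − 1; the helper names make_universal_hash and collision_rate are illustrative.

```python
import random

P = 2**31 - 1  # a prime; assumed to be larger than every key


def make_universal_hash(m, p=P, rng=random):
    """Pick a and b at random and return h(k) = ((a*k + b) mod p) mod m."""
    a = rng.randrange(p)
    b = rng.randrange(p)

    def h(k):
        return ((a * k + b) % p) % m

    return h


def collision_rate(k1, k2, m, trials, seed=0):
    """Fraction of randomly drawn hash functions for which k1 and k2 collide."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        h = make_universal_hash(m, rng=rng)
        if h(k1) == h(k2):
            hits += 1
    return hits / trials


rate = collision_rate(10, 20, m=10, trials=20000)  # should be close to 1/m
```

The randomness is in the choice of a and b, not in the keys, so no fixed pair of keys is bad for more than about a 1/m fraction of the hash functions.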