1. A Bloom Filter based Distributed PIT System
2nd CCNx Workshop, 2012, Sophia Antipolis
Wei YOU, Bertrand MATHIEU, Patrick TRUONG,
Jean-François PELTIER, Gwendal SIMON
contact: wei.you@orange.com
2. Objectives
Context of current solutions:
The PIT performs exact-match lookups => huge memory space required
Almost every incoming packet leads to a change in a PIT entry => fast
processing required
With conventional solutions (e.g. hash tables) & current memory technologies
(SRAM, RLDRAM, etc.) => trade-off between speed and memory space
Objectives of our study: reduce the PIT table space requirement and
speed up the lookup/update process
A Bloom filter is a possible solution since it is
fast & space-efficient
already well established in IP networking applications
BUT it does not support retrieving the stored information => hence the
distributed architecture
=> Our solution: a Bloom Filter based Distributed PIT system
2 A Bloom Filter based distributed PIT system Wei You
3. DiPIT: a distributed Bloom filter based PIT
Distributed structure: PITi
one small PITi per CCN face
all the PITis are independent of each other
each PITi is a counting Bloom filter
small size, fast performance
Reduction of false positives via a Shared Bloom Filter (SBF)
mismatches remain possible, but their ratio can be kept at a low level
one additional binary Bloom filter, shared by all the faces
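The per-face structure above can be sketched roughly as follows. This is a loose illustration, not the exact DiPIT algorithm: the hash salting, the filter sizes, and the precise SBF update/check rules are assumptions made for the sketch.

```python
import hashlib

def positions(name, size, k=3):
    # k hash positions per content name (salted SHA-1: illustrative choice).
    return [int(hashlib.sha1(f"{i}:{name}".encode()).hexdigest(), 16) % size
            for i in range(k)]

class DiPIT:
    """Sketch of the DiPIT idea: one small counting Bloom filter (PITi)
    per CCN face, plus one shared binary Bloom filter (SBF) used to
    reduce false positives.  Sizes are illustrative."""

    def __init__(self, n_faces, pit_size=4096, sbf_size=8192):
        self.pits = [[0] * pit_size for _ in range(n_faces)]  # counting BFs
        self.sbf = [0] * sbf_size                             # shared binary BF
        self.pit_size, self.sbf_size = pit_size, sbf_size

    def interest_in(self, face, name):
        # Record the pending Interest in the face's PITi and in the SBF.
        for p in positions(name, self.pit_size):
            self.pits[face][p] += 1
        for p in positions(name, self.sbf_size):
            self.sbf[p] = 1

    def data_in(self, name):
        # A Data packet is forwarded only to faces whose PITi matches,
        # and only if the SBF also matches (this filters part of the
        # false positives of the small per-face filters).
        if not all(self.sbf[p] for p in positions(name, self.sbf_size)):
            return []
        out = []
        for face, pit in enumerate(self.pits):
            pos = positions(name, self.pit_size)
            if all(pit[p] > 0 for p in pos):
                for p in pos:
                    pit[p] -= 1   # consume the pending entry
                out.append(face)
        return out

pit = DiPIT(n_faces=3)
pit.interest_in(0, "/orange.com/news")
pit.interest_in(2, "/orange.com/news")
print(pit.data_in("/orange.com/news"))  # [0, 2]
```

Because each PITi is independent, a Data packet is checked against every face's filter; the shared binary filter is what keeps the combined false positive ratio low.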
4. PIT table size estimation
Analyze the required table size as a function of the Interest arrival rate (λ_in)
16 interfaces, λ_in = [0 ~ 200 Mpck/s], RTT = 80 ms
fp = 1% and 0.1%, SBF = 1 Mbit
H-bit = 28 bits, CCN face identifier = 2 Bytes
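The order of magnitude behind these settings can be checked with the standard Bloom filter sizing formula m = -n·ln(fp)/(ln 2)², where n ≈ λ_in × RTT pending entries. The hash-table figure below assumes each entry stores only the H-bit digest plus the 2-byte face identifier; load factor and pointer overheads are ignored, so this is a simplification, not the slide's exact model.

```python
import math

def bloom_bits(n, fp):
    # Standard optimal Bloom filter size: m = -n * ln(fp) / (ln 2)^2
    return -n * math.log(fp) / (math.log(2) ** 2)

def hash_table_bits(n, h_bits=28, face_id_bits=16):
    # One hash-table entry: an H-bit name digest plus a 2-byte face
    # identifier (overheads ignored in this sketch).
    return n * (h_bits + face_id_bits)

RTT = 0.080  # 80 ms, as on the slide
for rate in (50e6, 100e6, 200e6):   # Interest arrivals per second
    n = rate * RTT                  # pending entries ~ arrival rate x RTT
    print(f"{rate/1e6:>5.0f} Mpck/s: pending={n/1e6:4.0f}M  "
          f"BF(fp=1%)={bloom_bits(n, 0.01)/8e6:5.0f} MB  "
          f"hash={hash_table_bits(n)/8e6:5.0f} MB")
```

Even with these simplifications, the Bloom filter needs several times fewer bits per pending Interest than an exact-match table.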
5. PIT table size estimation
Analyze the required memory space as a function of the ratio of similar
Interests (same content name)
the hash table is better only when more than 80% of the traffic carries the
same Interest name
6. Implementation in CCNx (release 0.4.2)
face->cbf is initialized according to face->flags
The ccnd_handle holds a shared binary Bloom filter, ccnd_handle->sbf
ccnd_handle->sbf has a reset (RST) mechanism, triggered by the number
of inserted elements
Processing of incoming Interest & incoming Content packets:
packets are filtered according to face->flags
lookup & update of the Interest/Data in the counting Bloom filters and the SBF
binary Bloom filter state check after each SBF update
get the match results
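The RST mechanism can be sketched as follows. The counter-based trigger is as described above; the 2.5% threshold is the value used in the evaluation slides, and everything else (sizes, hashing) is illustrative, not taken from the CCNx code.

```python
class SharedBloomFilter:
    """Sketch of the ccnd_handle->sbf reset (RST) mechanism: the binary
    Bloom filter is cleared once the number of inserted elements crosses
    a threshold, which bounds its false positive rate."""

    def __init__(self, size_bits=1000, rst_threshold=0.025):
        self.bits = bytearray(size_bits)      # binary Bloom filter
        self.size = size_bits
        self.inserted = 0                     # elements since last RST
        self.rst_limit = int(size_bits * rst_threshold)
        self.rst_count = 0                    # number of resets so far

    def insert(self, positions):
        # 'positions' stands in for the k hash values of a content name.
        for p in positions:
            self.bits[p % self.size] = 1
        self.inserted += 1
        if self.inserted >= self.rst_limit:   # RST trigger
            self.bits = bytearray(self.size)
            self.inserted = 0
            self.rst_count += 1

sbf = SharedBloomFilter()                     # RST fires after 25 insertions here
for i in range(25):
    sbf.insert([i * 7, i * 11, i * 13])
print(sbf.rst_count)  # 1
```

Clearing the whole filter loses its state, which is why the reset counts per node are reported in the evaluations.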
7. 1st Evaluation: in-line network
Real testbed in Telecom Bretagne
composed of 9 CCNx nodes
Client Server
Settings
10000 ContentNames (Interest & Data), Zipf distribution, α = 0.7
1 content provider and 1 client
9 nodes in line, 1 Mbit for each PITi, 1 Mbit for the SBF, 2.5% RST
threshold in the SBF
Results:
the client (node 0) generates 10000 Interests on 4827 different names
the server (node 8) sends 4826 Contents
DiPIT:
the false positive rate in the PITis is 1.7%
DiPIT blocks 6 Interests => the packet loss rate is 0.1%
2 RST resets in each node
8. 2nd Evaluation: Geant network
Settings:
Geant topology, 1 Mbit per PITi, 1 Mbit SBF, 2.5% RST threshold in the SBF
[Figure: Geant test topology with 9 nodes (0-8); link labels give per-link
Interest counts: 4372, 16 and 395 leaving node 0; 25, 605 and 101 on
intermediate links; 4157, 15 and 593 arriving at node 8; 4761 Data sent back]
Results:
Node 0 sends 4372 + 16 + 395 = 4783 Interests; 4783 - 4765 = 18 Interests
get lost during the forwarding process => total PLR on the path = 0.37%
Node 8 gets 4157 + 15 + 593 = 4765 Interests and sends 4761 Contents
=> PLR in node 8 = 0.08%
RST count per node:
Node  RST (times)
0     3
1     4
2     4
3     5
4     5
5     5
6     4
7     3
8     2
9. Where to deploy such a solution:
Case study: a hierarchical network topology
Topology:
3 levels: edge, core and peering routers
each terminal: 10 Mpck/s, Interest ratio = 95%
internal link delay d = 20 ms
peering link delay D = 20 ms
Recommendations (e.g. for the edge router):
if the acceptable fp > 0.01%, DiPIT is always better than hash tables
if λ < 66 Mpck/s, it is better to use RLDRAM because it is cheaper
if the acceptable fp < 0.01%, the hash table is a better solution
however, when λ > 86 Mpck/s the hash table can no longer be used;
DiPIT with SRAM is the only option
10. Conclusion
The Bloom Filter based distributed PIT architecture (DiPIT) can
significantly reduce the memory space required to implement the
CCN PIT table, at the cost of a small, acceptable false positive ratio.
DiPIT relaxes the current memory technology bottleneck, even though
it introduces false positives.
The hash table is limited in table size and processing speed, but
generates no extra network load.
13. Hardware challenge for the hash-based PIT
Memory chip trade-off: processing speed OR storage capacity
Technology        Access time (ns)   Cost ($/MB)   Max. size
SRAM (on-chip)    1                  27            50 Mb
SRAM (off-chip)   4                  27            250 Mb
RLDRAM            15                 0.27          2 Gb
DRAM              55                 0.016         10 GB
Table size and cost vs. Interest arrival rate
4 interfaces, λ_in = [0 ~ 200 Mpck/s], RTT = 80 ms
Content name length = 128 bits
H-bit = 24/32/64 bits, interface identifier = 2 Bytes
SRAM (fast processing) vs. RLDRAM (large memory size)
14. DiPIT: a Distributed Bloom-filter based PIT table
Bloom Filter
tests set membership of elements
Insert: hash the element with k independent hash functions and set
the k resulting bit positions to 1 in an (initially empty) bit vector
Test: if an element, hashed by all k functions, maps to positions
that are all 1, the element is considered to be in the set
replacing bits with counters allows deletion (counting Bloom filter)
Advantage: space-efficient
Drawback: false positives
How to retrieve the stored information?
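The insert/test/delete mechanics above can be sketched as follows; the salted SHA-1 hashing and the filter size are illustrative choices, not the ones used in DiPIT.

```python
import hashlib

class CountingBloomFilter:
    """Counting Bloom filter: counters instead of bits, so elements can
    be deleted (needed to remove satisfied PIT entries)."""

    def __init__(self, size=1024, k=3):
        self.counters = [0] * size
        self.size = size
        self.k = k  # number of independent hash functions

    def _positions(self, item):
        # Derive k positions from salted SHA-1 digests (illustrative).
        for i in range(self.k):
            digest = hashlib.sha1(f"{i}:{item}".encode()).hexdigest()
            yield int(digest, 16) % self.size

    def insert(self, item):
        for pos in self._positions(item):
            self.counters[pos] += 1

    def test(self, item):
        # All k counters non-zero => "possibly in the set" (false
        # positives possible); any zero counter => "definitely not".
        return all(self.counters[pos] > 0 for pos in self._positions(item))

    def delete(self, item):
        if self.test(item):
            for pos in self._positions(item):
                self.counters[pos] -= 1

cbf = CountingBloomFilter()
cbf.insert("/orange.com/videos/clip1")
print(cbf.test("/orange.com/videos/clip1"))  # True
cbf.delete("/orange.com/videos/clip1")
print(cbf.test("/orange.com/videos/clip1"))  # False
```

Note that the filter only answers membership queries: it cannot enumerate or return the names it holds, which is the retrieval limitation the distributed architecture works around.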
15. DiPIT: a Distributed Bloom-filter based PIT table
Algorithm
16. Evaluation results
Analyze the required table size as a function of the false positive
probability
only when k = 3 and fp < 0.00003% does the hash table use less
memory space
17. Evaluation results
Analysis of traffic bursts
traffic follows a Poisson distribution
DiPIT and the hash table are both dimensioned for 100 Mpck/s of Interests
beyond 100 Mpck/s, the PLR of the hash table increases faster than
the false positive rate of DiPIT