The document summarizes a research paper on efficiently identifying heavy hitters in data streams using cached and packed group testing techniques. The paper proposes using packed bidirectional counter arrays to implement the operations of combinatorial group testing (CGT) in constant time. This improves the time complexity of CGT for updating frequencies and querying heavy hitters from O(log(n)) to O(1), eliminating dependency on the size of the data universe n. Experimental results show the proposed method achieves competitive precision, update throughput, and query throughput compared to existing CGT and hierarchical count-min sketch approaches.
2. 2
The φ-Heavy Hitters Problem
[Cormode, Muthukrishnan, ACM TODS 2005]
§Tracking φ-heavy hitters in a dynamic multiset S of elements.
• φ-heavy hitter: element in universe U = [0..n) with its frequency more than φ|S|.
• Challenges: large alphabet, output sensitivity, high-speed operation
Input: data stream of pairs (x_i, Δ_i) ∈ U × {±1} and real numbers ε, φ in [0, 1).
Task: Maintain frequency information of elements for supporting
• QUERY(): return a set R ⊆ U such that R includes (1) all φ-heavy hitters and (2) no element whose frequency is at most (φ − ε)N, where N = Σ_i Δ_i.
• INSERT(x)/DELETE(x): increment/decrement the frequency N_x of element x.
The (ε-approximate) φ-Heavy Hitters Problem in the turnstile model
Model of computation: The standard w-bit word RAM
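For reference, the problem statement above can be checked against an exact baseline that keeps one counter per distinct element. This is only an illustrative sketch, not a method from the talk: the class name `ExactHeavyHitters` and its methods are hypothetical, and its space grows with the number of distinct elements, which is exactly what sketch-based methods avoid.

```python
from collections import Counter


class ExactHeavyHitters:
    """Exact baseline: one counter per distinct element (hypothetical helper)."""

    def __init__(self, phi):
        self.phi = phi
        self.freq = Counter()  # N_x for every element x currently present
        self.total = 0         # N = sum of all deltas seen so far

    def update(self, x, delta):
        """INSERT(x) is update(x, +1); DELETE(x) is update(x, -1)."""
        self.freq[x] += delta
        self.total += delta
        if self.freq[x] == 0:
            del self.freq[x]

    def query(self):
        """Return every x whose frequency exceeds phi * N."""
        threshold = self.phi * self.total
        return {x for x, count in self.freq.items() if count > threshold}


hh = ExactHeavyHitters(phi=0.3)
for x in [7, 7, 7, 3, 3, 9, 7, 3, 5, 7]:
    hh.update(x, +1)
hh.update(3, -1)  # one deletion: N drops to 9 and N_3 drops to 2
print(sorted(hh.query()))  # prints [7]
```

With φ = 0.3 and N = 9 the threshold is 2.7, so only element 7 (frequency 5) qualifies.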
3. 3
Large Universes in Mobile Networks
The operation time of existing practical methods depends on
log |U| = log n (Large in practice!)
Combinatorial Group Testing [Cormode, Muthukrishnan, ACM TODS 2005]
Hierarchical Count-Min Sketch [Cormode, Muthukrishnan, LATIN 2005]
Examples of universe U                                 log |U| = log n
                                                       IPv4    IPv6
IP addresses                                           32      128
Pairs of IP addresses                                  64      256
Five tuples (source/destination IP/Port + Protocol)    104     296
Q. Can we eliminate dependency on log n from operation time?
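The key widths listed on this slide follow from the field sizes. The short sketch below reproduces them, assuming 16-bit ports and an 8-bit protocol number; the function name `key_bits` is hypothetical.

```python
# Field widths in bits (assumed): ports are 16 bits, the protocol number is 8 bits.
PORT_BITS = 16
PROTOCOL_BITS = 8


def key_bits(ip_bits):
    """Return log n for the three example universes, given the IP address width."""
    return {
        "ip_address": ip_bits,
        "ip_pair": 2 * ip_bits,
        "five_tuple": 2 * ip_bits + 2 * PORT_BITS + PROTOCOL_BITS,
    }


ipv4 = key_bits(32)
ipv6 = key_bits(128)
print(ipv4["five_tuple"], ipv6["five_tuple"])  # prints 104 296
```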
4. 4
Main results
Key technique: Packed Bidirectional Counter Arrays
Our paper also proposes a "cached candidate technique" for improving CGT for arbitrary updates.
          This study         CGT                   CGT(b)
Update    O(r) amortized     O(log(n) r)           O(log_b(n) r)
Query     O(r²/ε)            O((log(n)+r) r/ε)     O((b log_b(n)+r) r/ε)
Space     O(log(n) r/ε)      O(log(n) r/ε)         O(b log_b(n) r/ε)
CGT: Combinatorial Group Testing [Cormode, Muthukrishnan, ACM TODS 2005]; CGT(b): CGT with branching factor b.
n: size of universe; δ: failure probability; r = log(1/(δφ)); b: any integer in [2..n]
Model of computation: The standard w-bit word RAM
5. 5
Related Work: Packed Counters
Maintaining an array of m = O(w) counters on the w-bit word RAM.
§Textbook solution for a single counter [Mehlhorn & Sanders, 2008]:
• Ops = inc/test or dec/test: O(1) space; O(m) amortized time.
§Nested counters [Grabowski & Fredriksson, IPL 2008]:
• Ops = inc/test: O(m) space; O(1) amortized time
§Trit counters [Bille & Thorup, SODA 2010]
• Ops = dec/reset/test: O(m) space; O(1) amortized time.
§Bidirectional counters [This talk]
• Ops = inc/dec/test: O(m) space; O(1) time for inc/dec (amortized) and test.
• Naïve bidirectional counters: O(m) space; O(m) time for all operations.
"test": ispositive (C[i] > 0), iszero (C[i] = 0), or isnegative (C[i] < 0)
7. 7
CGT: A Practical Data Structure
§Reports all φ-heavy hitters with probability at least 1 – δ for a specified δ.
§Idea: Random partition of U into d = 2/ε subsets via each hash function h_i.
• A φ-heavy hitter x can be identified from each C[i, h_i(x), 0..m] with probability at least 1/2.
• Setting r = log(1/(δφ)) results in the desired failure probability δ of missing any φ-heavy hitter.
Combinatorial Group Testing [CM, ACM TODS 2005]
1. Three-dimensional counter array: C[1..r, 1..d, 0..m]
2. A set of universal hash functions: h_1, ..., h_r: U → [1..d]
r = log(1/(δφ))
d = 2/ε
m = 1 + lg n
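To get a feel for the size of the counter array, the parameter formulas above can be evaluated directly. This is a rough sketch: taking logarithms in base 2 and rounding up to integers are assumptions, and the function name `cgt_parameters` is hypothetical.

```python
import math


def cgt_parameters(n, phi, eps, delta):
    """Evaluate r, d, m from the slide (base-2 logs and ceilings are assumptions)."""
    r = math.ceil(math.log2(1.0 / (delta * phi)))  # number of hash functions
    d = math.ceil(2.0 / eps)                       # groups per hash function
    m = 1 + math.ceil(math.log2(n))                # per-group counter index bound
    counters = r * d * (m + 1)                     # size of C[1..r, 1..d, 0..m]
    return r, d, m, counters


r, d, m, counters = cgt_parameters(n=2**32, phi=0.001, eps=0.001, delta=0.01)
print(r, d, m, counters)  # prints 17 2000 33 1156000
```

The factor m is the one that grows with log n; it is the dependence the packed counters aim to hide.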
8. 8
CGT: A Practical Data Structure
Combinatorial Group Testing [CM, ACM TODS 2005]
CGT reduces both QUERY and UPDATE to three basic operations on a bidirectional counter array C[1..m]:
INCREMENT(C, x):
    C[i] = C[i] + bit(x, i) for every i in [1..m].
DECREMENT(C, x):
    C[i] = C[i] − bit(x, i) for every i in [1..m].
ISPOSITIVE(C):
    Return z = Σ_i [C[i] > 0] · 2^i, where [X] is 1 (resp. 0) if X is true (resp. false).
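A naive rendering of the three basic operations, spending O(m) time per call, might look as follows. This is a sketch under assumptions: the counter array is a Python list with index 0 unused, and bit i = 1 is taken to be the least significant bit, so counter i contributes 2^(i−1) to the decoded value.

```python
def bit(x, i):
    """The i-th bit of x, where i = 1 is the least significant bit (assumed)."""
    return (x >> (i - 1)) & 1


def increment(C, x):
    """Add bit(x, i) to C[i] for every i in [1..m]; O(m) time."""
    for i in range(1, len(C)):
        C[i] += bit(x, i)


def decrement(C, x):
    """Subtract bit(x, i) from C[i] for every i in [1..m]; O(m) time."""
    for i in range(1, len(C)):
        C[i] -= bit(x, i)


def ispositive(C):
    """Pack the tests C[i] > 0 into one integer, one bit per counter; O(m) time."""
    z = 0
    for i in range(1, len(C)):
        if C[i] > 0:
            z |= 1 << (i - 1)
    return z


m = 8
C = [0] * (m + 1)       # C[0] is unused here
increment(C, 0b1011)
increment(C, 0b0011)
decrement(C, 0b0001)
print(C[1:5], bin(ispositive(C)))  # prints [1, 2, 0, 1] 0b1011
```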
9. 9
CGT: A Practical Data Structure
Combinatorial Group Testing [CM, ACM TODS 2005]
UPDATE(x, Δ): O(log(n) r) time
    Add Δ to N
    for i in [1..r] do:
        Add Δ to C[i, h_i(x), 0]
        y ← x if Δ > 0, and y ← ~x if Δ < 0
        INCREMENT(C[i, h_i(x), 1..m], y)
        DECREMENT(C[i, h_i(x), 1..m], ~y)
QUERY(): O((log(n)+r) r/ε) time
    for i in [1..r] do:
        for j in [1..d] do:
            // C[i, j, k] holds 2·(number of elements in the group with bit k set) − C[i, j, 0]
            x ← ISPOSITIVE(C[i, j, 1..m])
            if min_i' C[i', h_i'(x), 0] > φN then:
                report x as a φ-heavy hitter
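The UPDATE and QUERY procedures above can be sketched in a few lines of Python using naive O(m)-time counters. Several choices here are assumptions rather than details from the talk: the class name `CGT`, the hash family ((a·x + b) mod p) mod d with a Mersenne prime p, and the use of `bits` key bits with bit 1 as the least significant bit.

```python
import random


class CGT:
    """Naive combinatorial group testing with signed per-bit counters (sketch)."""

    def __init__(self, r, d, bits, seed=0):
        rng = random.Random(seed)
        self.r, self.d, self.bits = r, d, bits
        self.p = (1 << 61) - 1  # Mersenne prime for the assumed hash family
        self.ab = [(rng.randrange(1, self.p), rng.randrange(self.p)) for _ in range(r)]
        # C[i][j][0] is the group total; C[i][j][k] is (#bit k set) - (#bit k unset).
        self.C = [[[0] * (bits + 1) for _ in range(d)] for _ in range(r)]
        self.N = 0

    def _h(self, i, x):
        a, b = self.ab[i]
        return ((a * x + b) % self.p) % self.d

    def update(self, x, delta):
        """delta = +1 for INSERT(x), -1 for DELETE(x); O(r * bits) time."""
        self.N += delta
        for i in range(self.r):
            group = self.C[i][self._h(i, x)]
            group[0] += delta
            for k in range(1, self.bits + 1):
                group[k] += delta if (x >> (k - 1)) & 1 else -delta

    def query(self, phi):
        found = set()
        for i in range(self.r):
            for j in range(self.d):
                group = self.C[i][j]
                # Decode a candidate: bit k is 1 iff its counter is positive.
                x = sum(1 << (k - 1) for k in range(1, self.bits + 1) if group[k] > 0)
                # Keep it only if every group that x hashes to is heavy enough.
                estimate = min(self.C[t][self._h(t, x)][0] for t in range(self.r))
                if estimate > phi * self.N:
                    found.add(x)
        return found


rng = random.Random(1)
cgt = CGT(r=4, d=8, bits=16)
for _ in range(600):
    cgt.update(12345, +1)
light = [rng.randrange(1 << 16) for _ in range(400)]
for x in light:
    cgt.update(x, +1)
for x in light[:50]:
    cgt.update(x, -1)  # deletions are supported
print(12345 in cgt.query(phi=0.3))  # prints True
```

The inner loop over k in `update` and the decoding loop in `query` are the O(log n) factors that the talk replaces with constant-time packed operations.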
10. 10
Q. Can we implement INCREMENT/DECREMENT/ISPOSITIVE in o(m) time?
11. 11
Packed Bidirectional Counter Arrays
§Basic idea: Exploiting word-level parallelism of the w-bit word RAM
• Redundant binary representation of C[1..m] using digits {0, ±1, ±2}.
• The corresponding k-th digits of C[1..m] are packed into O(1) words.
• The packed k-th digits of C[1..m] are updated in O(1) time, once every 2^k updates.
Resulting bounds for the three basic operations:
• INCREMENT(C, x): O(1) amortized time
• DECREMENT(C, x): O(1) amortized time
• ISPOSITIVE(C): O(1) time
using O(m) space (compact!) for m = O(w)
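The packing idea can be illustrated with a much simpler, swapped-in technique: bit-sliced unsigned counters with ordinary ripple carry. This is not the redundant-digit, fixed-schedule structure of the talk and does not achieve its bounds; it only shows how storing the k-th bits of all counters in one word lets a single bitwise operation touch every counter at once. The class name `BitSlicedCounters` is hypothetical, and Python integers stand in for w-bit words.

```python
class BitSlicedCounters:
    """Unsigned counters stored as bit planes: planes[k] holds bit k of every counter."""

    def __init__(self, num_planes=32):
        self.planes = [0] * num_planes

    def increment(self, mask):
        """Add 1 to every counter i whose bit is set in mask, using word operations."""
        carry = mask
        for k in range(len(self.planes)):
            if carry == 0:
                return
            # Half adder applied to all counters in parallel.
            self.planes[k], carry = self.planes[k] ^ carry, self.planes[k] & carry
        if carry:
            raise OverflowError("a counter exceeded the available planes")

    def value(self, i):
        """Reassemble counter i from its bits (for inspection only)."""
        return sum(((plane >> i) & 1) << k for k, plane in enumerate(self.planes))

    def nonzero_mask(self):
        """Bit i is 1 iff counter i is nonzero."""
        z = 0
        for plane in self.planes:
            z |= plane
        return z


counters = BitSlicedCounters()
for _ in range(3):
    counters.increment(0b101)
for _ in range(2):
    counters.increment(0b011)
print([counters.value(i) for i in range(3)], bin(counters.nonzero_mask()))
# prints [5, 2, 3] 0b111
```

Here the cost of an increment depends on how far carries ripple, and the sign test scans all planes; removing both costs is what the redundant digits and the fixed carry schedule are for.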
12. 12
Packed Bidirectional Counter Arrays
§Basic idea: Exploiting word-level parallelism of the w-bit word RAM
• Naïve bidirectional counters C[1..m]: m × w = O(w²) bits; O(w) time to access.
• Packed bidirectional counter array C[1..m], each counter with w digits: the k-th digits of all counters take m × O(1) = O(w) bits; O(1) time to access.
13. 13
Packed Bidirectional Counter Arrays
§Packed redundant binary counters using digits {0, ±1, ±2}
§Packed orders of magnitudes for detecting sign inversion
§Fixed-schedule carry propagation in O(1) amortized time [GF, IPL 2008][BT, SODA 2010]
• level(t) = min{i | t mod 2^i = 0}
• The k-th digits are updated once every 2^k updates and never overflow.
§The t-th update: 1. Propagate carry bits. 2. Fix orders of magnitudes.
[Figure: level(t) plotted against the update count t = 1, 2, ..., with digit values in {0, ±1} and in {0, ±1, ±2} marked.]
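A redundant binary representation with digits in {0, ±1, ±2} allows several digit strings for the same value, which is what makes deferred carries possible. The sketch below shows only that a single carry step preserves the represented value; it does not reproduce the fixed schedule or the order-of-magnitude bookkeeping of the talk, and the names `value` and `propagate` are hypothetical.

```python
def value(digits):
    """Value of a little-endian digit list: sum of digits[k] * 2**k."""
    return sum(d * 2**k for k, d in enumerate(digits))


def propagate(digits, k):
    """Move a carry out of position k, bringing digits[k] from ±2 back to 0."""
    if digits[k] >= 2:
        digits[k] -= 2
        digits[k + 1] += 1
    elif digits[k] <= -2:
        digits[k] += 2
        digits[k + 1] -= 1


# Two different digit strings for the same number 3.
print(value([1, 1, 0]), value([-1, 0, 1]))  # prints 3 3

digits = [2, -1, 1, 0]
before = value(digits)
propagate(digits, 0)
print(digits, before == value(digits))  # prints [0, 0, 1, 0] True
```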
14. 14
Lemma (Packed Bidirectional Counters)
There exists an O(m)-space data structure for representing
an array C[1..m] of m bidirectional counters supporting
§INCREMENT/DECREMENT in O(1) amortized time
§ISPOSITIVE in O(1) time
on the standard w-bit word RAM with w ≥ m.
15. 15
Theorem
§Plugging packed bidirectional counters into CGT, we obtain:
There exists an O(lg(n)r/ε)-space randomized data structure
for solving the ε-approximate φ-heavy hitters problem with
§INSERT/DELETE in O(r) amortized time
§QUERY in O(r²/ε) time with probability at least 1 − δ
on the standard w-bit word RAM with w ≥ lg n. Here, n is
the universe size, δ is a failure probability, and r = lg(1/(δφ)).
16. 16
Experiments: Setup
§Data: 14 datasets of 10 M integers
• Universe: [0, 2^64).
• Zipf distribution of skewness z in { 0.8, 1.0, 1.2, 1.4, 1.6, 1.8, 2.0 }.
• Threshold φ (= ε) in { 0.0001, 0.0005, 0.001, 0.005, 0.01}.
§Methods:
• Ours [This work]: Our proposed method with #rounds r = 4.
• CGT(b) [CM, TODS 2005]: Combinatorial Group Testing with branching factor b in { 2, 16 }.
• CMH(b) [CM, LATIN 2005]: Hierarchical Count-Min Sketch with branching factor b in { 2, 16 }.
CGT(b) and CMH(b) were configured as in [Cormode, Hadjieleftheriou, PVLDB 2008].
§Hardware:
• MacBook Pro with Intel® Core™ i7-8559 (2.7GHz) and 16GB main memory.
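A synthetic stream of this kind could be generated roughly as follows. This is a sketch, not the generator used in the experiments: the finite `support` of distinct keys, the random assignment of 64-bit keys to ranks, and the function name `zipf_stream` are all assumptions.

```python
import random
from itertools import accumulate


def zipf_stream(length, skew, support=10_000, seed=0):
    """Draw `length` keys from [0, 2**64) with rank-k probability proportional to k**-skew."""
    rng = random.Random(seed)
    keys = [rng.getrandbits(64) for _ in range(support)]  # rank k maps to keys[k - 1]
    cum_weights = list(accumulate(rank ** -skew for rank in range(1, support + 1)))
    return rng.choices(keys, cum_weights=cum_weights, k=length)


stream = zipf_stream(length=1000, skew=1.2)
print(len(stream))  # prints 1000
```

A larger skew concentrates the stream on fewer keys, so fewer elements exceed a given threshold φ by a wide margin.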
17. 17
Experiments: Precision
§Ours achieved competitive precision for skewness z ≥ 1.4.
• Ours output more false positives than the others for skewness z < 1.4.
• For z < 1.4, ours should have used a larger ε to suppress false positives.
• Recall of all methods was 100%.
[Figure: precision (%) versus skewness z (0.8 to 2.0), one panel per threshold φ in {0.0001, 0.0005, 0.001, 0.005, 0.01}. Methods: Ours, CGT(2), CGT(16) [CM, TODS 2005], CMH(2), CMH(16) [CM, LATIN 2005].]
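Precision and recall as used on this slide can be computed from the reported set and the true set of φ-heavy hitters. The helper below is a minimal sketch; the name `precision_recall` and the convention of returning 1.0 for an empty set are assumptions.

```python
def precision_recall(reported, truth):
    """Precision = TP / |reported|, recall = TP / |truth| (1.0 when a set is empty)."""
    reported, truth = set(reported), set(truth)
    true_positives = len(reported & truth)
    precision = true_positives / len(reported) if reported else 1.0
    recall = true_positives / len(truth) if truth else 1.0
    return precision, recall


# One false positive (4) and no missed heavy hitter.
print(precision_recall(reported={1, 2, 3, 4}, truth={1, 2, 3}))  # prints (0.75, 1.0)
```

Recall of 100% with lower precision, as in this example, corresponds to reporting every heavy hitter plus some extra elements.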
18. 18
Experiments: Update time
§Ours achieved update throughput competitive with CMH(16).
• CMH(16) achieved the best and most stable update throughput.
• CGT(16) depended heavily on φ, even though in theory it does not.
• CGT(2) and CMH(2) were not competitive.
[Figure: update throughput (updates/msec) versus skewness z (0.8 to 2.0), one panel per threshold φ in {0.0001, 0.0005, 0.001, 0.005, 0.01}. Methods: Ours, CGT(2), CGT(16) [CM, TODS 2005], CMH(2), CMH(16) [CM, LATIN 2005].]
Note: Median of 5 measured times is reported.
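A throughput figure based on the median of 5 timed runs could be obtained along these lines. This is only a sketch of the measurement procedure: the harness name `updates_per_msec`, the factory argument `make_update`, and the stand-in `Counter` structure are assumptions, not the benchmark code behind the reported numbers.

```python
import statistics
import time
from collections import Counter


def updates_per_msec(make_update, stream, repeats=5):
    """Time `repeats` full passes over the stream and report the median throughput."""
    elapsed = []
    for _ in range(repeats):
        update = make_update()  # fresh data structure for every run
        start = time.perf_counter()
        for x in stream:
            update(x)
        elapsed.append(time.perf_counter() - start)
    return len(stream) / (statistics.median(elapsed) * 1000.0)


def make_counter_update():
    """Stand-in structure: an exact Counter, used only to exercise the harness."""
    counts = Counter()

    def update(x):
        counts[x] += 1

    return update


throughput = updates_per_msec(make_counter_update, list(range(100_000)))
print(f"{throughput:.0f} updates/msec")
```

Using the median rather than the mean makes the reported value less sensitive to a single slow run.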
19. 19
Experiments: Query time
§Ours achieved the best query throughput except for φ = 0.0001.
• Note: ε = φ and r = O(1) in our experiments.
• The CGT family (including ours) must examine Θ(1/φ) candidate heavy hitters.
• The CMH family is output sensitive: it is fast if the number of heavy hitters is less than 1/φ.
[Figure: query throughput (queries/msec) versus skewness z (0.8 to 2.0), one panel per threshold φ in {0.0001, 0.0005, 0.001, 0.005, 0.01}. Methods: Ours, CGT(2), CGT(16) [CM, TODS 2005], CMH(2), CMH(16) [CM, LATIN 2005].]
20. 20
Conclusion
§The φ-Heavy Hitters Problem in the strict turnstile model.
We improved CGT [CM, ACM TODS 2005] in
• Update time: from O(log(n)r) to amortized O(r)
• Query time: from O((log(n)+r)r/ε) to O(r²/ε)
using the same O(log(n)r/ε) space for a universe of size n and r = log(1/(δφ)).
§Packed Bidirectional Counter Array:
• Extension of [GF, IPL 2008] and [BT, SODA 2010] to bidirectional counters.
• Ops = inc/dec/test: O(1) amortized inc/dec and O(1) test in compact space.
§Future work
• Extension of our method to arbitrary updates.