This document discusses deduplication in resistive content-addressable memory (ReCAM) solid state drives. It describes how deduplication works traditionally using RAM and CPU, which requires complex data structures and computations. ReCAM allows for simpler deduplication by enabling the comparison of all data blocks simultaneously using the ReCAM crossbar. Simulation results show ReCAM provides over 100x higher throughput and similar or lower energy consumption than traditional deduplication approaches using RAM and CPU. ReCAM could be useful for deduplicating data in storage systems.
Roman Kaplan, Graduate Student, Technion
May 9, 2016
Deduplication in Resistive CAM Based SSD
Roman Kaplan, Leonid Yavits,
Amir Morad, Ran Ginosar
2015
Outline
1. What is ReCAM?
2. What is deduplication?
– How is it done today?
3. Deduplication in ReCAM
– How is it simpler?
4. Simulation results
Resistive CAM – What is it?
• CAM = Content Addressable Memory
1. Search for data in the entire array
2. Store addresses explicitly and function like RAM
• Memristors:
Resistive CAM – Operations
What can ReCAM do:
1. Compare all its contents to a specific word
2. Write to specific columns in parallel
3. Write to specific rows in parallel
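The compare and row-write operations listed above can be illustrated with a small software stand-in. This is only a sketch: the class name `ToyCam` and its methods are hypothetical, and the Python loop merely emulates what the hardware would do for all rows in a single step. Column writes are left out to keep the example short.

```python
from typing import List


class ToyCam:
    """Software stand-in for a content-addressable array (not a hardware model)."""

    def __init__(self, rows: List[bytes]) -> None:
        self.rows = list(rows)

    def compare(self, word: bytes) -> List[bool]:
        # In hardware every row is compared in the same step; the loop
        # here only emulates that and returns one tag bit per row.
        return [row == word for row in self.rows]

    def write_rows(self, tags: List[bool], word: bytes) -> None:
        # Write the same word to every tagged row, emulating a parallel write.
        self.rows = [word if tag else row for row, tag in zip(self.rows, tags)]


cam = ToyCam([b"AAAA", b"BBBB", b"AAAA"])
tags = cam.compare(b"AAAA")   # one tag per row
cam.write_rows(tags, b"CCCC")  # overwrite all matching rows at once
```

The tag list returned by `compare` is what makes the later deduplication flow simple: a single search tells the controller whether a block already exists and where.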
What is Deduplication?
1. Data is broken into fixed-size blocks
2. A fingerprint (FP) is calculated for each block
3. Identical blocks aren’t stored (deduplicated)
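The three steps above can be sketched in a few lines. The 4 KB block size and the choice of SHA-256 as the fingerprint are assumptions made for illustration; the slides only say "fixed blocks" and "fingerprint". The function names are hypothetical.

```python
import hashlib
from typing import Dict, List

BLOCK_SIZE = 4096  # assumed block size for this sketch


def split_blocks(data: bytes, block_size: int = BLOCK_SIZE) -> List[bytes]:
    # Step 1: break the data into fixed-size blocks.
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]


def fingerprint(block: bytes) -> str:
    # Step 2: compute a fingerprint. SHA-256 is an assumption here.
    return hashlib.sha256(block).hexdigest()


def deduplicate(data: bytes) -> Dict[str, bytes]:
    # Step 3: keep only one copy of each distinct block.
    unique: Dict[str, bytes] = {}
    for block in split_blocks(data):
        unique.setdefault(fingerprint(block), block)
    return unique


data = b"A" * BLOCK_SIZE * 3 + b"B" * BLOCK_SIZE  # four blocks, two distinct
stored = deduplicate(data)
```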
Deduplication Uses
1. Useful when there is repeating data
– Virtual machines
– WAN optimizations (networking)
– Backups
2. Compression ratio depends on the type of data and can reach up to 40x
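As a rough illustration of what such a ratio means, it can be computed as the number of blocks written divided by the number of blocks actually stored. This definition and the numbers below are assumptions for the example, not measurements from the slides.

```python
def dedup_ratio(logical_blocks: int, unique_blocks: int) -> float:
    """Blocks written by the host divided by blocks actually stored."""
    if unique_blocks <= 0:
        raise ValueError("unique_blocks must be positive")
    return logical_blocks / unique_blocks


# Illustrative numbers only: 100 blocks written, 25 of them distinct.
ratio = dedup_ratio(logical_blocks=100, unique_blocks=25)
```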
Deduplication using RAM+CPU: Write
1. Calculate the FP (hash)
2. Search for it in the chunk index (takes a very long time)
3. Act accordingly (next slides)
[Figure: the hash of an incoming data block is looked up in the chunk index, a table with the columns Fingerprint, Physical Address, and CNT, holding entries for blocks A, B, and C.]
RAM+CPU Deduplication: Write (Case 1)
Case 1: If the FP is found, the data block already exists
I. Add LA+PA (logical address and physical address) to the ATT (address translation table)
II. Increment the FP counter in the chunk index
[Figure: chunk index (Fingerprint, Physical Address, CNT), address decoder, data block storage (A, B, C, D), and address translation table (Logical Address → Physical Address). A second logical address, LA2(D), is mapped to the existing PA(D), and the CNT of Hash(D) is incremented from 1 to 2.]
RAM+CPU Deduplication: Write (Case 2)
Case 2: If the FP is not found, this is a unique data block
I. Write the block to storage
II. Add LA+PA to the ATT
III. Add the FP to the chunk index
[Figure: chunk index (Fingerprint, Physical Address, CNT), address decoder, data block storage, and address translation table (Logical Address → Physical Address). New block D is written to storage at PA(D), LA(D) → PA(D) is added to the address translation table, and Hash(D) with PA(D) and CNT = 1 is added to the chunk index.]
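The two write cases for RAM+CPU deduplication can be summarized in a short sketch. The class and field names are hypothetical, SHA-256 stands in for the unspecified hash, and a Python list stands in for the block storage, with the list position playing the role of the physical address. Overwriting an already mapped logical address is not handled.

```python
import hashlib
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class IndexEntry:
    physical_address: int
    count: int  # the CNT column of the chunk index


class RamCpuDedup:
    def __init__(self) -> None:
        self.chunk_index: Dict[str, IndexEntry] = {}  # fingerprint -> (PA, CNT)
        self.att: Dict[int, int] = {}                 # LA -> PA
        self.storage: List[bytes] = []                # list position acts as PA

    def write(self, logical_address: int, block: bytes) -> int:
        fp = hashlib.sha256(block).hexdigest()  # assumed hash function
        entry = self.chunk_index.get(fp)
        if entry is not None:
            # Case 1: FP found, the block already exists. Only bump the counter.
            entry.count += 1
        else:
            # Case 2: FP not found. Write the block and add the FP to the index.
            self.storage.append(block)
            entry = IndexEntry(physical_address=len(self.storage) - 1, count=1)
            self.chunk_index[fp] = entry
        # Both cases: record LA -> PA in the address translation table.
        self.att[logical_address] = entry.physical_address
        return entry.physical_address


dedup = RamCpuDedup()
pa1 = dedup.write(0, b"D" * 4096)  # unique block (case 2)
pa2 = dedup.write(1, b"D" * 4096)  # duplicate (case 1)
pa3 = dedup.write(2, b"E" * 4096)  # unique block (case 2)
```

A Python dict hides the cost of the index lookup. In a real system the chunk index is far larger than a cache and is the slow step the slides refer to.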
Deduplication is Hard with RAM+CPU
• Delete is even more complicated than write
• Requires complex data structures and computations, which in turn need large memory and CPU resources
• Example: EMC XtremIO Xbrick
– 5TB all-flash storage
– 256GB RAM
– Quad-core CPU
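The slides do not spell out the delete steps, so the following is only a sketch of one plausible reference-counting scheme. It assumes a reverse map from physical address to fingerprint (`pa_to_fp`, a hypothetical extra structure), which hints at why delete needs more bookkeeping than write.

```python
from typing import Dict, List, Optional


def delete(
    logical_address: int,
    att: Dict[int, int],                # LA -> PA
    pa_to_fp: Dict[int, str],           # assumed reverse map: PA -> fingerprint
    chunk_index: Dict[str, List[int]],  # fingerprint -> [PA, CNT]
    storage: List[Optional[bytes]],
) -> None:
    pa = att.pop(logical_address)  # remove the LA -> PA mapping
    fp = pa_to_fp[pa]              # extra lookup needed to find the index entry
    chunk_index[fp][1] -= 1        # decrement the reference counter
    if chunk_index[fp][1] == 0:
        # Last reference is gone: drop the index entry and free the block.
        del chunk_index[fp]
        del pa_to_fp[pa]
        storage[pa] = None


att = {0: 0, 1: 0}
pa_to_fp = {0: "fp-D"}
chunk_index = {"fp-D": [0, 2]}
storage: List[Optional[bytes]] = [b"D"]

delete(0, att, pa_to_fp, chunk_index, storage)
after_first = (chunk_index["fp-D"][1], storage[0])  # block still referenced
delete(1, att, pa_to_fp, chunk_index, storage)      # last reference removed
```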
Deduplication in ReCAM
• Much simpler than with RAM
• The chunk index is no longer required
• Allows comparing all data blocks in storage simultaneously
– If a match is found, store only an address pointer
Deduplication in ReCAM
1. Search for the new data block in the storage
2. Act accordingly (next slides)
[Figure: an incoming data block is searched for among the data blocks in storage (A, B, C) at physical addresses PA(A), PA(B), PA(C).]
Deduplication in ReCAM
Case 1: If the data is found, the data block already exists
I. Add the address to the ATT
[Figure: storage holding data blocks A, B, C, D at physical addresses PA(A) to PA(D), and the address translation table (Logical Address → Physical Address), where a second logical address, LA2(D), is mapped to the existing PA(D).]
Deduplication in ReCAM
Case 2: If the data is not found, this is a new data block
I. Write the data to storage
II. Add the address to the ATT
[Figure: new block D is written to storage at PA(D), and the entry LA(D) → PA(D) is added to the address translation table (Logical Address → Physical Address).]
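The ReCAM write flow, with both cases, can be sketched as follows. The class name is hypothetical, and the list comprehension in `_search` only emulates the hardware's simultaneous compare of every stored block. Note that no chunk index and no fingerprint appear anywhere in the flow.

```python
from typing import Dict, List, Optional


class ReCamDedup:
    def __init__(self) -> None:
        self.storage: List[bytes] = []  # list position acts as PA
        self.att: Dict[int, int] = {}   # LA -> PA

    def _search(self, block: bytes) -> Optional[int]:
        # Emulates the parallel compare: one tag per stored block.
        tags = [stored == block for stored in self.storage]
        return tags.index(True) if True in tags else None

    def write(self, logical_address: int, block: bytes) -> int:
        pa = self._search(block)
        if pa is None:
            # Case 2: data not found. Write the new block to storage.
            self.storage.append(block)
            pa = len(self.storage) - 1
        # Both cases: add the address to the address translation table.
        self.att[logical_address] = pa
        return pa


recam = ReCamDedup()
first = recam.write(0, b"D" * 4096)   # new block (case 2)
second = recam.write(1, b"D" * 4096)  # duplicate, pointer only (case 1)
third = recam.write(2, b"E" * 4096)   # new block (case 2)
```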
Deduplication in ReCAM
Much simpler than with RAM
• Write:
1. Compare the entire array data simultaneously
2. If there is a match, save only a pointer
3. If not, save the data block + pointer
• Delete isn’t more complicated than write
– If no addresses point to the data, delete it
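The delete rule above can be sketched in a few lines. How the check for remaining pointers is carried out is not stated in the slides; scanning the address translation table, as done here, is an assumption, and the function name is hypothetical.

```python
from typing import Dict, List, Optional


def recam_delete(
    logical_address: int,
    att: Dict[int, int],             # LA -> PA
    storage: List[Optional[bytes]],
) -> None:
    pa = att.pop(logical_address)  # remove the pointer
    # Assumed check: does any remaining ATT entry still point to this block?
    if pa not in att.values():
        storage[pa] = None  # no addresses point to the data, so delete it


att = {0: 0, 1: 0, 2: 1}
storage: List[Optional[bytes]] = [b"D", b"E"]

recam_delete(0, att, storage)  # block D is still referenced by LA 1
recam_delete(2, att, storage)  # block E had a single pointer and is removed
```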
Simulations
• ReCAM
– Cycle-accurate simulator: size = 256GB, clock = 1GHz
– SPICE simulation for per-cycle power and performance
• Opendedup for comparison
– Intel PCM for CPU+DRAM energy
– Only deduplication energy was measured
– Per-block processing time for performance
• 50GB of writes
– Varying % of duplicate data
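A write stream with a controllable share of duplicates could be generated along the following lines. This is not the workload generator used in the study; the function name, parameters, and the way duplicates are chosen are all assumptions for illustration.

```python
import random
from typing import Iterator, List


def generate_writes(
    num_blocks: int,
    duplicate_fraction: float,
    block_size: int = 4096,
    seed: int = 0,
) -> Iterator[bytes]:
    """Yield blocks, each being a repeat of an earlier block with the given probability."""
    rng = random.Random(seed)
    seen: List[bytes] = []
    for _ in range(num_blocks):
        if seen and rng.random() < duplicate_fraction:
            yield rng.choice(seen)  # re-send a block that was written before
        else:
            # Fresh random content; the first block is always fresh.
            block = rng.getrandbits(block_size * 8).to_bytes(block_size, "big")
            seen.append(block)
            yield block


blocks = list(generate_writes(num_blocks=200, duplicate_fraction=0.5, block_size=64))
measured = 1 - len(set(blocks)) / len(blocks)  # observed share of duplicates
```

Sweeping `duplicate_fraction` from 0 to 1 and feeding the blocks to a deduplication model gives the kind of varying-duplicate experiment the slide describes, on a much smaller scale than 50GB.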
Conclusions
• ReCAM has 100x higher throughput than deduplication with RAM+CPU
• Energy consumption is similar or lower for the common block sizes (4 & 8KB)
• Can be used as a cache in hybrid storage systems
• Future technology may allow for TBs of storage on a single chip