3. Why extend main memory with flash?
• To overcome DRAM scaling limitations and offer large working memory
• To reduce total cost of ownership (acquisition and operation)
• Flash has no seek time
• Flash has much lower latency than HDD
Two approaches toward memory extension
• Non-transparent approach: the application has to change
• Transparent approach: the application is NOT aware of the underlying flash
Introduction
4. Current swap algorithm is optimized for HDD
Paging for a fast device
• Fast and Simple vs. Heavy and Accurate
Motivation
5. Swap entry search
• A new search algorithm
I/O path optimization
• Swap read-ahead
• I/O scheduler
• Swappiness
Swap device as backing store: Inclusive vs. Exclusive
• We adjust the swap-entry free policy so that the swap device always
“includes” all swapped-out pages
Optimized SWAP
6. Tree search
• “Bit tree”: no pointers; each node is just one byte
• Fan-out degree is 8 (each bit points to one child node)
• An 8-level tree covers multiple terabytes of swap space
• Search cost: O(log N)
• Reduces swap structure size
– Current swap mechanism vs. O-Swap: roughly 10MB vs. 2MB (to support 32GB
of swap space)
Optimized SWAP
7. Read-ahead
• No read-ahead (due to randomness)
• Note also that SSD has no seek time
I/O scheduler
• NOOP (due to randomness and fast response requirements)
• Bypass
Swappiness
• swappiness : 0
Swap entry reclaim policy
• Avoid freeing swap entries whenever possible
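On a stock Linux system, the read-ahead, scheduler, and swappiness settings above map onto standard tunables; a minimal sketch (the NVMe device name is an assumption, and the in-kernel entry-reclaim change has no user-space knob):

```shell
# Disable swap read-ahead: vm.page-cluster is the log2 of the pages
# read per swap-in, so 0 means exactly one page per fault.
sysctl -w vm.page-cluster=0

# Swappiness 0: prefer reclaiming file cache over swapping anonymous
# pages.
sysctl -w vm.swappiness=0

# NOOP I/O scheduler (named "none" on blk-mq kernels, which NVMe uses):
echo none > /sys/block/nvme0n1/queue/scheduler
```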
Optimized SWAP
8. Evaluation - Memcached
System
  CPU       Xeon E5-2665 (HT disabled)
  # Cores   16
  Network   10Gb Ethernet
  SSD       Samsung XS1715 (NVMe)
Workload
  Benchmark            YCSB
  DB size              30GB
  Value length         2048B
  # memcached threads  64
  # Clients            320
  Get : Update         95% : 5%
Memory configurations
  SWAP       DRAM 8GB + SSD swap 32GB
  OSWAP      DRAM 8GB + SSD swap 32GB
  Full DRAM  DRAM 32GB
15. Rack scale architecture
High performance memory + High capacity memory
Future Work
[Figure: a compute node (CPUs + DRAM) connected over a PCIe memory cable to a memory device containing controllers and memory modules]