2. About Myself
• Senior System Software Architect at Hewlett Packard Enterprise
• Ph.D. Candidate at NTU CSIE
• Father of
3. Agenda
• Traditional Storage vs Memory-Style Storage
• Introduction to Persistent Memory / NVM / SCM
• PMEM Support in Linux
• Emulating PMEM in Linux
• References
7. Memory Hierarchy Shift in NVM Era
• Access latency in CPU cycles, from the top of the hierarchy down:
• Register: 1
• L1 cache, L2/L3 cache (SRAM): 10
• Main memory (DRAM): 100
• NVM (ReRAM, PCM, 3D XPoint): 10^3 - 10^5
• Mass storage (NAND Flash): 10^5 - 10^6
• Mass storage, archive (HDD): 10^7 - 10^8
• Upper layers: fast, byte-addressable, volatile, high refresh power
• Lower layers: slow, large, low-cost, non-volatile, no refresh
• Memory-Storage Gap: filled by NVM, which is fast, byte-addressable, large, low-cost, and non-volatile
• NVM blurs the line between Memory and Storage
8. 3D XPoint
• 1000X faster than NAND
• 1000X endurance of NAND
• 10X denser than conventional memory
Source: http://www.intel.com/content/www/us/en/architecture-and-technology/3d-xpoint-unveiled-video.html
Intel 3D XPoint
9. Storage Architecture Shift in NVM Era
• We can attach NVM to the traditional I/O bus as a drop-in replacement for Storage
• Or, directly attach NVM to the fast memory bus as Storage
• Best way to unlock the performance of NVM
• NVDIMM - Non-Volatile Dual-Inline Memory Module
[Diagram: CPU; DRAM (Main Memory) on the memory bus; Flash/Disk (Storage) on the I/O bus behind the Northbridge/Southbridge; NVM]
11. DAX-enabled File System (1/2)
• Page Cache
• System software that mediates between fast memory and slow storage
• Elegant design
• Read-ahead
• Write-back policy
• …
• Valid for decades, but now becoming the NEW bottleneck
[Diagram: the CPU reads and writes Storage through the Page Cache in Memory]
12. DAX-enabled File System (2/2)
[Diagram, two panels. Before DAX: CPU, Memory Controller, DRAM (cache), NVM (file data); Page Cache: unnecessary data copy. DAX-enabled: CPU, Memory Controller, DRAM (cache), NVM (file data)]
• DAX – Direct Access
• Use existing mmap semantics
• In-place update
• No storage stack involved
• True device performance
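As a minimal sketch of the "existing mmap semantics" and "in-place update" bullets above, the C function below maps a file and updates it with ordinary stores. The name update_in_place and the path passed to it are hypothetical, and msync is used here as a portable flush. On a regular (non-DAX) file the same code still goes through the page cache; only on a DAX mount would the mapping refer to the NVM pages themselves.

```c
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Map `path` and overwrite its first `len` bytes in place with plain stores.
 * Returns 0 on success, -1 on error. `len` must be greater than zero.
 * The function name and the path are illustrative, not part of any library. */
int update_in_place(const char *path, const char *msg, size_t len)
{
    int fd = open(path, O_RDWR | O_CREAT, 0644);
    if (fd < 0)
        return -1;

    /* Make sure the file is large enough to back the mapping. */
    if (ftruncate(fd, (off_t)len) != 0) {
        close(fd);
        return -1;
    }

    char *p = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    close(fd);                      /* the mapping stays valid after close */
    if (p == MAP_FAILED)
        return -1;

    memcpy(p, msg, len);            /* in-place update: no read()/write() calls */

    int rc = msync(p, len, MS_SYNC); /* ask the kernel to make the update durable */
    munmap(p, len);
    return rc;
}
```

A call such as update_in_place("/mnt/pmem/Hello.txt", "Hello World!", 13) would store the string and its terminator through the mapping, with no explicit file I/O in the data path.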
13. Is DAX enough?
• Simple answer: No
• In-place update
• Cannot do journaling
• Journaling requires out-of-place updates
• What? Crash consistency is not guaranteed!
14. Crash Consistency
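• strcpy(pmem, "Hello World!"); interrupted by a crash
• “Ensure that the file system keeps the on-disk image in a reasonable state given that crashes can occur at arbitrary points in time.” [Remiz14]
• Possible contents of pmem after the crash:
• X "Hello" (partial write, inconsistent)
• V "Hello World!" (write completed)
• V empty (write never started)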
15. How to ensure Crash Consistency?
[Diagram: Application, NVM-Lib*, and NVM; the application uses mmap and load/store for Direct Access to the NVM through Ext4-DAX]
* NVM-Lib: helper library for user-handled data consistency, needed because the file system is bypassed
• Ext4-DAX maps the physical pages on the NVM directly into user space
• Consistency is not guaranteed on a system crash or power loss; NVM-Lib is needed
• Requires program change! Big issue!
TX_BEGIN(TX_LOCK_MUTEX, &op->lock) {
    TX_STRCPY(buffer, "Hello World!");
} TX_END
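To illustrate what a transaction like the one above has to do under the hood, here is a minimal undo-log sketch in plain C. It is not the NVM-Lib implementation: the struct layout, tx_strcpy, recover, and persist are hypothetical names, and persist is a no-op placeholder where real code would need a cache-line flush and a fence for each step.

```c
#include <stdint.h>
#include <string.h>

#define BUF_SIZE 32

/* Hypothetical layout of a small persistent region (not the NVM-Lib format). */
struct region {
    uint32_t log_valid;        /* 1 while an update is in flight */
    char     undo[BUF_SIZE];   /* copy of the old contents */
    char     data[BUF_SIZE];   /* live data, updated in place */
};

/* Placeholder for a real persist barrier (cache-line flush plus fence). */
static void persist(const void *addr, size_t len)
{
    (void)addr;
    (void)len;
}

/* Replace r->data with src so that a crash at any point can be rolled back. */
void tx_strcpy(struct region *r, const char *src)
{
    memcpy(r->undo, r->data, BUF_SIZE);          /* 1. save the old contents */
    persist(r->undo, BUF_SIZE);

    r->log_valid = 1;                            /* 2. mark the log as valid */
    persist(&r->log_valid, sizeof r->log_valid);

    strncpy(r->data, src, BUF_SIZE - 1);         /* 3. update in place */
    r->data[BUF_SIZE - 1] = '\0';
    persist(r->data, BUF_SIZE);

    r->log_valid = 0;                            /* 4. commit: discard the log */
    persist(&r->log_valid, sizeof r->log_valid);
}

/* Run after a restart: undo an update that did not reach its commit point. */
void recover(struct region *r)
{
    if (r->log_valid) {
        memcpy(r->data, r->undo, BUF_SIZE);
        persist(r->data, BUF_SIZE);
        r->log_valid = 0;
        persist(&r->log_valid, sizeof r->log_valid);
    }
}
```

The ordering is the point of the sketch: the old contents must be durable before the in-place write starts, so that recovery sees either the old string or the new one, never a partial "Hello".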
16. Playing with DAX / Emulating PM in Linux
• PM is supported in Linux since v4.2
• Can be emulated by DRAM
• Can test the new software stack
• # vi /etc/default/grub
GRUB_CMDLINE_LINUX="memmap=16G!16G"
(reserves 16G of DRAM, starting at physical offset 16G, as emulated PM)
• # update-grub2
• # reboot
• # mkdir /mnt/pmem
• # mkfs.ext4 /dev/pmem0
• # mount -o dax /dev/pmem0 /mnt/pmem
• # echo 'Hello World!' > /mnt/pmem/Hello.txt
• # cat /mnt/pmem/Hello.txt
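One way to check that a file on the new mount is really mapped with DAX is to request a MAP_SYNC mapping, which the kernel grants only for files that support DAX. The sketch below assumes Linux 4.15 or later, where MAP_SYNC and MAP_SHARED_VALIDATE exist; the function name is_dax_mapping is hypothetical, and the fallback flag values are the ones from the kernel's generic mman headers, supplied in case the C library headers are older.

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <sys/mman.h>
#include <unistd.h>

/* Fallback definitions for older libc headers (values from the Linux UAPI). */
#ifndef MAP_SHARED_VALIDATE
#define MAP_SHARED_VALIDATE 0x03
#endif
#ifndef MAP_SYNC
#define MAP_SYNC 0x080000
#endif

/* Returns 1 if `path` can be mapped with MAP_SYNC (DAX is active),
 * 0 if the kernel refuses the mapping, and -1 if the file cannot be opened.
 * The file should be at least one page long. */
int is_dax_mapping(const char *path)
{
    int fd = open(path, O_RDWR);
    if (fd < 0)
        return -1;

    /* MAP_SYNC is only honored together with MAP_SHARED_VALIDATE; on a
     * non-DAX file, or on a kernel without these flags, mmap fails. */
    void *p = mmap(NULL, 4096, PROT_READ | PROT_WRITE,
                   MAP_SHARED_VALIDATE | MAP_SYNC, fd, 0);
    close(fd);
    if (p == MAP_FAILED)
        return 0;

    munmap(p, 4096);
    return 1;
}
```

On the emulated setup above, is_dax_mapping("/mnt/pmem/Hello.txt") would be expected to return 1 once the file has been extended to at least one page, and 0 for a file on an ordinary disk-backed mount.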
17. Key Takeaways!
• New memory technology makes storage ultra-fast!
• DRAM-like Performance
• Storage-like Persistence
• System Software becoming the NEW bottleneck!
• New Programming Model (sometimes may not be a good idea)
• New File System!
• New Operating System?
• Try it now!
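First, consider the traditional memory hierarchy. At the top are registers, which can be accessed in a single CPU cycle. Next come the L1/L2/L3 caches, usually built with SRAM, which take ten to a few tens of CPU cycles to access. Below them is main memory, built from DRAM, which takes about 100 CPU cycles. Further down is the mass storage layer, built from NAND Flash or HDD, with access latencies of roughly 10^5 to 10^8 CPU cycles. Layers near the top of the hierarchy are fast, byte-addressable, and volatile, and they need periodic, power-hungry refresh; layers near the bottom are large, low-cost, and non-volatile, and need no periodic refresh.
The speed difference of 10^3 or more between DRAM and mass storage is what we call the Memory-Storage Gap.
Emerging NVM technologies such as RRAM, PCM, and Intel's recently introduced 3D XPoint are maturing and reaching the market. These memories are fast, byte-addressable, large, low-cost, and non-volatile, with access latencies between 10^3 and 10^5 CPU cycles. This adds a new layer to the memory hierarchy, which we call storage class memory (SCM): a new type of memory that is high-performance, high-capacity, and non-volatile.
That is why we say the characteristics of NVM have essentially blurred the line between memory and storage.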
Message: Intel/Micron invented a breakthrough memory technology that is 1000x faster than NAND, has 1000x greater endurance than NAND, and is 10x denser than DRAM. This puts Micron in a position to help our customers drive innovative new computing architectures.