Introduction to Memory-Style Storage in Linux
Clay Chang
COSCUP’17
6-Aug-2017
About Myself
• Senior System Software Architect at Hewlett Packard Enterprise
• Ph.D. Candidate at NTU CSIE
• Father of
Agenda
• Traditional Storage vs Memory-Style Storage
• Introduction to Persistent Memory / NVM / SCM
• PMEM Support in Linux
• Emulating PMEM in Linux
• References
Traditional Storage Stack
• Application (open, read, write, close, …, mmap)
• VFS (ext4, xfs, nfs, …)
• Page Cache
• Block Layer (I/O scheduler, blk-mq), fed with Block I/Os (BIOs)
• SCSI Upper-Level Drivers (/dev/sd*, …)
• SCSI Low-Level Drivers (libata, ahci, …), fed with Driver Requests
• Physical Devices (HDD, SSD, …)
* https://www.thomas-krenn.com/en/wiki/Linux_Storage_Stack_Diagram
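The system-call path at the top of this stack can be sketched in C. This is a minimal illustration, not code from the talk: the function names write_via_syscalls and read_via_syscalls are made up for this example. It shows that write() only hands data to the page cache, and that fsync() is what pushes it down through the block layer to the device.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: write msg to path through the classic path
 * open -> write -> fsync -> close. Returns 0 on success, -1 on error. */
int write_via_syscalls(const char *path, const char *msg)
{
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0)
        return -1;

    size_t len = strlen(msg);
    size_t done = 0;
    while (done < len) {                      /* write() may be partial */
        ssize_t n = write(fd, msg + done, len - done);
        if (n < 0) {
            close(fd);
            return -1;
        }
        done += (size_t)n;
    }

    /* So far the data sits in the page cache; fsync() asks the kernel to
     * send it through the block layer and drivers to the device. */
    int rc = fsync(fd);
    if (close(fd) != 0)
        rc = -1;
    return rc;
}

/* Hypothetical helper: read up to cap-1 bytes from path into buf and
 * NUL-terminate it. Returns the number of bytes read, or -1 on error. */
ssize_t read_via_syscalls(const char *path, char *buf, size_t cap)
{
    if (cap == 0)
        return -1;
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    size_t done = 0;
    while (done < cap - 1) {
        ssize_t n = read(fd, buf + done, cap - 1 - done);
        if (n < 0) {
            close(fd);
            return -1;
        }
        if (n == 0)                           /* end of file */
            break;
        done += (size_t)n;
    }
    buf[done] = '\0';
    close(fd);
    return (ssize_t)done;
}
```

Every byte crosses the user/kernel boundary and is copied into or out of the page cache, which is the overhead the following slides set out to avoid.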
Memory-Style Storage
• Application (open, read, write, close, …, mmap)
• VFS (ext4, xfs, nfs, …)
• Page Cache
• Block Layer (I/O scheduler, blk-mq), fed with Block I/Os (BIOs)
• SCSI Upper-Level Drivers (/dev/sd*, …)
• SCSI Low-Level Drivers (libata, ahci, …), fed with Driver Requests
• Physical Devices (Persistent Memory)
• Load/Store: after mmap, the application reaches Persistent Memory with plain CPU loads and stores
* https://www.thomas-krenn.com/en/wiki/Linux_Storage_Stack_Diagram
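The load/store path can be sketched with ordinary mmap. This is an illustration under assumptions: the names store_via_mmap and load_via_mmap and the 4096-byte region size are invented for the example. On an ordinary file system the mapping is still backed by the page cache; only on a DAX mount of persistent memory would the same stores land on the device itself.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

#define REGION_SIZE 4096   /* one page; an arbitrary size for this sketch */

/* Hypothetical helper: map path and update it with plain CPU stores.
 * Returns 0 on success, -1 on error. */
int store_via_mmap(const char *path, const char *msg)
{
    size_t len = strlen(msg) + 1;             /* include the NUL */
    if (len > REGION_SIZE)
        return -1;

    int fd = open(path, O_RDWR | O_CREAT, 0644);
    if (fd < 0)
        return -1;
    if (ftruncate(fd, REGION_SIZE) != 0) {
        close(fd);
        return -1;
    }

    char *p = mmap(NULL, REGION_SIZE, PROT_READ | PROT_WRITE,
                   MAP_SHARED, fd, 0);
    close(fd);                                /* the mapping stays valid */
    if (p == MAP_FAILED)
        return -1;

    memcpy(p, msg, len);                      /* stores, no write() call */

    int rc = msync(p, REGION_SIZE, MS_SYNC);  /* make the update durable */
    if (munmap(p, REGION_SIZE) != 0)
        rc = -1;
    return rc;
}

/* Hypothetical helper: map path read-only and copy its NUL-terminated
 * contents into out using plain loads. Returns 0 on success, -1 on error. */
int load_via_mmap(const char *path, char *out, size_t cap)
{
    if (cap == 0)
        return -1;
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    struct stat st;
    if (fstat(fd, &st) != 0 || st.st_size < REGION_SIZE) {
        close(fd);                            /* avoid touching past EOF */
        return -1;
    }

    const char *p = mmap(NULL, REGION_SIZE, PROT_READ, MAP_SHARED, fd, 0);
    close(fd);
    if (p == MAP_FAILED)
        return -1;

    const char *end = memchr(p, '\0', REGION_SIZE);
    size_t n = end ? (size_t)(end - p) : REGION_SIZE;
    if (n >= cap)
        n = cap - 1;
    memcpy(out, p, n);
    out[n] = '\0';

    munmap((void *)p, REGION_SIZE);
    return 0;
}
```

Once the mapping exists, no system call is needed per access, which is what lets byte-addressable media be used at memory speed.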
Introduction to Persistent Memory / NVM / SCM
• Emerging Memory Technology
• Intel 3D XPoint, Phase Change Memory (PCM), Memristor, etc.
• DRAM-like Performance
• Storage-like Persistence
• No refresh charge / Non-volatile
• High Capacity / Low Cost
• Byte-addressable
• NVM: Non-Volatile Memory
• SCM: Storage Class Memory
Memory Hierarchy Shift in NVM Era
(Figure: memory hierarchy with approximate access latency in CPU cycles)
• Register: 1
• SRAM (L1 cache; L2, L3 cache): 10
• DRAM (main memory): 100
• Upper layers: fast, byte-addressable, volatile, high refresh power
• Memory-Storage Gap, filled by NVM (ReRAM, PCM, 3D XPoint): 10^3 - 10^5
• NVM: fast, byte-addressable, large, low-cost, non-volatile
• NAND Flash (mass storage): 10^5 - 10^6
• HDD (mass storage, archive): 10^7 - 10^8
• Lower layers: slow, large, low-cost, non-volatile, no refresh
• NVM blurs the line between Memory and Storage
Intel 3D XPoint
• 1000X faster than NAND
• 1000X endurance of NAND
• 10X denser than conventional memory
Source: http://www.intel.com/content/www/us/en/architecture-and-technology/3d-xpoint-unveiled-video.html
Storage Architecture Shift in NVM Era
• We can attach NVM to the traditional I/O bus as a drop-in replacement for Storage
• Or, directly attach NVM to the fast memory bus as Storage
• Best way to unlock the performance of NVM
• NVDIMM - Non-Volatile Dual In-line Memory Module
(Figure: CPU, Northbridge and Southbridge; Main Memory (DRAM) on the memory bus; Storage (Flash/Disk) on the I/O bus; NVM)
Software Architecture for Persistent Memory
DAX-enabled File System (1/2)
• Page Cache
• System software to mediate between fast memory and slow storage
• Elegant Design
• Read-ahead
• Write-back policy
• …
• Valid for decades, until now
• Becoming the NEW bottleneck
(Figure: CPU reads and writes go through the Page Cache in Memory on their way to Storage)
DAX-enabled File System (2/2)
(Figure: two panels, "Before DAX" and "DAX-enabled", each showing CPU, Memory Controller, DRAM (cache) and NVM (file data); before DAX, the Page Cache adds an unnecessary data copy)
• DAX - Direct Access
• Use existing mmap semantics
• In-place update
• No storage stack involved
• True device performance
Is DAX enough?
• Simple answer: No
• In-place update
• Cannot do journaling
• Out-of-place updates are required for journaling
• What? Crash consistency is not guaranteed!
Crash Consistency
• “Ensure that the file system keeps the on-disk image in a reasonable state given that crashes can occur at arbitrary points in time.” [Remiz14]
• strcpy(pmem, "Hello World!");
• Crash! in the middle of the copy: pmem holds only "Hello" (inconsistent)
• Copy completes: pmem holds "Hello World!" (consistent)
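The torn "Hello" state can be reproduced with a toy simulation. This is only a sketch: copy_until_crash is a made-up helper that stops a strcpy-style copy after a given number of bytes. Real hardware persists data in larger units and not necessarily in program order, so the byte-by-byte cut-off here is a simplification.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy src into dst as strcpy would, but stop after
 * crash_after bytes to mimic a crash or power loss during the copy.
 * Returns the number of bytes that actually reached dst. */
size_t copy_until_crash(char *dst, const char *src, size_t crash_after)
{
    size_t total = strlen(src) + 1;           /* strcpy also writes the NUL */
    size_t n = crash_after < total ? crash_after : total;
    memcpy(dst, src, n);
    return n;
}
```

Calling copy_until_crash(pmem, "Hello World!", 5) on a zero-filled buffer leaves "Hello" behind. With in-place updates on persistent memory, that partial state is exactly what the application finds after a restart.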
How to ensure Crash Consistency?
• Ext4-DAX maps the physical page on the NVM to user space directly
• Consistency is not guaranteed when the system crashes or loses power; needs NVM-Lib
• Requires Program Change! Big Issue!
(Figure: the Application goes through NVM-Lib* to the NVM; mmap on Ext4-DAX gives Direct Access with Load/store)
* NVM-Lib: Helper Library for user-handled data consistency due to file system bypassing
TX_BEGIN(TX_LOCK_MUTEX, &op->lock) {
    TX_STRCPY(buffer, "Hello World!");
} TX_END
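To give a feel for the kind of work such a helper library takes on, here is a hand-rolled sketch of a much simpler protocol: write the payload, flush it, and only then set a "valid" flag. This is not the NVM-Lib API. The struct record layout, the function names and the use of msync as the flush primitive are all assumptions made for the illustration. Real libraries typically keep a log so the old value also survives a crash, and flush CPU caches directly rather than calling msync.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <stdint.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

#define REGION_SIZE 4096   /* one page; arbitrary for this sketch */
#define PAYLOAD_MAX 64

/* Hypothetical layout: a payload plus a flag that says it is complete. */
struct record {
    uint32_t valid;
    char payload[PAYLOAD_MAX];
};

/* Map (and create if needed) the file holding the record.
 * Returns NULL on error. A freshly created file is zero-filled,
 * so valid starts out as 0. */
struct record *record_map(const char *path)
{
    int fd = open(path, O_RDWR | O_CREAT, 0644);
    if (fd < 0)
        return NULL;
    if (ftruncate(fd, REGION_SIZE) != 0) {
        close(fd);
        return NULL;
    }
    void *p = mmap(NULL, REGION_SIZE, PROT_READ | PROT_WRITE,
                   MAP_SHARED, fd, 0);
    close(fd);
    return p == MAP_FAILED ? NULL : (struct record *)p;
}

/* Update the record so that a crash never exposes a half-written string. */
int record_commit(struct record *r, const char *msg)
{
    size_t len = strlen(msg) + 1;
    if (len > PAYLOAD_MAX)
        return -1;

    r->valid = 0;                             /* 1. retract the old value */
    if (msync(r, REGION_SIZE, MS_SYNC) != 0)
        return -1;

    memcpy(r->payload, msg, len);             /* 2. write the new payload */
    if (msync(r, REGION_SIZE, MS_SYNC) != 0)
        return -1;

    r->valid = 1;                             /* 3. publish it */
    return msync(r, REGION_SIZE, MS_SYNC);
}

/* Returns the payload only if it was fully committed, else NULL. */
const char *record_read(const struct record *r)
{
    return r->valid == 1 ? r->payload : NULL;
}
```

After a crash, a reader sees either no value or the complete string, never "Hello" alone. The price is the one the slide points out: the program has to be restructured around the protocol.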
Playing with DAX / Emulating PM in Linux
• PM is supported in Linux since v4.2
• Can be emulated by DRAM
• Can test the new software stack
• # vi /etc/default/grub
GRUB_CMDLINE_LINUX="memmap=16G!16G"
• # update-grub2
• # reboot
• # mkdir /mnt/pmem
• # mkfs.ext4 /dev/pmem0
• # mount -o dax /dev/pmem0 /mnt/pmem
• # echo "Hello World!" > /mnt/pmem/Hello.txt
• # cat /mnt/pmem/Hello.txt
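One way to confirm that the dax option took effect is to look the mount point up in the mount table. The sketch below uses the glibc <mntent.h> helpers. The function name mount_has_dax and its return convention are invented for this example.

```c
#include <mntent.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: look up mount_dir in a mount table such as
 * /proc/mounts. Returns 1 if it is mounted with a dax option,
 * 0 if it is mounted without one, -1 if it is not listed or on error. */
int mount_has_dax(const char *mounts_file, const char *mount_dir)
{
    FILE *fp = setmntent(mounts_file, "r");
    if (fp == NULL)
        return -1;

    int result = -1;
    struct mntent *m;
    while ((m = getmntent(fp)) != NULL) {
        /* Keep scanning: the last matching entry is the topmost mount. */
        if (strcmp(m->mnt_dir, mount_dir) == 0)
            result = hasmntopt(m, "dax") != NULL ? 1 : 0;
    }
    endmntent(fp);
    return result;
}
```

On the emulated setup above, mount_has_dax("/proc/mounts", "/mnt/pmem") would be expected to return 1. Newer kernels may report the option as dax=always, which hasmntopt should also match because it accepts an option name followed by "=".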
Key Takeaways!
• New memory technology makes storage ultra-fast!
• DRAM-like Performance
• Storage-like Persistence
• System Software becoming the NEW bottleneck!
• New Programming Model (sometimes may not be a good idea)
• New File System!
• New Operating System?
• Try it now!
References
• http://pmem.io
• http://pmem.io/2016/02/22/pm-emulation.html
• http://events.linuxfoundation.org/sites/events/files/slides/Managing%20Persistent%20Memory_0.pdf
• http://snia.org
• https://www.usenix.org/system/files/conference/fast16/fast16-papers-xu.pdf - NOVA
• [Remiz14] http://pages.cs.wisc.edu/~remzi/OSTEP/file-journaling.pdf
Questions?
Thank You!


Editor's Notes

  1. https://www.thomas-krenn.com/en/wiki/Linux_Storage_Stack_Diagram
  2. https://www.thomas-krenn.com/en/wiki/Linux_Storage_Stack_Diagram
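  3. First, consider the traditional memory hierarchy. At the top are the registers, which take a single CPU cycle to access. Next come the L1/L2/L3 caches, usually built with SRAM, which take ten to a few tens of CPU cycles. One level down is main memory, built from DRAM, which takes about 100 CPU cycles. Below that is the mass storage layer, built from NAND Flash or HDD, with access times of roughly 10^5 to 10^8 CPU cycles. The layers near the top of the hierarchy are fast, byte-addressable and volatile, and (in the case of DRAM) need periodic, power-hungry refresh. The layers near the bottom offer large capacity, low cost and non-volatility, and need no periodic refresh. The speed difference of 10^3 or more between DRAM and mass storage is what we call the Memory-Storage Gap. As emerging NVM technologies are developed and refined, technologies such as RRAM, PCM and Intel's recently introduced 3D XPoint are maturing and reaching the market. This kind of memory is fast, byte-addressable, large, low-cost and non-volatile, with access times between 10^3 and 10^5 CPU cycles. It therefore adds a new layer to the memory hierarchy, which we call storage class memory (SCM): a new type of high-performance, high-capacity, non-volatile memory. This is why we say that the characteristics of NVM have essentially blurred the line between Memory and Storage.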
  4. Message: Intel/Micron invented a breakthrough in memory technology that is 1000x faster and 1000x greater endurance than NAND and 10x denser than DRAM. This puts Micron in a position to help our customers drive innovative new computing architectures.