Towards Software-defined Persistent Memory:
Rethinking Software Support for Heterogeneous Memory Architectures

Swaminathan Sundararaman*, Nisha Talagala*, Dhananjoy Das, Amar Mudrankit*, Dulcardo Arteaga*
*Work done at Fusion-io/SanDisk
CONFIDENTIAL
PARALLEL MACHINES CONFIDENTIAL 2
Memory-Storage Convergence (Trend 1)
[Figure: access delay across the memory/storage hierarchy, measured in CPU wait cycles (2 cycles, 100s, 1,000s, 10,000s, 1,000,000s) and spanning nanoseconds, microseconds, and milliseconds. Tiers shown: L1, L2, L3 CPU caches; DRAM; persistent memories; flash memory; hard drives. Regions labeled: main memory system, tiered memory solutions, storage systems, and a "chasm" in access delay.]
• PM blurs the line between storage and memory

Challenges with Current Persistent Memory Solutions
Access Granularity: Byte (Memory) | Block (I/O) | Hybrid
Memory Technology: PCM | ReRAM/Memristor | SRAM (backed by Cap.) | NVDIMM
Capacity: 1 - 100s GB | 1 - 100s GB | 32K - 2GB | 4 - 32GB
Local Attach Point: PCIe | NVMe | SAS | DDR
Access Mechanism: File System | Object Store | KV Store | …
Memory Location: Local | Remote | Replicated
Network Connect: Infiniband | Ethernet | PCIe | …
Many possible combinations!

What Should Application Developers Do?
• Rewrite applications for different deployments
▪ Not practical given the number of scenarios
• What about existing applications / deployments?
• User data is constantly growing and need not fit in persistent memory

Moving Towards a Software-defined World… (Trend 2)
• Software-Defined Networking (SDN): enables administrators to manage network services by abstracting higher-level functionality
• Software-Defined Storage (SDS): abstraction of logical storage services and capabilities from the underlying physical storage systems
• Software-Defined Flash (SDF): abstract or expose flash-specific details to enable software to realize the raw bandwidth and storage capacity of flash
• Software-Defined Data Center (SDDC): all elements of the infrastructure, such as networking, storage, CPU, and security, are virtualized and delivered as a service
• Software-Defined Persistent Memory (SDPM): abstraction of logical storage services and capabilities from the underlying physical persistent memory hardware and interconnect

Our SDPM Solution
• The first instance of a software-defined approach to PM that can bring the benefits of PM to a gamut of practical deployments
• Abstract the heterogeneity in PM hardware from applications
• Provide file system API & programming libraries to access PM
• Use currently available PM hardware to show the feasibility of an SDPM
▪ PCIe & DDR4 attached PM (both local & remote attach)
▪ Using Infiniband & 10G Ethernet for remote access
• The prototype architecture provides good performance and near-optimal acceleration for a range of local and remote PM configurations

Outline
• Introduction
• Design
• Evaluation
• Conclusion

SDPM Design Goals
• Support a variety of local and remote attach points with differing performance but identical functionality and semantic guarantees
• Enable tiering of data between PM and flash, with caching in DRAM, to enable different cost/performance configurations
• Support hybrid (i.e., both memory and I/O) access, traditional storage management, and persistence guarantees to combine the best of the memory and storage worlds
• Enable a single application programming model to work across a variety of hardware

Software Defined Persistent-Memory Architecture
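[Figure: SDPM software stack. Applications sit on top of SDPM, which comprises Programming Libraries, File System, Persistent Memory Manager, and Block Device Driver. Below SDPM, PM is reached through memory or block (I/O) access over PCIe/DDR or SAS/SATA/NVMe attach points and over Infiniband/Ethernet/PCIe/… interconnects.]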
• Persistent Memory Manager
▪ Abstracts hardware and interconnect details from the file system / applications
▪ Exports APIs to guarantee persistence
• File System
▪ Unified persistent namespace to PM & flash
▪ Hybrid access to PM
▪ Transparent & non-transparent acceleration
• Programming Libraries
▪ Unified access APIs to applications
▪ OS bypass for remote access

Non-Volatile Memory File System (NVMFS)
• A flash-optimized, POSIX-compliant Linux file system
▪ Extended NVMFS to support PM in addition to flash
• Provides a unified and persistent namespace to both PM and flash
▪ Hybrid (memory & I/O) access to PM; applications can switch back and forth
▪ Transparent application acceleration (by tiering data between PM & flash)
▪ Supports "direct" mmap mode to directly map and use PM without caching in DRAM
▪ Supports a single programming model via a combination of application-specific libraries over direct mmap and transparent access via POSIX APIs

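
The hybrid access pattern on this slide can be sketched with ordinary file and mmap calls. This is a minimal illustration under stated assumptions: a regular temporary file stands in for a file on a PM-backed NVMFS mount, no NVMFS-specific API is used, and on a normal file system the mapping is still cached in DRAM, unlike the "direct" mmap mode. The name `hybrid_access_demo` is hypothetical.

```python
import mmap
import os
import tempfile

RECORD = b"hello from write()"


def hybrid_access_demo(path):
    """Touch one file through the I/O path and the memory path in turn."""
    # buffering=0 gives raw, unbuffered file I/O so reads always go to the OS.
    with open(path, "w+b", buffering=0) as f:
        f.truncate(mmap.PAGESIZE)      # size the file to one page
        f.write(RECORD)                # I/O path: write() a record at offset 0

        # Memory path: map the same file and update bytes in place.
        with mmap.mmap(f.fileno(), mmap.PAGESIZE) as view:
            seen_via_mmap = view[:len(RECORD)]
            view[0:5] = b"HELLO"
            view.flush()               # msync(); stands in for a persistence barrier

        # Back to the I/O path: the in-place update is visible to read().
        f.seek(0)
        seen_via_read = f.read(len(RECORD))
    return seen_via_mmap, seen_via_read


with tempfile.TemporaryDirectory() as tmp:
    via_mmap, via_read = hybrid_access_demo(os.path.join(tmp, "pm_file"))
```

The same file is reached by name through the file system namespace in both cases, which is what lets an application switch between the two access styles without copying data.
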
Persistent Memory Manager: Hardware Abstractions
• Memory Mapping Types
▪ PM can be mapped in multiple ways depending on the hardware. Each memory type must be mapped by default to the best mapping type for its physical attach point (by default, write combining)
▪ Enable FS operations that allow the application to control the per-file memory mapping
• Guaranteeing Persistence
▪ Mechanisms are needed to guarantee that all acknowledged in-flight data (such as data in CPU caches, registers, etc.) has reached the PM device, independent of its attach point
▪ PMM provides a barrier() operation (to NVMFS and user-space libraries) that guarantees all data is moved to the persistence domain as needed for the attach point

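
One way to picture the barrier() abstraction is as a single call with one contract and an attach-point-specific implementation. The sketch below is hypothetical throughout: the class names, the in-memory "persistence domain", and the per-attach-point step descriptions are illustrative assumptions, not the PMM's actual interface or mechanisms.

```python
from abc import ABC, abstractmethod


class AttachPoint(ABC):
    """Hypothetical model of one PM attach point (all names are illustrative)."""

    def __init__(self):
        self.in_flight = []   # acknowledged writes that are not yet durable
        self.persistent = []  # writes that have reached the persistence domain
        self.steps = []       # record of what each barrier() had to do

    def store(self, data):
        self.in_flight.append(data)

    @abstractmethod
    def drain_steps(self):
        """Return the attach-point-specific steps a barrier performs (assumed)."""

    def barrier(self):
        # Same contract everywhere: once barrier() returns, every earlier
        # store is in the persistence domain.
        self.steps.extend(self.drain_steps())
        self.persistent.extend(self.in_flight)
        self.in_flight.clear()


class LocalDDR(AttachPoint):
    def drain_steps(self):
        return ["flush CPU caches", "fence"]


class LocalPCIe(AttachPoint):
    def drain_steps(self):
        return ["flush write-combining buffers", "read back over PCIe"]


class RemoteAttach(AttachPoint):
    def drain_steps(self):
        return ["send pending writes", "wait for remote acknowledgement"]


def log_commit(pm, records):
    """Caller code is identical for every attach point."""
    for record in records:
        pm.store(record)
    pm.barrier()
```

The point of the design is visible in `log_commit`: the caller never branches on the hardware, so the same application code can run against a local or a remote device.
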
Evaluation - Configuration
Configuration | Attach Method | Local or Remote
Config-1 | DDR NVDIMM | Local
Config-2 | PCIe MMIO | Local
Config-3 | DDR NVDIMM | Remote Ethernet
Config-4 | PCIe MMIO | Remote Ethernet
Config-5 | DDR NVDIMM | Remote Infiniband
Config-6 | PCIe MMIO | Remote Infiniband

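
The six configurations are the cross product of two attach methods and three locations. A short sketch that reproduces the numbering used in the table (the names `ATTACH_METHODS`, `LOCATIONS`, and `build_configs` are hypothetical):

```python
from itertools import product

ATTACH_METHODS = ["DDR NVDIMM", "PCIe MMIO"]
LOCATIONS = ["Local", "Remote Ethernet", "Remote Infiniband"]


def build_configs():
    """Enumerate location x attach method in the order the table lists them."""
    configs = {}
    pairs = product(LOCATIONS, ATTACH_METHODS)  # attach method varies fastest
    for number, (location, attach) in enumerate(pairs, start=1):
        configs[f"Config-{number}"] = (attach, location)
    return configs


CONFIGS = build_configs()
```
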
Test bed (Config-1 and Config-2 are local; Config-3, 4, 5, and 6 are remote):
System Configuration: HP DL380, 96GB DDR DRAM, x86_64 Linux 3.14 kernel
MySQL Version: Percona 5.5
Flash Device (PCIe): Fusion-io Gen 2 ioMemory 1.2TB
Persistent Memory: PCIe: ACM (512KB); DDR4: Viking NVDIMM (8GB)
Network Interconnect: N/A (local); Infiniband: ConnectX-3 56 Gbit IB, Ethernet: Intel 82599ES 10-Gigabit (remote)

Evaluation - Setup
[Figure: two-host test setup. Host A (source) runs MySQL on the SDPM stack (Programming Libraries, NVMFS, PMM, ioMemory VSL) with local PM (Config-1, Config-2) and a replication source. Host B (sink) runs the same SDPM stack with a replication sink daemon and its own PM. The hosts are connected by 10Gbit Ethernet (Config-3, Config-4) and 56Gbit Infiniband (Config-5, Config-6) for remote access.]

Local PCIe MMIO Vs DDR4 (config-1 Vs Config-2)
[Figure: two plots over access sizes of 64, 128, 256, 512, 1024, 2048, and 4096 bytes. Bandwidth (MB/sec), axis 0 to 4000, with series DDR-BW and MMIO-BW. Latency (nsec), axis 0 to 8000, with series DDR-Latency and MMIO-Latency.]
• Smaller data sizes: cost is dominated by the barrier operation
• Larger data sizes: cost is dominated by the transport media

Barrier Overheads: Local (config-1) Vs Remote (config-3 & config-5)
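[Figure: latency (us), axis 0 to 50,000, versus synchronization frequency (1, 10, 100, 1000 operations between synchronizations) for three series: Ethernet FLUSH, InfiniBand RDMA, and Local ACMPM. Annotations: 20x and 4x.]
• As the number of operations between synchronizations increases, the performance of the configurations becomes closer
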
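
The trend follows from a simple amortization model: if each operation costs t_op and each barrier costs t_barrier, issuing one barrier every N operations gives an average cost of t_op + t_barrier / N per operation. The sketch below uses made-up placeholder costs, not measurements from this evaluation, and the function name is hypothetical.

```python
def amortized_latency_us(op_cost_us, barrier_cost_us, ops_per_barrier):
    """Average cost per operation when one barrier follows every N operations."""
    return op_cost_us + barrier_cost_us / ops_per_barrier


# Placeholder costs in microseconds (assumptions, not measured values).
OP_COST = 1.0
LOCAL_BARRIER = 2.0
REMOTE_BARRIER = 40.0

gaps = {}
for n in (1, 10, 100, 1000):
    local = amortized_latency_us(OP_COST, LOCAL_BARRIER, n)
    remote = amortized_latency_us(OP_COST, REMOTE_BARRIER, n)
    gaps[n] = remote / local  # how many times slower remote is than local
```

Under these assumed costs the remote-to-local gap shrinks steadily as N grows, because the barrier term is divided by N while the per-operation term is not.
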
MySQL: Insert Heavy Workload
[Figure: bar chart, axis 0 to 60,000, with series "Bin & Tr Log" and "Tr log ONLY" for four setups: REMOTE (Config-3), LOCAL (Config-1), VANILLA (on flash), and NO_LOG (infinitely fast logging).]

MySQL: LinkBench 10x Workload
[Figure: bar chart of operations/sec (in x1000), axis 0 to 30,000, for four setups: No Logs, Vanilla, Local (config-1), and Remote (config-5). Annotations: 31%, 17%, 3%.]
• Facebook's social graph workload
• 10x: 100 million nodes
• Inserts, deletes, updates, and lookups
• 30% writes (insert/update/delete)
• 70% reads (lookup)
• Infiniband performance for small updates is sufficient

Conclusions
• PM is going to change the storage-memory landscape
• Many different forms / capacities / attach points / performance
• SDPM: a software-defined approach to using persistent memory
• Abstracts heterogeneity in memory hardware
• Applications can transparently run on local & remote persistent memory
• Selectively abstracts PM characteristics to provide optimal performance
• Transparently tiers data between PM & flash, serving both existing applications and new applications written to run on PM
• Our evaluation shows near optimal performance for local & remote attach PM
Thank You
If you are interested in trying out SDPM: Dhananjoy.Das@sandisk.com
