Flash memory technology, deployed as server-side PCIe cards or solid-state disks (SSDs), is emerging as a critical tool for performance and efficiency in data centers of all scales. This presentation will discuss how the use of Flash affects Cassandra deployments in terms of configuration, DRAM requirements and performance expectations. Ideas on leveraging C*'s data-center awareness to blend flash and disk storage nodes for cost and workload efficiency will also be shared. Flash media itself will be examined from a physical perspective to understand endurance issues. Data on write amplification under bulk-load and operational workload conditions will be presented to explain how C*'s Log-Structured Merge Tree architecture and its associated compactions affect Flash. Finally, we will examine strategies to make Cassandra more Flash-aware using both conventional techniques and emerging non-volatile memory (NVM) programming capabilities. Lessons learned from real-world customer deployments will be shared to complete this presentation.
2. Switch your database to flash now. Or you’re doing it wrong.
Brian Bulkowski, Aerospike Founder and CTO
June 18, 2013 2#Cassandra13
http://highscalability.com/blog/2012/12/10/switch-your-databases-to-flash-storage-now-or-youre-doing-it.html
6. NAND Flash Memory
Flash is a persistent memory technology invented by
Dr. Fujio Masuoka at Toshiba in 1980.
[Diagram: NAND flash memory cell. Labels: Bit Line, Source Line, Word Line, Control Gate, Float Gate, NPN.]
9. Direct Cut Through Architecture
[Diagram: LEGACY APPROACH vs. FUSION DIRECT APPROACH. Labels: App, OS, CPU, DRAM, Host, PCIe, SAS, RAID Controller, Data path Controller, Super Capacitors (SC), NAND.]
Goal of every I/O operation: to move data to/from DRAM and flash.
10. How can we use it in Cassandra?
15. When do we scale out?
▸ A typical server…
CPU Cores: 32 with HT
Memory: 128 GB
…is your working set > 128GB?
16. Is there a better way?
▸ With NoSQL databases, we tend to scale out for DRAM
Combined Resources
CPU Cores: 96
Memory: 384 GB
More cores than needed to serve reads and writes.
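The arithmetic behind the last two slides can be sketched as follows. This is a minimal illustration, assuming the whole working set must fit in DRAM; the 300 GB working set is a hypothetical figure, and the function names are invented for this example. Replication, JVM heap limits and OS overhead are deliberately ignored.

```python
import math

# Per-node figures taken from the "typical server" slide.
CORES_PER_NODE = 32
DRAM_PER_NODE_GB = 128


def nodes_for_dram(working_set_gb, dram_per_node_gb=DRAM_PER_NODE_GB):
    """Nodes needed if the entire working set has to fit in DRAM."""
    return max(1, math.ceil(working_set_gb / dram_per_node_gb))


def combined_resources(nodes, cores_per_node=CORES_PER_NODE,
                       dram_per_node_gb=DRAM_PER_NODE_GB):
    """Total cores and memory you end up buying for that node count."""
    return {
        "nodes": nodes,
        "cores": nodes * cores_per_node,
        "memory_gb": nodes * dram_per_node_gb,
    }


if __name__ == "__main__":
    # Hypothetical working set of 300 GB: larger than one server's DRAM.
    nodes = nodes_for_dram(300)
    print(combined_resources(nodes))
```

With these assumptions, a 300 GB working set forces three nodes, which brings 96 cores along with the 384 GB of memory, even if the read and write load does not need that many cores.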
17. Flash Offers A New Architectural Choice
[Figure: latency scale running from milliseconds (10^-3) through microseconds (10^-6) to nanoseconds (10^-9). Labels: CPU Cache, DRAM, Disk Drives, Server-based Flash.]
18. Three Deployment Options
1. All Flash
2. Data Placement (CASSANDRA-2749)
3. Use Logical Data Centers
19. Cassandra with All-Flash Storage
Step 1: Mount ioMemory at /var/lib/cassandra/data
Step 2:
20. Data Placement
▸ https://issues.apache.org/jira/browse/CASSANDRA-2749
• Thanks Marcus!
▸ Takes advantage of filesystem hierarchy
▸ Use mount points to pin Keyspaces or Column
Families to flash:
• /var/lib/cassandra/data/{Keyspace}/{CF}
▸ Use flash for high performance needs, disk for
capacity needs
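A quick way to check which device a given Column Family would land on is to walk up from its data directory to the nearest mount point. The sketch below assumes the directory layout shown above; the keyspace and Column Family names in the usage line are hypothetical, and the helper names are invented for this example.

```python
import os

# Default data root from the slide; adjust if your installation differs.
DATA_ROOT = "/var/lib/cassandra/data"


def cf_data_dir(keyspace, column_family, data_root=DATA_ROOT):
    """Directory where a Column Family's SSTables are expected to live."""
    return os.path.join(data_root, keyspace, column_family)


def nearest_mount_point(path):
    """Walk up from path until a mount point (or the filesystem root) is hit."""
    path = os.path.abspath(path)
    while not os.path.ismount(path):
        parent = os.path.dirname(path)
        if parent == path:  # reached the top without finding a mount
            break
        path = parent
    return path


if __name__ == "__main__":
    # Hypothetical keyspace and Column Family names.
    cf_dir = cf_data_dir("Metrics", "hot_events")
    print(cf_dir, "is served from", nearest_mount_point(cf_dir))
```

If the flash device is mounted at the Column Family directory itself, the two printed paths match; if the result is a parent directory, that Column Family shares whatever storage backs the parent.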
21. Data Centers for Storage Control
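[Diagram: one Cassandra cluster split into three logical data centers: DC1 (Interactive requests), DC2 (Hadoop MR Jobs), DC3 (High density replicas). Each data center is rated for PERFORMANCE and CAPACITY/NODE on a HIGH / MEDIUM / LOW scale.]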
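One way to express this layout is per-data-center replication with NetworkTopologyStrategy. The sketch below only builds the CQL statement as a string; the keyspace name and the replica counts are hypothetical, and the data center names must match whatever your snitch reports for each node.

```python
def create_keyspace_cql(keyspace, replicas_per_dc):
    """Build a CREATE KEYSPACE statement with per-data-center replica counts."""
    parts = ["'class': 'NetworkTopologyStrategy'"]
    parts += [f"'{dc}': {count}" for dc, count in replicas_per_dc.items()]
    options = ", ".join(parts)
    return f"CREATE KEYSPACE {keyspace} WITH replication = {{{options}}};"


if __name__ == "__main__":
    # Hypothetical replica counts: flash-backed DC1 for interactive traffic,
    # DC2 for Hadoop jobs, disk-backed DC3 for dense replicas.
    print(create_keyspace_cql("demo", {"DC1": 3, "DC2": 1, "DC3": 2}))
```

Clients serving interactive requests would then connect to DC1 nodes with a local consistency level, so that the disk-backed data centers stay off the latency-sensitive path.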
33. THANK YOU
fusionio.com | REDEFINE WHAT’S POSSIBLE