SlideShare una empresa de Scribd logo
1 de 36
Descargar para leer sin conexión
Evaluating NVMe drives for
accelerating HBase
Nicolas Poggi and David Grier
June 2017
BSC Data Centric Computing – Rackspace collaboration
Outline
1. Intro and motivation
2. Technologies explained
3. Disk benchmarks
4. HBase use case with NVE
1. Read-only
2. Full workload
5. Summary
2
Max of bw (MB/s)
Max of lat (us)
BYTES/S
REQ SIZE
Dataworks2017 3
Barcelona Supercomputing Center (BSC)
• Peak performance capacity of 13.7 Petaflop/s
• 390 TB Total RAM
• 14 PB storage (GPFS)
• Heterogeneous architecture:
• Intel Xeon: KNL and NKH
• IBM Power9 + Latest NVIDIA, ARMv8
• Networking: InfiniBand EDR / Omni-path
TBA July 1st
ALOJA: towards cost-effective Big Data
• Research project for automating characterization and
optimization of Big Data deployments
• Open source Benchmarking-to-Insights platform and tools
• Largest Big Data public repository (70,000+ jobs)
• Community collaboration with industry and academia
http://aloja.bsc.es
Big Data
Benchmarking
Online
Repository
Web / ML
Analytics
Motivation and objectives
• Hadoop PoC cluster
• Explore use cases where NVMe devices
can speedup Big Data apps
• Poor initial results…
• Measure the possibilities of NVMe
devices
• Improve benchmarking techniques
• Extend aloja into I/O testing
First tests:
6
8512 8667 8523
9668
0
2000
4000
6000
8000
10000
12000
Seconds
Lower is better
Running time of terasort (1TB) under different
disks
terasort
teragen
Marginal improvement!!!
NVMe technology providing
communication to SSD devices
through PCIe
• Standard, open source drivers
• Lower power consumption
• Lower cost / TB than DRAM (but
slower)
• Street prices for 1.6TB $2K to
$12K
7
PoC cluster and drive specs
All nodes (x5)
Operating System CentOS 7.2
Memory 128GB
CPU Single Octo-core (16 threads)
Disk Config OS 2x600GB SAS RAID1
Network 10Gb/10Gb redundant
Master node (x1, extra nodes for HA not used in these tests)
Disk Config Master storage 4x600GB SAS RAID10
XSF partition
Data Nodes (x4)
NVMe (cache) 1.6TB Intel DC P3608 NVMe SSD (PCIe)
Disk config HDFS data storage 10x 0.6TB NL-SAS/SATA JBOD
PCIe3 x8, 12Gb/s SAS RAID
SEAGATE ST3600057SS 15K (XFS partition)
8
Tested drives (3NVMe + 1 SAS JBOD)
PoC cluster
1. Intel DC P3608 (2015)
• PCI-Express 3.0 x8 lanes
• Capacity: 1.6TB (two drives)
• Seq R/W BW:
• 5000/2000 MB/s, (128k req)
• Random 4k R/W IOPS:
• 850k/150k
• Random 8k R/W IOPS:
• 500k/60k, 8 workers
2. HP 15K RPM JBOD
• 10x 0.6TB NL-SAS/SATA JBOD
• PCIe3 x8, 12Gb/s SAS RAID
• SEAGATE ST3600057SS 15K (XFS partition)
Other NVMe
3. HGST Ultrastar SN150 (2016)
• PCI-Express 3.0 x4 lanes
• Capacity: 1.6TB (two drives)
• Seq R/W BW:
• 6000/3000 MB/s, (128k req)
• Random 4k R/W IOPS:
• 743k/160k (per disk)
4. LSI Nytro WarpDrive 4-400 (old gen 2012)
• PCI-Express 2.0 x8 lanes
• Capacity: 1.6TB (two drives)
• Seq. R/W BW:
• 2000/1000 MB/s (256k req)
• R/W IOPS:
• 185k/120k (8k req)
9
FIO Benchmarks
Objectives:
• Assert vendor specs (BW, IOPS, Latency)
• Seq R/W, Random R/W
• Verify driver/firmware and OS
• Set performance expectations
Commands on reference on last slides 10
Max of bw (MB/s)
Max of lat (us)
0
5000000
10000000
15000000
20000000
25000000
524288
1048576
2097152
4194304
Bytes/s
Req Size
NVMe Disk BW at 1MB req size (high concurrency)
Higher is better 11
Disk BW at 1MB req size (mid concurrency)
Higher is better 12
IOPS at 4KB (mid concurrency)
Higher is better 13
Bucket Cache
14
HBase in a nutshell
• Highly scalable Big data key-value store
• On top of Hadoop (HDFS)
• Based on Google’s Bigtable
• Real-time and random access
• Indexed
• Low-latency
• Block cache and Bloom Filters
• Linear, modular scalability
• Automatic sharding of tables and failover
• Strictly consistent reads and writes.
• failover support
• Production ready and battle tested
• Building block of other projects
HBase R/W architecture
15
Source: HDP doc
JVM Heap
L2 Bucket Cache (BC) in HBase
Region server (worker) memory with BC
16
• Adds a second “block” storage for HFiles
• Use case: L2 cache and replaces OS buffer
cache
• Does copy-on‐read
• Fixed sized, reserved on startup
• 3 different modes:
• Heap
• Marginal improvement
• Divides mem with the block cache
• Offheap (in RAM)
• Uses Java NIO’s Direct ByteBuffer
• File
• Any local file / device
• Bypasses HDFS
• Saves RAM
HBASE-11425
faster results to be delivered through the utilization of
bucketcache. The changes provide direct access to the
bucketcache on the device it resides on rather than a copy to
heap.
17
L2-BucketCache experiments summary
Tested configurations for HBase v1.2.4
1. HBase default (baseline)
2. HBase w/ Bucket cache Offheap
1. Size: 32GB /work node
3. HBase w/ Bucket cache in RAM disk
1. Size: 32GB /work node
4. HBase w/ Bucket cache in NVMe disk
1. Size: 250GB / worker node
• All using same Hadoop and HDFS
configuration
• On JBOD (10 SAS disks)
• 1 Replica, short-circuit reads
Experiments
1. Read-only (workload C)
1. RAM at 128GB / node
2. RAM at 32GB / node
2. Full YCSB (workloads A-F)
3. RAM at 32GB / node
• Payload:
• YCSB 250M records.
• ~2TB HDFS raw
18
Experiment 1: Read-only (gets)
YCSB Workload C: 25M records, 500 threads, ~2TB HDFS
19
E 1.1: Throughput of the 4 configurations
(128GB RAM)
Higher is better 20
Results:
• Ops/Sec improve with BC
• Offheap 1.6x
• RAMd 1.5x
• NVMe 2.3x
• AVG latency as well
• (req time)
• First run slower on 4
cases (writes OS cache
and BC)
• 3rd run not faster, cache
already loaded
• Tested onheap config:
• > 8GB failed
• 8GB slower than
baseline
0
50000
100000
150000
200000
250000
300000
Baseline BucketCache Offheap BucketCache RAM disk BucketCache NVMe
Ops/Sec
Throughput of 3 consecutive iterations of Workload C (128GB)
E 1.1 Cluster resource consumption: Baseline
21
CPU % (AVG)
Disk R/W MB/s (SUM)
Mem Usage KB (AVG)
NET R/W Mb/s (SUM)
Notes:
• Java heap and OS buffer cache holds 100% of WL
• Data is read from disks (HDFS) only in first part of
the run,
• then throughput stabilizes (see NET and CPU)
• Free resources,
• Bottleneck in application and OS path (not shown)
Write 3Gb/s
Read 3 Gb/s
E 1.1 Cluster resource consumption: Bucket Cache strategies
22
Offheap (32GB) Disk R/W RAM disk (tmpfs 32GB) Disk R/W
NVMe (250GB) Disk R/WNotes:
• 3 BC strategies faster than baseline
• BC LRU more effective than OS buffer
• Offheap slightly more efficient ran RAM drive (same size)
• But seems to take longer to fill (different per node)
• And more take more space for same payload (plus java heap)
• NVM can hold the complete WL in the BC
• Read from and Writes to MEM not captured by charts
BC fills on 1st run
WL doesn’t fit
completely
E 1.2-3
Challenge:
Limit OS buffer cache effect on experiments
1st approach, larger payload. Cons: high execution time
2nd limit available RAM (using stress tool)
3rd clear buffer cache periodically (drop caches)
23
E1.2: Throughput of the 4 configurations
(32GB RAM)
Higher is betterfor Ops (Y1), lowerfor latency (Y2) 24
Results:
• Ops/Sec improve only with
NVMe up to 8X
• RAMd performs close to
baseline
• First run same on baseline and
RAMd
• tmpfs “blocks” as RAM is
needed
• At lower capacity, external BC
shows more improvement
20578
14715.493
21520
109488
20598
16995
21534
166588
0
20000
40000
60000
80000
100000
120000
140000
160000
180000
Baseline BucketCache Offheap BucketCache RAM disk BucketCache NVMe
Ops/Sec
Throughput of 2 consecutive iterations of Workload C (32GB)
E 1.1 Cluster CPU% AVG: Bucket Cache (32GB RAM)
25
Baseline
RAM disk (/dev/shm 8GB)
Offheap (4GB)
NVMe (250GB)
Read disk throughput: 2.4GB/s Read disk throughput: 2.8GB/s
Read disk throughput: 2.5GB/s Read disk throughput: 38.GB/s
BC failure
Slowest
Experiment 2: All workloads (-E)
YCSB workloads A-D, F: 25M records, 1000 threads
26
Benchmark suite: The Yahoo! Cloud Serving Benchmark (YCSB)
• Open source specification and kit, for comparing NoSQL DBs. (since 2010)
• Core workloads:
• A: Update heavy workload
• 50/50 R/W.
• B: Read mostly workload
• 95/5 R/W mix.
• C: Read only
• 100% read.
• D: Read latest workload
• Inserts new records and reads them
• Workload E: Short ranges (Not used, takes too long to run SCAN type)
• Short ranges of records are queried, instead of individual records
• F: Read-modify-write
• read a record, modify it, and write back.
https://github.com/brianfrankcooper/YCSB/wiki/Core-Workloads 27
E2.1: Throughput and Speedup ALL
(32GB RAM)
Baseline BucketCache RAM disk BucketCache NVMe
Datagen 68821 60819 67737
WL A 55682 56702 76315
WL B 33551 43978 79895
WL C 30631 43420 81245
WL D 85725 94631 128881
WL F 25540 32464 50568
0
20000
40000
60000
80000
100000
120000
140000
Ops/Sec
Throughput of workloads A-D,F (128GB, 1 iteration)
Higher is better 29
Results:
• Datagen slower with the RAMd
(less OS RAM) Overall: Ops/Sec
improve with BC
• RAMd 17%
• NVMe 87%
• WL C gets higher speedup with
NVMe
• WL F now faster with NVMe
• Need to run more iterations to
see max improvement
E2.1: CPU and Disk for ALL (32GB RAM)
30
Baseline CPU %
NVMe CPU%
Baseline Disk R/w
NVMe Disk R/W
High I/O wait
Read disk throughput: 1.8GB/s
Moderate I/O wait (2x less time)
Read disk throughput: up to 25GB/s
Summary
Lessons learned, findings, conclusions, references
31
Bucket Cache results recap (medium sized WL)
• Full cluster (128GB RAM / node)
• WL-C up to 2.7x speedup (warm cache)
• Full benchmark (CRUD) from 0.3 to 0.9x speedup (cold cache)
• Limiting resources
• 32GB RAM / node
• WL-C NVMe gets up to 8X improvement (warm cache)
• Other techniques failed/poor results
• Full benchmark between 0.4 and 2.7x speedup (cold cache)
• Latency reduces significantly with cached results
• Onheap BC not recommended
• just give more RAM to BlockCache
32
Advances in Hardware
Intel currently provides 3D Xpoint
• Demonstrated at 2.4−3× performance improvement over PCIe NAND
Flash storage
https://en.wikipedia.org/wiki/3D_XPoint
https://arstechnica.com/information-technology/2017/02/specs-for-
first-intel-3d-xpoint-ssd-so-so-transfer-speed-awesome-random-io/
33
To conclude…
• NVMe offers significant BW and Latency improvement over SAS/SATA,
but
• JBODs still perform well for seq R/W
• Also cheaper $/TB
• Big Data apps still designed for rotational (avoid random I/O)
• Full tiered-storage support is missing by Big Data frameworks
• Byte addressable vs. block access
• Need to rely on external tools/file systems
• Alluxio (Tachyon), Triple-H, New file systems (SSDFS), …
• OS buffer cache is highly effective, but not as the LRU BC
• Fast devices speedup, but still caching is the common use case…
34
References
• ALOJA
• http://aloja.bsc.es
• https://github.com/aloja/aloja
• Bucket cache and HBase
• BlockCache (and Bucket Cache 101) http://www.n10k.com/blog/blockcache-101/
• Intel brief on bucket cache:
http://www.intel.com/content/dam/www/public/us/en/documents/solution-briefs/apache-
hbase-block-cache-testing-brief.pdf
• http://www.slideshare.net/larsgeorge/hbase-status-report-hadoop-summit-europe-2014
• HBase performance: http://www.slideshare.net/bijugs/h-base-performance
• https://blogs.apache.org/hbase/entry/offheaping_the_read_path_in
• Benchmarks
• FIO https://github.com/axboe/fio
• Brian F. Cooper, et. Al. 2010. Benchmarking cloud serving systems with YCSB.
http://dx.doi.org/10.1145/1807128.1807152
35
Thanks, questions?
Follow up / feedback : Nicolas.Poggi@bsc.es and David.Grier@objectrocket.com
Twitter: @ni_po and @grierdavid
Evaluating NVMe drives for accelerating HBase
FIO commands:
• Sequential read
• fio --name=read --directory=${dir} --ioengine=libaio --direct=1 --bs=${bs} --rw=read --iodepth=${iodepth} --numjobs=1 --buffered=0 --size=2gb --
runtime=30 --time_based --randrepeat=0 --norandommap --refill_buffers --output-format=json
• Random read
• fio --name=randread --directory=${dir} --ioengine=libaio --direct=1 --bs=${bs} --rw=randread --iodepth=${iodepth} --numjobs=${numjobs} --
buffered=0 --size=2gb --runtime=30 --time_based --randrepeat=0 --norandommap --refill_buffers --output-format=json
• Sequential read on raw devices:
• fio --name=read --filename=${dev} --ioengine=libaio --direct=1 --bs=${bs} --rw=read --iodepth=${iodepth} --numjobs=1 --buffered=0 --size=2gb --
runtime=30 --time_based --randrepeat=0 --norandommap --refill_buffers --output-format=json
• Sequential write
• fio --name=write --directory=${dir} --ioengine=libaio --direct=1 --bs=${bs} --rw=write --iodepth=${iodepth} --numjobs=1 --buffered=0 --size=2gb --
runtime=30 --time_based --randrepeat=0 --norandommap --refill_buffers --output-format=json
• Random write
• fio --name=randwrite --directory=${dir} --ioengine=libaio --direct=1 --bs=${bs} --rw=randwrite --iodepth=${iodepth} --numjobs=${numjobs} --
buffered=0 --size=2gb --runtime=30 --time_based --randrepeat=0 --norandommap --refill_buffers --output-format=json
• Random read on raw devices:
• fio --name=randread --filename=${dev} --ioengine=libaio --direct=1 --bs=${bs} --rw=randread --iodepth=${iodepth} --numjobs=${numjobs} --
buffered=0 --size=2gb --runtime=30 --time_based --randrepeat=0 --norandommap --refill_buffers --offset_increment=2gb --output-format=json
https://github.com/axboe/fio 37

Más contenido relacionado

Último

6 Tips for Interpretable Topic Models _ by Nicha Ruchirawat _ Towards Data Sc...
6 Tips for Interpretable Topic Models _ by Nicha Ruchirawat _ Towards Data Sc...6 Tips for Interpretable Topic Models _ by Nicha Ruchirawat _ Towards Data Sc...
6 Tips for Interpretable Topic Models _ by Nicha Ruchirawat _ Towards Data Sc...Dr Arash Najmaei ( Phd., MBA, BSc)
 
Principles and Practices of Data Visualization
Principles and Practices of Data VisualizationPrinciples and Practices of Data Visualization
Principles and Practices of Data VisualizationKianJazayeri1
 
Digital Marketing Plan, how digital marketing works
Digital Marketing Plan, how digital marketing worksDigital Marketing Plan, how digital marketing works
Digital Marketing Plan, how digital marketing worksdeepakthakur548787
 
Student profile product demonstration on grades, ability, well-being and mind...
Student profile product demonstration on grades, ability, well-being and mind...Student profile product demonstration on grades, ability, well-being and mind...
Student profile product demonstration on grades, ability, well-being and mind...Seán Kennedy
 
Decoding Movie Sentiments: Analyzing Reviews with Data Analysis model
Decoding Movie Sentiments: Analyzing Reviews with Data Analysis modelDecoding Movie Sentiments: Analyzing Reviews with Data Analysis model
Decoding Movie Sentiments: Analyzing Reviews with Data Analysis modelBoston Institute of Analytics
 
The Power of Data-Driven Storytelling_ Unveiling the Layers of Insight.pptx
The Power of Data-Driven Storytelling_ Unveiling the Layers of Insight.pptxThe Power of Data-Driven Storytelling_ Unveiling the Layers of Insight.pptx
The Power of Data-Driven Storytelling_ Unveiling the Layers of Insight.pptxTasha Penwell
 
Semantic Shed - Squashing and Squeezing.pptx
Semantic Shed - Squashing and Squeezing.pptxSemantic Shed - Squashing and Squeezing.pptx
Semantic Shed - Squashing and Squeezing.pptxMike Bennett
 
Data Factory in Microsoft Fabric (MsBIP #82)
Data Factory in Microsoft Fabric (MsBIP #82)Data Factory in Microsoft Fabric (MsBIP #82)
Data Factory in Microsoft Fabric (MsBIP #82)Cathrine Wilhelmsen
 
Learn How Data Science Changes Our World
Learn How Data Science Changes Our WorldLearn How Data Science Changes Our World
Learn How Data Science Changes Our WorldEduminds Learning
 
why-transparency-and-traceability-are-essential-for-sustainable-supply-chains...
why-transparency-and-traceability-are-essential-for-sustainable-supply-chains...why-transparency-and-traceability-are-essential-for-sustainable-supply-chains...
why-transparency-and-traceability-are-essential-for-sustainable-supply-chains...Jack Cole
 
Student Profile Sample report on improving academic performance by uniting gr...
Student Profile Sample report on improving academic performance by uniting gr...Student Profile Sample report on improving academic performance by uniting gr...
Student Profile Sample report on improving academic performance by uniting gr...Seán Kennedy
 
NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...
NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...
NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...Amil Baba Dawood bangali
 
FAIR, FAIRsharing, FAIR Cookbook and ELIXIR - Sansone SA - Boston 2024
FAIR, FAIRsharing, FAIR Cookbook and ELIXIR - Sansone SA - Boston 2024FAIR, FAIRsharing, FAIR Cookbook and ELIXIR - Sansone SA - Boston 2024
FAIR, FAIRsharing, FAIR Cookbook and ELIXIR - Sansone SA - Boston 2024Susanna-Assunta Sansone
 
modul pembelajaran robotic Workshop _ by Slidesgo.pptx
modul pembelajaran robotic Workshop _ by Slidesgo.pptxmodul pembelajaran robotic Workshop _ by Slidesgo.pptx
modul pembelajaran robotic Workshop _ by Slidesgo.pptxaleedritatuxx
 
Real-Time AI Streaming - AI Max Princeton
Real-Time AI  Streaming - AI Max PrincetonReal-Time AI  Streaming - AI Max Princeton
Real-Time AI Streaming - AI Max PrincetonTimothy Spann
 
SMOTE and K-Fold Cross Validation-Presentation.pptx
SMOTE and K-Fold Cross Validation-Presentation.pptxSMOTE and K-Fold Cross Validation-Presentation.pptx
SMOTE and K-Fold Cross Validation-Presentation.pptxHaritikaChhatwal1
 
Decoding the Heart: Student Presentation on Heart Attack Prediction with Data...
Decoding the Heart: Student Presentation on Heart Attack Prediction with Data...Decoding the Heart: Student Presentation on Heart Attack Prediction with Data...
Decoding the Heart: Student Presentation on Heart Attack Prediction with Data...Boston Institute of Analytics
 
Minimizing AI Hallucinations/Confabulations and the Path towards AGI with Exa...
Minimizing AI Hallucinations/Confabulations and the Path towards AGI with Exa...Minimizing AI Hallucinations/Confabulations and the Path towards AGI with Exa...
Minimizing AI Hallucinations/Confabulations and the Path towards AGI with Exa...Thomas Poetter
 
Rithik Kumar Singh codealpha pythohn.pdf
Rithik Kumar Singh codealpha pythohn.pdfRithik Kumar Singh codealpha pythohn.pdf
Rithik Kumar Singh codealpha pythohn.pdfrahulyadav957181
 

Último (20)

6 Tips for Interpretable Topic Models _ by Nicha Ruchirawat _ Towards Data Sc...
6 Tips for Interpretable Topic Models _ by Nicha Ruchirawat _ Towards Data Sc...6 Tips for Interpretable Topic Models _ by Nicha Ruchirawat _ Towards Data Sc...
6 Tips for Interpretable Topic Models _ by Nicha Ruchirawat _ Towards Data Sc...
 
Principles and Practices of Data Visualization
Principles and Practices of Data VisualizationPrinciples and Practices of Data Visualization
Principles and Practices of Data Visualization
 
Digital Marketing Plan, how digital marketing works
Digital Marketing Plan, how digital marketing worksDigital Marketing Plan, how digital marketing works
Digital Marketing Plan, how digital marketing works
 
Student profile product demonstration on grades, ability, well-being and mind...
Student profile product demonstration on grades, ability, well-being and mind...Student profile product demonstration on grades, ability, well-being and mind...
Student profile product demonstration on grades, ability, well-being and mind...
 
Data Analysis Project: Stroke Prediction
Data Analysis Project: Stroke PredictionData Analysis Project: Stroke Prediction
Data Analysis Project: Stroke Prediction
 
Decoding Movie Sentiments: Analyzing Reviews with Data Analysis model
Decoding Movie Sentiments: Analyzing Reviews with Data Analysis modelDecoding Movie Sentiments: Analyzing Reviews with Data Analysis model
Decoding Movie Sentiments: Analyzing Reviews with Data Analysis model
 
The Power of Data-Driven Storytelling_ Unveiling the Layers of Insight.pptx
The Power of Data-Driven Storytelling_ Unveiling the Layers of Insight.pptxThe Power of Data-Driven Storytelling_ Unveiling the Layers of Insight.pptx
The Power of Data-Driven Storytelling_ Unveiling the Layers of Insight.pptx
 
Semantic Shed - Squashing and Squeezing.pptx
Semantic Shed - Squashing and Squeezing.pptxSemantic Shed - Squashing and Squeezing.pptx
Semantic Shed - Squashing and Squeezing.pptx
 
Data Factory in Microsoft Fabric (MsBIP #82)
Data Factory in Microsoft Fabric (MsBIP #82)Data Factory in Microsoft Fabric (MsBIP #82)
Data Factory in Microsoft Fabric (MsBIP #82)
 
Learn How Data Science Changes Our World
Learn How Data Science Changes Our WorldLearn How Data Science Changes Our World
Learn How Data Science Changes Our World
 
why-transparency-and-traceability-are-essential-for-sustainable-supply-chains...
why-transparency-and-traceability-are-essential-for-sustainable-supply-chains...why-transparency-and-traceability-are-essential-for-sustainable-supply-chains...
why-transparency-and-traceability-are-essential-for-sustainable-supply-chains...
 
Student Profile Sample report on improving academic performance by uniting gr...
Student Profile Sample report on improving academic performance by uniting gr...Student Profile Sample report on improving academic performance by uniting gr...
Student Profile Sample report on improving academic performance by uniting gr...
 
NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...
NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...
NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...
 
FAIR, FAIRsharing, FAIR Cookbook and ELIXIR - Sansone SA - Boston 2024
FAIR, FAIRsharing, FAIR Cookbook and ELIXIR - Sansone SA - Boston 2024FAIR, FAIRsharing, FAIR Cookbook and ELIXIR - Sansone SA - Boston 2024
FAIR, FAIRsharing, FAIR Cookbook and ELIXIR - Sansone SA - Boston 2024
 
modul pembelajaran robotic Workshop _ by Slidesgo.pptx
modul pembelajaran robotic Workshop _ by Slidesgo.pptxmodul pembelajaran robotic Workshop _ by Slidesgo.pptx
modul pembelajaran robotic Workshop _ by Slidesgo.pptx
 
Real-Time AI Streaming - AI Max Princeton
Real-Time AI  Streaming - AI Max PrincetonReal-Time AI  Streaming - AI Max Princeton
Real-Time AI Streaming - AI Max Princeton
 
SMOTE and K-Fold Cross Validation-Presentation.pptx
SMOTE and K-Fold Cross Validation-Presentation.pptxSMOTE and K-Fold Cross Validation-Presentation.pptx
SMOTE and K-Fold Cross Validation-Presentation.pptx
 
Decoding the Heart: Student Presentation on Heart Attack Prediction with Data...
Decoding the Heart: Student Presentation on Heart Attack Prediction with Data...Decoding the Heart: Student Presentation on Heart Attack Prediction with Data...
Decoding the Heart: Student Presentation on Heart Attack Prediction with Data...
 
Minimizing AI Hallucinations/Confabulations and the Path towards AGI with Exa...
Minimizing AI Hallucinations/Confabulations and the Path towards AGI with Exa...Minimizing AI Hallucinations/Confabulations and the Path towards AGI with Exa...
Minimizing AI Hallucinations/Confabulations and the Path towards AGI with Exa...
 
Rithik Kumar Singh codealpha pythohn.pdf
Rithik Kumar Singh codealpha pythohn.pdfRithik Kumar Singh codealpha pythohn.pdf
Rithik Kumar Singh codealpha pythohn.pdf
 

Destacado

2024 State of Marketing Report – by Hubspot
2024 State of Marketing Report – by Hubspot2024 State of Marketing Report – by Hubspot
2024 State of Marketing Report – by HubspotMarius Sescu
 
Everything You Need To Know About ChatGPT
Everything You Need To Know About ChatGPTEverything You Need To Know About ChatGPT
Everything You Need To Know About ChatGPTExpeed Software
 
Product Design Trends in 2024 | Teenage Engineerings
Product Design Trends in 2024 | Teenage EngineeringsProduct Design Trends in 2024 | Teenage Engineerings
Product Design Trends in 2024 | Teenage EngineeringsPixeldarts
 
How Race, Age and Gender Shape Attitudes Towards Mental Health
How Race, Age and Gender Shape Attitudes Towards Mental HealthHow Race, Age and Gender Shape Attitudes Towards Mental Health
How Race, Age and Gender Shape Attitudes Towards Mental HealthThinkNow
 
AI Trends in Creative Operations 2024 by Artwork Flow.pdf
AI Trends in Creative Operations 2024 by Artwork Flow.pdfAI Trends in Creative Operations 2024 by Artwork Flow.pdf
AI Trends in Creative Operations 2024 by Artwork Flow.pdfmarketingartwork
 
PEPSICO Presentation to CAGNY Conference Feb 2024
PEPSICO Presentation to CAGNY Conference Feb 2024PEPSICO Presentation to CAGNY Conference Feb 2024
PEPSICO Presentation to CAGNY Conference Feb 2024Neil Kimberley
 
Content Methodology: A Best Practices Report (Webinar)
Content Methodology: A Best Practices Report (Webinar)Content Methodology: A Best Practices Report (Webinar)
Content Methodology: A Best Practices Report (Webinar)contently
 
How to Prepare For a Successful Job Search for 2024
How to Prepare For a Successful Job Search for 2024How to Prepare For a Successful Job Search for 2024
How to Prepare For a Successful Job Search for 2024Albert Qian
 
Social Media Marketing Trends 2024 // The Global Indie Insights
Social Media Marketing Trends 2024 // The Global Indie InsightsSocial Media Marketing Trends 2024 // The Global Indie Insights
Social Media Marketing Trends 2024 // The Global Indie InsightsKurio // The Social Media Age(ncy)
 
Trends In Paid Search: Navigating The Digital Landscape In 2024
Trends In Paid Search: Navigating The Digital Landscape In 2024Trends In Paid Search: Navigating The Digital Landscape In 2024
Trends In Paid Search: Navigating The Digital Landscape In 2024Search Engine Journal
 
5 Public speaking tips from TED - Visualized summary
5 Public speaking tips from TED - Visualized summary5 Public speaking tips from TED - Visualized summary
5 Public speaking tips from TED - Visualized summarySpeakerHub
 
ChatGPT and the Future of Work - Clark Boyd
ChatGPT and the Future of Work - Clark Boyd ChatGPT and the Future of Work - Clark Boyd
ChatGPT and the Future of Work - Clark Boyd Clark Boyd
 
Getting into the tech field. what next
Getting into the tech field. what next Getting into the tech field. what next
Getting into the tech field. what next Tessa Mero
 
Google's Just Not That Into You: Understanding Core Updates & Search Intent
Google's Just Not That Into You: Understanding Core Updates & Search IntentGoogle's Just Not That Into You: Understanding Core Updates & Search Intent
Google's Just Not That Into You: Understanding Core Updates & Search IntentLily Ray
 
Time Management & Productivity - Best Practices
Time Management & Productivity -  Best PracticesTime Management & Productivity -  Best Practices
Time Management & Productivity - Best PracticesVit Horky
 
The six step guide to practical project management
The six step guide to practical project managementThe six step guide to practical project management
The six step guide to practical project managementMindGenius
 
Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...
Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...
Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...RachelPearson36
 

Destacado (20)

2024 State of Marketing Report – by Hubspot
2024 State of Marketing Report – by Hubspot2024 State of Marketing Report – by Hubspot
2024 State of Marketing Report – by Hubspot
 
Everything You Need To Know About ChatGPT
Everything You Need To Know About ChatGPTEverything You Need To Know About ChatGPT
Everything You Need To Know About ChatGPT
 
Product Design Trends in 2024 | Teenage Engineerings
Product Design Trends in 2024 | Teenage EngineeringsProduct Design Trends in 2024 | Teenage Engineerings
Product Design Trends in 2024 | Teenage Engineerings
 
How Race, Age and Gender Shape Attitudes Towards Mental Health
How Race, Age and Gender Shape Attitudes Towards Mental HealthHow Race, Age and Gender Shape Attitudes Towards Mental Health
How Race, Age and Gender Shape Attitudes Towards Mental Health
 
AI Trends in Creative Operations 2024 by Artwork Flow.pdf
AI Trends in Creative Operations 2024 by Artwork Flow.pdfAI Trends in Creative Operations 2024 by Artwork Flow.pdf
AI Trends in Creative Operations 2024 by Artwork Flow.pdf
 
Skeleton Culture Code
Skeleton Culture CodeSkeleton Culture Code
Skeleton Culture Code
 
PEPSICO Presentation to CAGNY Conference Feb 2024
PEPSICO Presentation to CAGNY Conference Feb 2024PEPSICO Presentation to CAGNY Conference Feb 2024
PEPSICO Presentation to CAGNY Conference Feb 2024
 
Content Methodology: A Best Practices Report (Webinar)
Content Methodology: A Best Practices Report (Webinar)Content Methodology: A Best Practices Report (Webinar)
Content Methodology: A Best Practices Report (Webinar)
 
How to Prepare For a Successful Job Search for 2024
How to Prepare For a Successful Job Search for 2024How to Prepare For a Successful Job Search for 2024
How to Prepare For a Successful Job Search for 2024
 
Social Media Marketing Trends 2024 // The Global Indie Insights
Social Media Marketing Trends 2024 // The Global Indie InsightsSocial Media Marketing Trends 2024 // The Global Indie Insights
Social Media Marketing Trends 2024 // The Global Indie Insights
 
Trends In Paid Search: Navigating The Digital Landscape In 2024
Trends In Paid Search: Navigating The Digital Landscape In 2024Trends In Paid Search: Navigating The Digital Landscape In 2024
Trends In Paid Search: Navigating The Digital Landscape In 2024
 
5 Public speaking tips from TED - Visualized summary
5 Public speaking tips from TED - Visualized summary5 Public speaking tips from TED - Visualized summary
5 Public speaking tips from TED - Visualized summary
 
ChatGPT and the Future of Work - Clark Boyd
ChatGPT and the Future of Work - Clark Boyd ChatGPT and the Future of Work - Clark Boyd
ChatGPT and the Future of Work - Clark Boyd
 
Getting into the tech field. what next
Getting into the tech field. what next Getting into the tech field. what next
Getting into the tech field. what next
 
Google's Just Not That Into You: Understanding Core Updates & Search Intent
Google's Just Not That Into You: Understanding Core Updates & Search IntentGoogle's Just Not That Into You: Understanding Core Updates & Search Intent
Google's Just Not That Into You: Understanding Core Updates & Search Intent
 
How to have difficult conversations
How to have difficult conversations How to have difficult conversations
How to have difficult conversations
 
Introduction to Data Science
Introduction to Data ScienceIntroduction to Data Science
Introduction to Data Science
 
Time Management & Productivity - Best Practices
Time Management & Productivity -  Best PracticesTime Management & Productivity -  Best Practices
Time Management & Productivity - Best Practices
 
The six step guide to practical project management
The six step guide to practical project managementThe six step guide to practical project management
The six step guide to practical project management
 
Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...
Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...
Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...
 

Evaluating NVMe drives for accelerating HBase

  • 1. Evaluating NVMe drives for accelerating HBase Nicolas Poggi and David Grier June 2017 BSC Data Centric Computing – Rackspace collaboration
  • 2. Outline 1. Intro and motivation 2. Technologies explained 3. Disk benchmarks 4. HBase use case with NVE 1. Read-only 2. Full workload 5. Summary 2 Max of bw (MB/s) Max of lat (us) BYTES/S REQ SIZE
  • 4. Barcelona Supercomputing Center (BSC) • Peak performance capacity of 13.7 Petaflop/s • 390 TB Total RAM • 14 PB storage (GPFS) • Heterogeneous architecture: • Intel Xeon: KNL and NKH • IBM Power9 + Latest NVIDIA, ARMv8 • Networking: InfiniBand EDR / Omni-path TBA July 1st
  • 5. ALOJA: towards cost-effective Big Data • Research project for automating characterization and optimization of Big Data deployments • Open source Benchmarking-to-Insights platform and tools • Largest Big Data public repository (70,000+ jobs) • Community collaboration with industry and academia http://aloja.bsc.es Big Data Benchmarking Online Repository Web / ML Analytics
  • 6. Motivation and objectives • Hadoop PoC cluster • Explore use cases where NVMe devices can speedup Big Data apps • Poor initial results… • Measure the possibilities of NVMe devices • Improve benchmarking techniques • Extend aloja into I/O testing First tests: 6 8512 8667 8523 9668 0 2000 4000 6000 8000 10000 12000 Seconds Lower is better Running time of terasort (1TB) under different disks terasort teragen Marginal improvement!!!
  • 7. NVMe technology providing communication to SSD devices through PCIe • Standard, open source drivers • Lower power consumption • Lower cost / TB than DRAM (but slower) • Street prices for 1.6TB $2K to $12K 7
  • 8. PoC cluster and drive specs All nodes (x5) Operating System CentOS 7.2 Memory 128GB CPU Single Octo-core (16 threads) Disk Config OS 2x600GB SAS RAID1 Network 10Gb/10Gb redundant Master node (x1, extra nodes for HA not used in these tests) Disk Config Master storage 4x600GB SAS RAID10 XSF partition Data Nodes (x4) NVMe (cache) 1.6TB Intel DC P3608 NVMe SSD (PCIe) Disk config HDFS data storage 10x 0.6TB NL-SAS/SATA JBOD PCIe3 x8, 12Gb/s SAS RAID SEAGATE ST3600057SS 15K (XFS partition) 8
  • 9. Tested drives (3NVMe + 1 SAS JBOD) PoC cluster 1. Intel DC P3608 (2015) • PCI-Express 3.0 x8 lanes • Capacity: 1.6TB (two drives) • Seq R/W BW: • 5000/2000 MB/s, (128k req) • Random 4k R/W IOPS: • 850k/150k • Random 8k R/W IOPS: • 500k/60k, 8 workers 2. HP 15K RPM JBOD • 10x 0.6TB NL-SAS/SATA JBOD • PCIe3 x8, 12Gb/s SAS RAID • SEAGATE ST3600057SS 15K (XFS partition) Other NVMe 3. HGST Ultrastar SN150 (2016) • PCI-Express 3.0 x4 lanes • Capacity: 1.6TB (two drives) • Seq R/W BW: • 6000/3000 MB/s, (128k req) • Random 4k R/W IOPS: • 743k/160k (per disk) 4. LSI Nytro WarpDrive 4-400 (old gen 2012) • PCI-Express 2.0 x8 lanes • Capacity: 1.6TB (two drives) • Seq. R/W BW: • 2000/1000 MB/s (256k req) • R/W IOPS: • 185k/120k (8k req) 9
  • 10. FIO Benchmarks Objectives: • Assert vendor specs (BW, IOPS, Latency) • Seq R/W, Random R/W • Verify driver/firmware and OS • Set performance expectations Commands on reference on last slides 10 Max of bw (MB/s) Max of lat (us) 0 5000000 10000000 15000000 20000000 25000000 524288 1048576 2097152 4194304 Bytes/s Req Size
  • 11. NVMe Disk BW at 1MB req size (high concurrency) Higher is better 11
  • 12. Disk BW at 1MB req size (mid concurrency) Higher is better 12
  • 13. IOPS at 4KB (mid concurrency) Higher is better 13
  • 15. HBase in a nutshell • Highly scalable Big data key-value store • On top of Hadoop (HDFS) • Based on Google’s Bigtable • Real-time and random access • Indexed • Low-latency • Block cache and Bloom Filters • Linear, modular scalability • Automatic sharding of tables and failover • Strictly consistent reads and writes. • failover support • Production ready and battle tested • Building block of other projects HBase R/W architecture 15 Source: HDP doc JVM Heap
  • 16. L2 Bucket Cache (BC) in HBase Region server (worker) memory with BC 16 • Adds a second “block” storage for HFiles • Use case: L2 cache and replaces OS buffer cache • Does copy-on‐read • Fixed sized, reserved on startup • 3 different modes: • Heap • Marginal improvement • Divides mem with the block cache • Offheap (in RAM) • Uses Java NIO’s Direct ByteBuffer • File • Any local file / device • Bypasses HDFS • Saves RAM
  • 17. HBASE-11425 faster results to be delivered through the utilization of bucketcache. The changes provide direct access to the bucketcache on the device it resides on rather than a copy to heap. 17
  • 18. L2-BucketCache experiments summary Tested configurations for HBase v1.2.4 1. HBase default (baseline) 2. HBase w/ Bucket cache Offheap 1. Size: 32GB /work node 3. HBase w/ Bucket cache in RAM disk 1. Size: 32GB /work node 4. HBase w/ Bucket cache in NVMe disk 1. Size: 250GB / worker node • All using same Hadoop and HDFS configuration • On JBOD (10 SAS disks) • 1 Replica, short-circuit reads Experiments 1. Read-only (workload C) 1. RAM at 128GB / node 2. RAM at 32GB / node 2. Full YCSB (workloads A-F) 3. RAM at 32GB / node • Payload: • YCSB 250M records. • ~2TB HDFS raw 18
  • 19. Experiment 1: Read-only (gets) YCSB Workload C: 25M records, 500 threads, ~2TB HDFS 19
  • 20. E 1.1: Throughput of the 4 configurations (128GB RAM) Higher is better 20 Results: • Ops/Sec improve with BC • Offheap 1.6x • RAMd 1.5x • NVMe 2.3x • AVG latency as well • (req time) • First run slower on 4 cases (writes OS cache and BC) • 3rd run not faster, cache already loaded • Tested onheap config: • > 8GB failed • 8GB slower than baseline 0 50000 100000 150000 200000 250000 300000 Baseline BucketCache Offheap BucketCache RAM disk BucketCache NVMe Ops/Sec Throughput of 3 consecutive iterations of Workload C (128GB)
  • 21. E 1.1 Cluster resource consumption: Baseline 21 CPU % (AVG) Disk R/W MB/s (SUM) Mem Usage KB (AVG) NET R/W Mb/s (SUM) Notes: • Java heap and OS buffer cache holds 100% of WL • Data is read from disks (HDFS) only in first part of the run, • then throughput stabilizes (see NET and CPU) • Free resources, • Bottleneck in application and OS path (not shown) Write 3Gb/s Read 3 Gb/s
  • 22. E 1.1 Cluster resource consumption: Bucket Cache strategies 22 Offheap (32GB) Disk R/W RAM disk (tmpfs 32GB) Disk R/W NVMe (250GB) Disk R/WNotes: • 3 BC strategies faster than baseline • BC LRU more effective than OS buffer • Offheap slightly more efficient ran RAM drive (same size) • But seems to take longer to fill (different per node) • And more take more space for same payload (plus java heap) • NVM can hold the complete WL in the BC • Read from and Writes to MEM not captured by charts BC fills on 1st run WL doesn’t fit completely
  • 23. E 1.2-3 Challenge: Limit OS buffer cache effect on experiments 1st approach, larger payload. Cons: high execution time 2nd limit available RAM (using stress tool) 3rd clear buffer cache periodically (drop caches) 23
  • 24. E1.2: Throughput of the 4 configurations (32GB RAM) Higher is betterfor Ops (Y1), lowerfor latency (Y2) 24 Results: • Ops/Sec improve only with NVMe up to 8X • RAMd performs close to baseline • First run same on baseline and RAMd • tmpfs “blocks” as RAM is needed • At lower capacity, external BC shows more improvement 20578 14715.493 21520 109488 20598 16995 21534 166588 0 20000 40000 60000 80000 100000 120000 140000 160000 180000 Baseline BucketCache Offheap BucketCache RAM disk BucketCache NVMe Ops/Sec Throughput of 2 consecutive iterations of Workload C (32GB)
  • 25. E 1.1 Cluster CPU% AVG: Bucket Cache (32GB RAM) 25 Baseline RAM disk (/dev/shm 8GB) Offheap (4GB) NVMe (250GB) Read disk throughput: 2.4GB/s Read disk throughput: 2.8GB/s Read disk throughput: 2.5GB/s Read disk throughput: 38.GB/s BC failure Slowest
  • 26. Experiment 2: All workloads (-E) YCSB workloads A-D, F: 25M records, 1000 threads 26
  • 27. Benchmark suite: The Yahoo! Cloud Serving Benchmark (YCSB) • Open source specification and kit, for comparing NoSQL DBs. (since 2010) • Core workloads: • A: Update heavy workload • 50/50 R/W. • B: Read mostly workload • 95/5 R/W mix. • C: Read only • 100% read. • D: Read latest workload • Inserts new records and reads them • Workload E: Short ranges (Not used, takes too long to run SCAN type) • Short ranges of records are queried, instead of individual records • F: Read-modify-write • read a record, modify it, and write back. https://github.com/brianfrankcooper/YCSB/wiki/Core-Workloads 27
  • 28. E2.1: Throughput and Speedup ALL (32GB RAM) Baseline BucketCache RAM disk BucketCache NVMe Datagen 68821 60819 67737 WL A 55682 56702 76315 WL B 33551 43978 79895 WL C 30631 43420 81245 WL D 85725 94631 128881 WL F 25540 32464 50568 0 20000 40000 60000 80000 100000 120000 140000 Ops/Sec Throughput of workloads A-D,F (128GB, 1 iteration) Higher is better 29 Results: • Datagen slower with the RAMd (less OS RAM) Overall: Ops/Sec improve with BC • RAMd 17% • NVMe 87% • WL C gets higher speedup with NVMe • WL F now faster with NVMe • Need to run more iterations to see max improvement
  • 29. E2.1: CPU and Disk for ALL (32GB RAM) 30 Baseline CPU % NVMe CPU% Baseline Disk R/w NVMe Disk R/W High I/O wait Read disk throughput: 1.8GB/s Moderate I/O wait (2x less time) Read disk throughput: up to 25GB/s
  • 30. Summary Lessons learned, findings, conclusions, references 31
  • 31. Bucket Cache results recap (medium sized WL) • Full cluster (128GB RAM / node) • WL-C up to 2.7x speedup (warm cache) • Full benchmark (CRUD) from 0.3 to 0.9x speedup (cold cache) • Limiting resources • 32GB RAM / node • WL-C NVMe gets up to 8X improvement (warm cache) • Other techniques failed/poor results • Full benchmark between 0.4 and 2.7x speedup (cold cache) • Latency reduces significantly with cached results • Onheap BC not recommended • just give more RAM to BlockCache 32
  • 32. Advances in Hardware Intel currently provides 3D Xpoint • Demonstrated at 2.4−3× performance improvement over PCIe NAND Flash storage https://en.wikipedia.org/wiki/3D_XPoint https://arstechnica.com/information-technology/2017/02/specs-for- first-intel-3d-xpoint-ssd-so-so-transfer-speed-awesome-random-io/ 33
  • 33. To conclude… • NVMe offers significant BW and Latency improvement over SAS/SATA, but • JBODs still perform well for seq R/W • Also cheaper $/TB • Big Data apps still designed for rotational (avoid random I/O) • Full tiered-storage support is missing by Big Data frameworks • Byte addressable vs. block access • Need to rely on external tools/file systems • Alluxio (Tachyon), Triple-H, New file systems (SSDFS), … • OS buffer cache is highly effective, but not as the LRU BC • Fast devices speedup, but still caching is the common use case… 34
  • 34. References • ALOJA • http://aloja.bsc.es • https://github.com/aloja/aloja • Bucket cache and HBase • BlockCache (and Bucket Cache 101) http://www.n10k.com/blog/blockcache-101/ • Intel brief on bucket cache: http://www.intel.com/content/dam/www/public/us/en/documents/solution-briefs/apache- hbase-block-cache-testing-brief.pdf • http://www.slideshare.net/larsgeorge/hbase-status-report-hadoop-summit-europe-2014 • HBase performance: http://www.slideshare.net/bijugs/h-base-performance • https://blogs.apache.org/hbase/entry/offheaping_the_read_path_in • Benchmarks • FIO https://github.com/axboe/fio • Brian F. Cooper, et. Al. 2010. Benchmarking cloud serving systems with YCSB. http://dx.doi.org/10.1145/1807128.1807152 35
  • 35. Thanks, questions? Follow up / feedback : Nicolas.Poggi@bsc.es and David.Grier@objectrocket.com Twitter: @ni_po and @grierdavid Evaluating NVMe drives for accelerating HBase
  • 36. FIO commands: • Sequential read • fio --name=read --directory=${dir} --ioengine=libaio --direct=1 --bs=${bs} --rw=read --iodepth=${iodepth} --numjobs=1 --buffered=0 --size=2gb -- runtime=30 --time_based --randrepeat=0 --norandommap --refill_buffers --output-format=json • Random read • fio --name=randread --directory=${dir} --ioengine=libaio --direct=1 --bs=${bs} --rw=randread --iodepth=${iodepth} --numjobs=${numjobs} -- buffered=0 --size=2gb --runtime=30 --time_based --randrepeat=0 --norandommap --refill_buffers --output-format=json • Sequential read on raw devices: • fio --name=read --filename=${dev} --ioengine=libaio --direct=1 --bs=${bs} --rw=read --iodepth=${iodepth} --numjobs=1 --buffered=0 --size=2gb -- runtime=30 --time_based --randrepeat=0 --norandommap --refill_buffers --output-format=json • Sequential write • fio --name=write --directory=${dir} --ioengine=libaio --direct=1 --bs=${bs} --rw=write --iodepth=${iodepth} --numjobs=1 --buffered=0 --size=2gb -- runtime=30 --time_based --randrepeat=0 --norandommap --refill_buffers --output-format=json • Random write • fio --name=randwrite --directory=${dir} --ioengine=libaio --direct=1 --bs=${bs} --rw=randwrite --iodepth=${iodepth} --numjobs=${numjobs} -- buffered=0 --size=2gb --runtime=30 --time_based --randrepeat=0 --norandommap --refill_buffers --output-format=json • Random read on raw devices: • fio --name=randread --filename=${dev} --ioengine=libaio --direct=1 --bs=${bs} --rw=randread --iodepth=${iodepth} --numjobs=${numjobs} -- buffered=0 --size=2gb --runtime=30 --time_based --randrepeat=0 --norandommap --refill_buffers --offset_increment=2gb --output-format=json https://github.com/axboe/fio 37