Enlightening the I/O Path:
A Holistic Approach for Application Performance
USENIX FAST'17
Sangwook Kim (1, 3), Hwanju Kim (2), Joonwon Lee (3), and Jinkyu Jeong (3)
(1) Apposha
(2) Dell EMC
(3) Sungkyunkwan University
Data-Intensive Applications
• Relational
• Key-value
• Search
• Column
• Document
Data-Intensive Applications
• Common structure
[Figure: a client exchanges requests and responses with an application whose tasks T1-T4 issue I/Os through the operating system to the storage device; the request/response path is labeled "Application performance"]
* Example: MongoDB
- Client (foreground)
- Checkpointer
- Log writer
- Eviction worker
- …
Background tasks are problematic
for application performance
Application Impact
• Illustrative experiment
• YCSB update-heavy workload against MongoDB
[Figure: operation throughput (ops/sec) vs. elapsed time (sec) for CFQ, CFQ-IDLE, SPLIT-A, SPLIT-D, and QASIO]
- CFQ: regular checkpoint task; 30 seconds latency at 99.99th percentile
- CFQ-IDLE: I/O priority does not help
- SPLIT-A, SPLIT-D, QASIO: state-of-the-art schedulers do not help much
What’s the Problem?
• Independent policies in multiple layers
• Each layer processes I/Os w/ limited information
• I/O priority inversion
• Background I/Os can arbitrarily delay foreground tasks
Multiple Independent Layers
• Independent I/O processing
[Figure: I/O stack below the abstraction boundary: Application, Caching Layer, File System Layer, Block Layer, Storage Device]
- Application: issues read() and write()
- Caching layer: buffer cache with admission control
- Block layer: block-level queue with admission control; reorders FG and BG I/Os
- Storage device: device-internal queue with admission control; reorders FG and BG I/Os
I/O Priority Inversion
• Task dependency
- Locks: an FG task waits on a lock held by a BG task that is doing I/O
- Condition variables: an FG task waits until a BG task wakes it
- User-level variables: an FG task waits until a BG task wakes it
• I/O dependency
- An FG task waits on outstanding I/Os for ensuring consistency and/or durability
Our Approach
• Request-centric I/O prioritization (RCP)
• Critical I/O: I/O in the critical path of request handling
• Policy: holistically prioritizes critical I/Os along the I/O path
[Figure: operation throughput (ops/sec) vs. elapsed time (sec) for CFQ, CFQ-IDLE, SPLIT-A, SPLIT-D, QASIO, and RCP; callout: 100 ms latency at 99.99th percentile]
Challenges
• How to accurately identify I/O criticality
• How to effectively enforce I/O criticality
Critical I/O Detection
• Enlightenment API
• Interface for tagging foreground tasks
• I/O priority inheritance
• Handling task dependency
• Handling I/O dependency
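The tagging interface above can be pictured as a small registry that the application updates and the I/O path consults. This is only a sketch under stated assumptions: the class `Enlightenment` and the names `tag_foreground`, `untag`, `is_critical`, and `classify_io` are hypothetical and are not taken from the slides, which do not show the actual API.

```python
import threading


class Enlightenment:
    """Hypothetical registry of task IDs tagged as foreground (critical)."""

    def __init__(self):
        self._foreground = set()
        self._guard = threading.Lock()

    def tag_foreground(self, task_id):
        # The application marks a task as being on the request-handling path.
        with self._guard:
            self._foreground.add(task_id)

    def untag(self, task_id):
        # Remove the tag, e.g. when the task stops serving requests.
        with self._guard:
            self._foreground.discard(task_id)

    def is_critical(self, task_id):
        with self._guard:
            return task_id in self._foreground


def classify_io(registry, task_id):
    """Label an I/O by the tag of the task that issues it."""
    return "critical" if registry.is_critical(task_id) else "non-critical"
```

Tagging alone does not cover I/Os issued by untagged tasks that a foreground task depends on, which is why the slide pairs it with I/O priority inheritance.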
I/O Priority Inheritance
• Handling task dependency
• Locks
- FG lock -> BG inherit -> BG I/O submit -> complete -> BG unlock
• Condition variables
- BG register (on the CV) -> FG wait -> BG inherit -> BG I/O submit -> complete -> wake FG
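The lock case of I/O priority inheritance can be sketched as a single-threaded simulation. It makes no claim about the kernel implementation: `Task`, `InheritingLock`, and their fields are hypothetical names chosen for illustration. The lock owner becomes critical while any critical task waits on the lock, and loses that status on unlock.

```python
class Task:
    def __init__(self, name, critical=False):
        self.name = name
        self.base_critical = critical    # set by tagging (foreground task)
        self.inherited_from = set()      # names of critical waiters

    @property
    def critical(self):
        # Critical if tagged, or while inheriting from a critical waiter.
        return self.base_critical or bool(self.inherited_from)


class InheritingLock:
    def __init__(self):
        self.owner = None
        self.waiters = []

    def acquire(self, task):
        """Return True if acquired; otherwise queue the task as a waiter."""
        if self.owner is None:
            self.owner = task
            return True
        self.waiters.append(task)
        if task.critical:
            # The current owner inherits criticality from the blocked waiter.
            self.owner.inherited_from.add(task.name)
        return False

    def release(self):
        """Drop inherited criticality and hand the lock to the next waiter."""
        old_owner = self.owner
        for waiter in self.waiters:
            old_owner.inherited_from.discard(waiter.name)
        self.owner = self.waiters.pop(0) if self.waiters else None
        if self.owner is not None:
            # The new owner inherits from any critical task still waiting.
            for waiter in self.waiters:
                if waiter.critical:
                    self.owner.inherited_from.add(waiter.name)
        return self.owner
```

For example, a checkpointer that holds the lock is non-critical until a tagged client blocks on it; its I/Os are then treated as critical until it unlocks.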
I/O Priority Inheritance
• Handling I/O dependency
- Block layer stages: queue admission stage, then scheduler queueing stage
- Non-critical I/O tracking: a per-device root (PER-DEV ROOT) holds non-critical I/O (NCIO) entries
- NCIO entry: descriptor and location; the resolver looks entries up by sector #
- Entry lifecycle: allocate at the queue admission stage, update as the I/O moves through the stages, delete on completion
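The non-critical I/O tracking shown in these slides can be sketched as a per-device table keyed by sector number. This is a minimal illustration, not the kernel data structure: `NcioTracker`, its method names, and the stage constants are assumptions made for the example.

```python
ADMISSION = "queue admission stage"
SCHED_QUEUE = "scheduler queueing stage"


class NcioTracker:
    """Hypothetical per-device index of in-flight non-critical I/Os."""

    def __init__(self):
        self._per_dev = {}  # device name -> {sector: entry}

    def allocate(self, dev, sector, descriptor):
        # Called when a non-critical I/O enters the admission stage.
        entry = {"descriptor": descriptor, "location": ADMISSION, "critical": False}
        self._per_dev.setdefault(dev, {})[sector] = entry
        return entry

    def update(self, dev, sector, location):
        # Record the stage the I/O has moved to.
        self._per_dev[dev][sector]["location"] = location

    def resolve(self, dev, sector):
        # Look up the tracked I/O that a foreground task is waiting on.
        return self._per_dev.get(dev, {}).get(sector)

    def promote(self, dev, sector):
        # Mark the I/O critical and report where it must be reprioritized.
        entry = self.resolve(dev, sector)
        if entry is None:
            return None
        entry["critical"] = True
        return entry["location"]

    def complete(self, dev, sector):
        # Delete the entry when the I/O completes.
        self._per_dev.get(dev, {}).pop(sector, None)
```

Recording the location lets the caller know whether the I/O is still waiting for admission or is already in a scheduler queue when it has to be promoted.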
Handling Transitive Dependency
• Possible states of dependent task
- Blocked on task
- Blocked on I/O
- Blocked at admission stage
• Recording blocking status
- Blocked on task: the task is recorded, so the priority is inherited by that task in turn
- Blocked on I/O: the I/O is recorded, so it is reprioritized (reprio)
- Blocked at admission stage: the BG task retries admission (retry)
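Recording the blocking status lets inheritance follow a chain of dependencies. The sketch below assumes each task stores what it is blocked on as a `(kind, target)` pair; the classes `IO` and `Task`, the function `inherit`, and the action labels are hypothetical and only mirror the three states named on the slide.

```python
class IO:
    def __init__(self, name):
        self.name = name
        self.critical = False


class Task:
    def __init__(self, name):
        self.name = name
        self.critical = False
        self.blocked_on = None        # ("task", Task) | ("io", IO) | ("admission", None)
        self.retry_admission = False


def inherit(task):
    """Propagate criticality along the recorded blocking chain."""
    actions = []
    seen = set()
    while task is not None and id(task) not in seen:
        seen.add(id(task))            # guard against dependency cycles
        task.critical = True
        actions.append(("inherit", task.name))
        kind, target = task.blocked_on or (None, None)
        if kind == "task":
            task = target             # keep following the chain
        elif kind == "io":
            target.critical = True    # reprioritize the recorded I/O
            actions.append(("reprio", target.name))
            task = None
        elif kind == "admission":
            task.retry_admission = True   # retry admission as critical
            actions.append(("retry", task.name))
            task = None
        else:
            task = None               # runnable: nothing more to do
    return actions
```

A chain such as "FG waits on BG1, BG1 waits on BG2, BG2 waits on an I/O" then ends with that I/O being reprioritized.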
Challenges
• How to accurately identify I/O criticality
• Enlightenment API
• I/O priority inheritance
• Recording blocking status
• How to effectively enforce I/O criticality
Criticality-Aware I/O Prioritization
• Caching layer
• Apply low dirty ratio for non-critical writes (1% by default)
• Block layer
• Isolate allocation of block queue slots
• Maintain 2 FIFO queues
• Schedule critical I/O first
• Limit # of outstanding non-critical I/Os (1 by default)
• Support queue promotion to resolve I/O dependency
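The block-layer policy listed above (two FIFO queues, critical I/O first, a cap on outstanding non-critical I/Os, and queue promotion) can be sketched as follows. `TwoQueueScheduler` and its method names are assumptions for illustration; only the default limit of 1 comes from the slide.

```python
from collections import deque


class TwoQueueScheduler:
    """Hypothetical criticality-aware dispatcher with two FIFO queues."""

    def __init__(self, max_noncritical_inflight=1):
        self.critical = deque()
        self.noncritical = deque()
        self.max_noncritical_inflight = max_noncritical_inflight
        self.noncritical_inflight = 0

    def submit(self, io_id, critical):
        (self.critical if critical else self.noncritical).append(io_id)

    def promote(self, io_id):
        # Move a queued non-critical I/O to the critical queue
        # when a critical task turns out to depend on it.
        if io_id in self.noncritical:
            self.noncritical.remove(io_id)
            self.critical.append(io_id)
            return True
        return False

    def dispatch(self):
        # Critical I/Os always go first, in FIFO order.
        if self.critical:
            return self.critical.popleft()
        # Non-critical I/Os are throttled by the in-flight limit.
        if self.noncritical and self.noncritical_inflight < self.max_noncritical_inflight:
            self.noncritical_inflight += 1
            return self.noncritical.popleft()
        return None

    def complete_noncritical(self):
        # Called when a dispatched non-critical I/O finishes.
        self.noncritical_inflight = max(0, self.noncritical_inflight - 1)
```

Keeping few non-critical I/Os outstanding limits how long a newly arrived critical I/O can be held up behind I/Os that have already been dispatched.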
Evaluation
• Implementation on Linux 3.13 w/ ext4
• Application studies
• PostgreSQL relational database
• Backend processes as foreground tasks
• I/O priority inheritance on LWLocks (semop)
• MongoDB document store
• Client threads as foreground tasks
• I/O priority inheritance on Pthread mutex and condition vars (futex)
• Redis key-value store
• Master process as foreground task
Evaluation
• Experimental setup
• 2 Dell PowerEdge R530 (server & client)
• 1TB Micron MX200 SSD
• I/O prioritization schemes
• CFQ (default), CFQ-IDLE
• SPLIT-A (priority), SPLIT-D (deadline) [SOSP’15]
• QASIO [FAST’15]
• RCP
Application Throughput
• PostgreSQL w/ TPC-C workload
[Figure: transaction throughput (trx/sec) for the 10GB, 60GB, and 200GB datasets under CFQ, CFQ-IDLE, SPLIT-A, SPLIT-D, QASIO, and RCP; callouts: 37%, 31%, 28%]
Application Throughput
• Impact on background task
[Figure: transaction log size (GB) vs. elapsed time (sec) for CFQ, CFQ-IDLE, SPLIT-A, SPLIT-D, QASIO, and RCP]
Our scheme improves
application throughput
w/o penalizing
background tasks
Application Latency
• PostgreSQL w/ TPC-C workload
[Figure: CCDF P[X>=x] (10^0 down to 10^-5, i.e. 0th to 99.999th percentile) vs. transaction latency (msec) for CFQ, CFQ-IDLE, SPLIT-A, SPLIT-D, QASIO, and RCP]
- Existing schemes: over 2 sec at 99.9th
- RCP: 300 msec at 99.999th
Our scheme is effective
for improving tail latency
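The latency plots pair each CCDF value P[X>=x] with a percentile label: a CCDF value p corresponds to the 100 * (1 - p) percentile, so 10^-3 is the 99.9th percentile. A short sketch of that arithmetic follows; the function names are illustrative and not from the talk.

```python
def empirical_ccdf(samples, x):
    """Fraction of samples with latency >= x, i.e. P[X >= x]."""
    return sum(1 for s in samples if s >= x) / len(samples)


def ccdf_to_percentile(p_exceed):
    """Percentile label that corresponds to a CCDF value."""
    return 100.0 * (1.0 - p_exceed)
```

Resolving the 99.999th percentile requires on the order of 10^5 samples or more, since only about one sample in 100,000 lies at or beyond that point.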
Summary of Other Results
• Performance results
• MongoDB: 12%-201% higher throughput, 5x-20x lower latency at 99.9th percentile
• Redis: 7%-49% higher throughput, 2x-20x lower latency at 99.9th percentile
• Analysis results
• System latency analysis using LatencyTOP
• System throughput vs. Application latency
• Need for holistic approach
Conclusions
• Key observation
• All the layers in the I/O path should be considered as a whole, with I/O priority inversion in mind, for effective I/O prioritization
• Request-centric I/O prioritization
• Enlightens the I/O path solely for application performance
• Improves throughput and latency of real applications
• Ongoing work
• Practicalizing implementation
• Applying RCP to database cluster with multiple replicas
Thank You!
• Questions and comments
• Contact
• sangwook@apposha.io
Diamond Application Development Crafting Solutions with Precision
 
CHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICECHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
 
Exploring iOS App Development: Simplifying the Process
Exploring iOS App Development: Simplifying the ProcessExploring iOS App Development: Simplifying the Process
Exploring iOS App Development: Simplifying the Process
 
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdfLearn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
 
Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...
Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...
Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...
 
Unlocking the Future of AI Agents with Large Language Models
Unlocking the Future of AI Agents with Large Language ModelsUnlocking the Future of AI Agents with Large Language Models
Unlocking the Future of AI Agents with Large Language Models
 
How To Troubleshoot Collaboration Apps for the Modern Connected Worker
How To Troubleshoot Collaboration Apps for the Modern Connected WorkerHow To Troubleshoot Collaboration Apps for the Modern Connected Worker
How To Troubleshoot Collaboration Apps for the Modern Connected Worker
 

Enlightening the I/O Path: A Holistic Approach for Application Performance (USENIX FAST'17)

  • 1. Enlightening the I/O Path: A Holistic Approach for Application Performance • Sangwook Kim (1, 3), Hwanju Kim (2), Joonwon Lee (3), and Jinkyu Jeong (3) • Affiliations: (1) Apposha, (2) Dell EMC, (3) Sungkyunkwan University
  • 3. Data-Intensive Applications • Common structure [Diagram labels: Client, Request, Response, Application threads T1-T4 each issuing I/O, Operating System, Storage Device, Application performance] • Example: MongoDB - Client (foreground) - Checkpointer - Log writer - Eviction worker - …
  • 4. Data-Intensive Applications • Common structure [Diagram: same as slide 3] • Example: MongoDB - Server (client) - Checkpointer - Log writer - Evict worker - … • Background tasks are problematic for application performance
  • 5. Application Impact • Illustrative experiment • YCSB update-heavy workload against MongoDB
  • 6. Application Impact • Illustrative experiment • YCSB update-heavy workload against MongoDB [Figure: operation throughput (ops/sec) vs. elapsed time (sec) for CFQ; annotations: regular checkpoint task; 30 seconds latency at 99.99th percentile]
  • 7. Application Impact • Illustrative experiment • YCSB update-heavy workload against MongoDB [Figure: operation throughput (ops/sec) vs. elapsed time (sec) for CFQ and CFQ-IDLE] • I/O priority does not help
  • 8. Application Impact • Illustrative experiment • YCSB update-heavy workload against MongoDB [Figure: operation throughput (ops/sec) vs. elapsed time (sec) for CFQ, CFQ-IDLE, SPLIT-A, SPLIT-D, and QASIO] • State-of-the-art schedulers do not help much
  • 9. What’s the Problem? • Independent policies in multiple layers • Each layer processes I/Os w/ limited information • I/O priority inversion • Background I/Os can arbitrarily delay foreground tasks
  • 11. Multiple Independent Layers • Independent I/O processing [Diagram: I/O stack of Application, Caching Layer, File System Layer, Block Layer, Storage Device; label: Abstraction]
  • 12. Multiple Independent Layers • Independent I/O processing [Diagram adds labels: read(), write(), Buffer Cache, admission control]
  • 13. Multiple Independent Layers • Independent I/O processing [Diagram adds labels: Block-level Q, admission control]
  • 14. Multiple Independent Layers • Independent I/O processing [Diagram adds labels: reorder, with foreground (FG) and background (BG) I/Os]
  • 15. Multiple Independent Layers • Independent I/O processing [Diagram adds labels: Device-internal Q, admission control]
  • 16. Multiple Independent Layers • Independent I/O processing [Diagram adds labels: reorder, with FG and BG I/Os in the device-internal queue]
  • 17. What’s the Problem? • Independent policies in multiple layers • Each layer processes I/Os w/ limited information • I/O priority inversion • Background I/Os can arbitrarily delay foreground tasks
  • 18. I/O Priority Inversion • Task dependency [Diagram: I/O stack of Application, Caching Layer, File System Layer, Block Layer, Storage Device; labels: Locks, Condition variables]
  • 19. I/O Priority Inversion • Task dependency [Diagram labels: FG, lock, BG, I/O, wait; Condition variables]
  • 20. I/O Priority Inversion • Task dependency [Diagram labels: FG, lock, BG, I/O, wait; FG, wait, BG, var, wake]
  • 21. I/O Priority Inversion • Task dependency [Diagram labels: FG, wait, BG, I/O; FG, wait, user var, wake]
  • 22. I/O Priority Inversion • I/O dependency [Diagram: same I/O stack; label: Outstanding I/Os]
  • 23. I/O Priority Inversion • I/O dependency [Diagram labels: FG, wait, I/O] • For ensuring consistency and/or durability
  • 24. Our Approach • Request-centric I/O prioritization (RCP) • Critical I/O: I/O in the critical path of request handling • Policy: holistically prioritizes critical I/Os along the I/O path [Figure: operation throughput (ops/sec) vs. elapsed time (sec) for CFQ, CFQ-IDLE, SPLIT-A, SPLIT-D, QASIO, and RCP; annotation: 100 ms latency at 99.99th percentile]
  • 25. Challenges • How to accurately identify I/O criticality • How to effectively enforce I/O criticality
  • 26. Critical I/O Detection • Enlightenment API: interface for tagging foreground tasks • I/O priority inheritance: handling task dependency; handling I/O dependency
  • 27. I/O Priority Inheritance • Handling task dependency • Locks [Diagram labels: FG, lock, BG, inherit, I/O submit, complete, unlock] • Condition variables [Diagram labels: FG, wait, CV, BG, register, inherit, I/O submit, complete, wake]
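
The lock case on slide 27 can be illustrated with a small user-space model. This is a minimal sketch, not the authors' kernel implementation: the `Task` and `InheritingLock` names are hypothetical, tagging a task as foreground is modeled as a plain `critical` flag, and a task is assumed to hold at most one lock at a time.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Task:
    name: str
    critical: bool = False   # hypothetical stand-in for "tagged as foreground"
    inherited: bool = False  # True while boosted on behalf of a blocked critical task

    @property
    def effective_critical(self) -> bool:
        return self.critical or self.inherited


class InheritingLock:
    """Toy lock whose owner inherits criticality from a blocked critical waiter."""

    def __init__(self) -> None:
        self.owner: Optional[Task] = None
        self.waiters: List[Task] = []

    def acquire(self, task: Task) -> bool:
        """Return True if the lock was taken, False if the task must wait."""
        if self.owner is None:
            self.owner = task
            return True
        self.waiters.append(task)
        if task.effective_critical:
            # The owner now sits on the critical path, so its I/O becomes critical.
            self.owner.inherited = True
        return False

    def release(self, task: Task) -> Optional[Task]:
        """Drop the boost, hand the lock to the oldest waiter, and return it."""
        assert self.owner is task
        task.inherited = False  # simplification: assumes no other lock is held
        self.owner = self.waiters.pop(0) if self.waiters else None
        if self.owner is not None and any(w.effective_critical for w in self.waiters):
            self.owner.inherited = True
        return self.owner
```

In this model a background checkpointer that holds the lock is treated as critical only for as long as a foreground client is blocked behind it, which is the behavior the "inherit" and "unlock" labels on the slide suggest.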
  • 28. I/O Priority Inheritance • Handling I/O dependency
  • 29. I/O Priority Inheritance • Handling I/O dependency [Diagram: I/Os in the Block Layer]
  • 30. I/O Priority Inheritance • Handling I/O dependency [Diagram: Block Layer with a queue admission stage and a scheduler queueing stage]
  • 31. I/O Priority Inheritance • Handling I/O dependency [Diagram adds: non-critical I/O tracking; labels: PER-DEV ROOT, NCIO entries, Descriptor, Location, Resolver, Sector #]
  • 32. I/O Priority Inheritance • Handling I/O dependency [Diagram adds label: allocate]
  • 33. I/O Priority Inheritance • Handling I/O dependency [Diagram adds label: update]
  • 34. I/O Priority Inheritance • Handling I/O dependency [Diagram: I/O moves to the scheduler queueing stage; label: update]
  • 35. I/O Priority Inheritance • Handling I/O dependency [Diagram: same as slide 34]
  • 36. I/O Priority Inheritance • Handling I/O dependency [Diagram: same as slide 34]
  • 37. I/O Priority Inheritance • Handling I/O dependency [Diagram adds label: delete on completion]
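
The allocate, update, and delete-on-completion steps in slides 31-37 can be sketched as a per-device index keyed by sector number. This is an illustrative model only: the class and method names are hypothetical, the descriptor is a plain string, and the "Resolver" field shown on the slides is not modeled because the transcript does not say what it holds.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class NCIO:
    """One tracked non-critical I/O (fields named after the slide labels)."""
    sector: int       # index key
    descriptor: str   # stand-in for the real I/O descriptor
    location: str     # current stage, e.g. "admission" or "sched_queue"


class NonCriticalIOTracker:
    """Per-device table of outstanding non-critical I/Os, indexed by sector."""

    def __init__(self) -> None:
        self._by_sector: Dict[int, NCIO] = {}

    def allocate(self, sector: int, descriptor: str) -> NCIO:
        # Called when a non-critical I/O enters the admission stage.
        entry = NCIO(sector, descriptor, "admission")
        self._by_sector[sector] = entry
        return entry

    def update(self, sector: int, location: str) -> None:
        # Called when the I/O moves to another stage of the block layer.
        self._by_sector[sector].location = location

    def complete(self, sector: int) -> None:
        # Delete on completion.
        self._by_sector.pop(sector, None)

    def lookup(self, sector: int) -> Optional[NCIO]:
        # Lets a critical task find the outstanding I/O it depends on.
        return self._by_sector.get(sector)
```

A foreground task that must wait for a given sector could call `lookup` to learn where the pending I/O currently sits, which is the information needed to reprioritize it.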
  • 38. Handling Transitive Dependency • Possible states of dependent task: blocked on task; blocked on I/O; blocked at admission stage [Diagram: in each case FG's priority is inherited by a BG task that is itself waiting]
  • 39. Handling Transitive Dependency • Recording blocking status [Diagram: the three states from slide 38]
  • 40. Handling Transitive Dependency • Recording blocking status [Diagram, blocked on task: task is recorded; label: inherit]
  • 41. Handling Transitive Dependency • Recording blocking status [Diagram, blocked on I/O: I/O is recorded; label: reprio]
  • 42. Handling Transitive Dependency • Recording blocking status [Diagram, blocked at admission stage: label: retry]
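
One way to read slides 38-42 is that each task records what it is blocked on, so that an inherited priority can be pushed further down the chain. The sketch below models that idea under stated assumptions: `Task`, `IORequest`, `propagate`, and the action names are all hypothetical, and the three cases simply mirror the slide labels "inherit", "reprio", and "retry".

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple, Union


@dataclass
class IORequest:
    sector: int
    critical: bool = False


@dataclass
class Task:
    name: str
    critical: bool = False
    # Recorded blocking status: the task or I/O this task is waiting on, if any.
    blocked_on: Optional[Union["Task", IORequest]] = None
    # True if the task is waiting for a free slot at the admission stage.
    at_admission: bool = False


def propagate(task: Task) -> List[Tuple[str, str]]:
    """Boost `task` and everything it transitively waits on; return the actions taken."""
    actions: List[Tuple[str, str]] = []
    seen = set()
    current: Optional[Union[Task, IORequest]] = task
    while isinstance(current, Task) and id(current) not in seen:
        seen.add(id(current))          # guard against dependency cycles
        current.critical = True
        actions.append(("inherit", current.name))
        if current.at_admission:
            # A boosted task retries admission instead of staying blocked.
            actions.append(("retry", current.name))
        current = current.blocked_on
    if isinstance(current, IORequest):
        current.critical = True        # reprioritize the recorded I/O
        actions.append(("reprio", f"sector {current.sector}"))
    return actions
```

Calling `propagate` on the background task a foreground task depends on walks the recorded chain once, so a dependency several hops deep is handled without the foreground task knowing about it.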
  • 43. Challenges • How to accurately identify I/O criticality: Enlightenment API; I/O priority inheritance; recording blocking status • How to effectively enforce I/O criticality
  • 44. Criticality-Aware I/O Prioritization • Caching layer: apply low dirty ratio for non-critical writes (1% by default) • Block layer: isolate allocation of block queue slots; maintain 2 FIFO queues; schedule critical I/O first; limit # of outstanding non-critical I/Os (1 by default); support queue promotion to resolve I/O dependency
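
The block-layer policy on slide 44 (two FIFO queues, critical first, a cap on outstanding non-critical I/Os, and queue promotion) can be sketched as follows. This is a simplified model rather than the actual scheduler: the class name, method names, and the use of string I/O identifiers are assumptions made for illustration, and only the default limit of 1 comes from the slide.

```python
from collections import deque
from typing import Deque, Optional, Set


class TwoQueueScheduler:
    """Toy dispatcher with one FIFO for critical and one for non-critical I/Os."""

    def __init__(self, max_noncritical_outstanding: int = 1) -> None:
        self.critical: Deque[str] = deque()
        self.noncritical: Deque[str] = deque()
        self.limit = max_noncritical_outstanding
        self._inflight_noncritical: Set[str] = set()

    def submit(self, io_id: str, critical: bool) -> None:
        (self.critical if critical else self.noncritical).append(io_id)

    def promote(self, io_id: str) -> bool:
        """Move a queued non-critical I/O to the critical queue."""
        try:
            self.noncritical.remove(io_id)
        except ValueError:
            return False  # not queued here (already dispatched or unknown)
        self.critical.append(io_id)
        return True

    def dispatch(self) -> Optional[str]:
        """Pick the next I/O to send to the device, or None if nothing is eligible."""
        if self.critical:
            return self.critical.popleft()
        if self.noncritical and len(self._inflight_noncritical) < self.limit:
            io_id = self.noncritical.popleft()
            self._inflight_noncritical.add(io_id)
            return io_id
        return None

    def complete(self, io_id: str) -> None:
        self._inflight_noncritical.discard(io_id)
```

With the default limit, a second background write stays queued until the first one completes, unless a foreground task comes to depend on it and it is promoted.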
  • 45. Evaluation • Implementation on Linux 3.13 w/ ext4 • Application studies • PostgreSQL relational database: backend processes as foreground tasks; I/O priority inheritance on LWLocks (semop) • MongoDB document store: client threads as foreground tasks; I/O priority inheritance on Pthread mutex and condition vars (futex) • Redis key-value store: master process as foreground task
  • 46. Evaluation • Experimental setup: 2 Dell PowerEdge R530 (server & client); 1TB Micron MX200 SSD • I/O prioritization schemes: CFQ (default), CFQ-IDLE; SPLIT-A (priority), SPLIT-D (deadline) [SOSP’15]; QASIO [FAST’15]; RCP
  • 47. Application Throughput • PostgreSQL w/ TPC-C workload [Figure: transaction throughput (trx/sec) for 10GB, 60GB, and 200GB datasets; schemes: CFQ, CFQ-IDLE]
  • 48. Application Throughput • PostgreSQL w/ TPC-C workload [Figure: same axes; adds SPLIT-A]
  • 49. Application Throughput • PostgreSQL w/ TPC-C workload [Figure: same axes; adds SPLIT-D]
  • 50. Application Throughput • PostgreSQL w/ TPC-C workload [Figure: same axes; adds QASIO]
  • 51. Application Throughput • PostgreSQL w/ TPC-C workload [Figure: same axes; adds RCP; annotations: 37%, 31%, 28%]
  • 52. Application Throughput • Impact on background task [Figure: transaction log size (GB) vs. elapsed time (sec) for CFQ, CFQ-IDLE, and QASIO]
  • 53. Application Throughput • Impact on background task [Figure: same axes; adds SPLIT-A and SPLIT-D]
  • 54. Application Throughput • Impact on background task [Figure: same axes; adds RCP] • Our scheme improves application throughput w/o penalizing background tasks
  • 55. Application Latency • PostgreSQL w/ TPC-C workload [Figure: CCDF P[X >= x] on a log scale from 10^0 down to 10^-5 (0th, 90th, 99th, 99.9th, 99.99th, 99.999th percentiles) vs. transaction latency (msec) for CFQ, CFQ-IDLE, SPLIT-A, SPLIT-D, and QASIO; annotation: over 2 sec at 99.9th]
  • 56. Application Latency • PostgreSQL w/ TPC-C workload [Figure: same axes; adds RCP; annotations: over 2 sec at 99.9th; 300 msec at 99.999th] • Our scheme is effective for improving tail latency
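
The latency plots use a complementary CDF, so a percentile p on the right-hand axis corresponds to a CCDF value of 1 - p/100 on the left-hand axis. A minimal sketch of that relationship follows; the function names and sample values are made up for illustration and are not taken from the evaluation.

```python
from typing import Sequence


def ccdf(samples: Sequence[float], x: float) -> float:
    """Empirical P[X >= x]: the fraction of samples at or above x."""
    return sum(1 for s in samples if s >= x) / len(samples)


def percentile_to_ccdf(p: float) -> float:
    """CCDF level that corresponds to the p-th percentile (99.9 -> about 1e-3)."""
    return 1.0 - p / 100.0


# Hypothetical latencies in msec: half of them are 3 msec or slower.
latencies = [1.0, 2.0, 3.0, 4.0]
half = ccdf(latencies, 3.0)
```

Read this way, an annotation such as "over 2 sec at 99.9th" marks the latency at which a curve crosses the 10^-3 line.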
  • 57. Summary of Other Results • Performance results • MongoDB: 12%-201% throughput, 5x-20x latency at 99.9th • Redis: 7%-49% throughput, 2x-20x latency at 99.9th • Analysis results • System latency analysis using LatencyTOP • System throughput vs. application latency • Need for holistic approach
  • 58. Conclusions • Key observation: all the layers in the I/O path should be considered as a whole with I/O priority inversion in mind for effective I/O prioritization • Request-centric I/O prioritization: enlightens the I/O path solely for application performance; improves throughput and latency of real applications • Ongoing work: practicalizing implementation; applying RCP to database cluster with multiple replicas
  • 59. Thank You! • Questions and comments • Contact: sangwook@apposha.io