Grizzly: Efficient Stream Processing
Through Adaptive Query Compilation
Philipp M. Grulich¹, Sebastian Breß², Steffen Zeuch¹², Jonas Traub¹,
Janis von Bleichert¹, Zongxiong Chen², Tilmann Rabl³, Volker Markl¹²
Technische Universität Berlin¹, DFKI GmbH², HPI & Universität Potsdam³
SIGMOD 2020, Grizzly: Efficient Stream Processing Through Adaptive Query Compilation, Grulich et al.
Limitations of state-of-the-art SPEs
Current stream processing engines (SPEs) use hardware resources inefficiently [Zeuch et al., Zhang et al.]:
1. The interpretation-based processing model causes poor cache utilization.
2. Upfront partitioning causes high overhead on single nodes.
3. SPEs do not react to changing data characteristics at runtime.
An SPE should be hardware- and data-conscious.
Our Proposal
Grizzly: Efficient Stream Processing Through Adaptive Query Compilation
Grizzly’s Core Principles
● Query Compilation for Stream Processing
● Order-Preserving Task-based Parallelization
● Continuous Adaptive Optimizations
Query Compilation for Stream Processing
● Fuses operators into compact code blocks.
● Supports operators that are unique to stream processing.
Query Compilation
From(user_purchases)
.filter(origin == 'Germany')
.keyBy(userid)
.windowBy(TumblingWindow(days(7)), Max(price).as(max_price))
.filter(max_price > 42)
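To illustrate what operator fusion means for this query, the following is a minimal Python sketch of a single loop that performs the filter, keying, tumbling-window maximum, and final filter without separate operator objects. It is not Grizzly's generated code. The `Purchase` record, the `ts_days` timestamp field, and the assumption that records arrive in timestamp order are all hypothetical choices made for this example.

```python
from collections import namedtuple

# Hypothetical record layout; ts_days is the event time in whole days.
Purchase = namedtuple("Purchase", ["ts_days", "origin", "userid", "price"])

WINDOW_DAYS = 7  # TumblingWindow(days(7))


def fused_pipeline(purchases):
    """Run filter -> keyBy -> tumbling-window max -> filter in one loop.

    Assumes purchases are ordered by ts_days (an assumption of this sketch).
    Returns a list of (window_start, userid, max_price) tuples.
    """
    results = []
    window_start = None
    max_price = {}  # userid -> maximum price seen in the current window

    def trigger(start):
        # Emit the window result and apply the second filter inline.
        for userid, price in sorted(max_price.items()):
            if price > 42:
                results.append((start, userid, price))
        max_price.clear()

    for p in purchases:
        start = (p.ts_days // WINDOW_DAYS) * WINDOW_DAYS
        if window_start is None:
            window_start = start
        elif start != window_start:
            trigger(window_start)  # time moved on: close the old window
            window_start = start
        if p.origin != "Germany":  # first filter
            continue
        current = max_price.get(p.userid)  # keyBy + Max(price)
        if current is None or p.price > current:
            max_price[p.userid] = p.price

    if window_start is not None:
        trigger(window_start)  # flush the last open window
    return results
```

Because all steps share one loop, each record is touched once and no intermediate collections are passed between operators, which is the effect the slide attributes to fusing operators.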
Open question for query compilation: how to support the combination of window assignment, function, and trigger?
Order-Preserving Task-based Parallelization
● Concurrent execution on a global state.
● Supports the ordering requirements of stream processing.
● Exploits the NUMA configuration.
Task-based Parallelization
● Input stream is processed in small batches (sized to network buffer).
● Pipelines are executed concurrently on a shared state.
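The two points above can be sketched as follows: the input is cut into small batches, each batch becomes a task, and all tasks update one shared aggregation state. This is a simplified Python illustration under stated assumptions, not Grizzly's implementation. The function names, the batch size, and the use of a `Lock` are choices of this sketch; the talk describes lock-free window processing, which the lock here only stands in for.

```python
from concurrent.futures import ThreadPoolExecutor
from threading import Lock


def split_into_batches(records, batch_size):
    """Cut the input into fixed-size batches (stand-in for network buffers)."""
    return [records[i:i + batch_size] for i in range(0, len(records), batch_size)]


def run_tasks(records, batch_size=4, workers=3):
    """Process batches as concurrent tasks that update one shared state.

    records is a list of (key, value) pairs; the shared state keeps the
    maximum value per key.
    """
    shared_max = {}
    lock = Lock()  # assumption of this sketch; not a lock-free structure

    def pipeline(batch):
        for key, value in batch:
            with lock:
                current = shared_max.get(key)
                if current is None or value > current:
                    shared_max[key] = value

    with ThreadPoolExecutor(max_workers=workers) as pool:
        # list(...) forces completion and re-raises any worker exception.
        list(pool.map(pipeline, split_into_batches(records, batch_size)))
    return shared_max
```

Unlike upfront partitioning, no key is pinned to a particular thread: any worker may process any batch, and correctness comes from coordinating access to the shared state.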
Lock-Free Window Processing
● Allows threads to process windows concurrently.
● Lightweight coordination for window triggering.
NUMA-awareness
● Pre-aggregates window results locally to minimize inter-NUMA-node communication.
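The NUMA-aware pre-aggregation idea can be sketched in a few lines: each node first aggregates its records into a local partial result, and only the partial results are merged into the global state. The Python below only simulates NUMA nodes with dictionary keys, and the function names and the update counter are hypothetical. It relies on the maximum being a decomposable aggregate, so merging partial maxima gives the same result as aggregating all records directly.

```python
def merge_max(target, source):
    """Merge per-key maxima from source into target."""
    for key, value in source.items():
        current = target.get(key)
        if current is None or value > current:
            target[key] = value


def aggregate_with_local_preaggregation(records_per_node):
    """Aggregate per node first, then merge once into the global state.

    records_per_node maps a (simulated) NUMA node id to its list of
    (key, value) records. Returns the global per-key maxima and the number
    of updates that had to cross into the global state.
    """
    global_state = {}
    global_updates = 0
    for node_id, records in records_per_node.items():
        local_state = {}  # would live in node-local memory
        for key, value in records:
            merge_max(local_state, {key: value})
        merge_max(global_state, local_state)  # one merge per node
        global_updates += len(local_state)
    return global_state, global_updates
```

In this sketch the number of global updates is bounded by the number of distinct keys per node rather than by the number of records, which is the reduction in cross-node traffic the slide refers to.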
Continuous Adaptive Optimizations
● Feedback loop between code generation and query execution.
● Lightweight monitoring at runtime.
Adaptive Re-Optimization
Generic Execution:
● Without data-dependent optimizations.
Instrumented Execution:
● Injects profiling code to collect statistics (predicate selectivity, value distribution).
Specialized Execution:
● Specializes operator implementations (predication, fixed hash tables).
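The three execution stages can be illustrated with a filtered sum: a generic variant with a branch, an instrumented variant that also records predicate selectivity, and a specialized variant that uses predication. This is a hedged Python sketch, not the code Grizzly generates. The function names and the selectivity thresholds `low` and `high` are assumptions made for the example.

```python
def generic_sum(values, threshold):
    """Generic variant: branching filter, no assumptions about the data."""
    total = 0
    for v in values:
        if v > threshold:
            total += v
    return total


def instrumented_sum(values, threshold, stats):
    """Instrumented variant: same result, plus selectivity counters."""
    total = 0
    for v in values:
        stats["seen"] += 1
        if v > threshold:
            stats["passed"] += 1
            total += v
    return total


def predicated_sum(values, threshold):
    """Specialized variant: predication replaces the branch with arithmetic."""
    total = 0
    for v in values:
        total += v * (v > threshold)  # the comparison yields 0 or 1
    return total


def choose_variant(stats, low=0.2, high=0.8):
    """Pick predication when selectivity is in the hard-to-predict middle.

    The thresholds are hypothetical values chosen for this sketch.
    """
    selectivity = stats["passed"] / stats["seen"] if stats["seen"] else 0.0
    return predicated_sum if low <= selectivity <= high else generic_sum
```

A typical use would run `instrumented_sum` on a few batches, pass the collected `stats` to `choose_variant`, and then switch to the returned function for subsequent batches.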
Adaptive Optimization
Deoptimization:
● Migrates from optimized to less optimized execution.
● Caused by violated assumptions or changed data characteristics.
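Deoptimization can be sketched with a keyed maximum that starts in a specialized, array-backed form under an assumed key range and falls back to a generic dictionary once a key violates that assumption. The class below is a hypothetical Python illustration: its name, the integer-key assumption, and the state migration strategy are choices of this sketch rather than details taken from the talk.

```python
class AdaptiveMaxAggregator:
    """Keyed max that deoptimizes from a fixed array to a dict on demand."""

    def __init__(self, assumed_key_range):
        # Specialized state: assumes integer keys in [0, assumed_key_range).
        self.assumed_key_range = assumed_key_range
        self.array_state = [None] * assumed_key_range
        self.dict_state = None  # generic state, created on deoptimization

    @property
    def specialized(self):
        return self.dict_state is None

    def _deoptimize(self):
        # Migrate the existing state to the less optimized representation.
        self.dict_state = {
            k: v for k, v in enumerate(self.array_state) if v is not None
        }
        self.array_state = None

    def update(self, key, value):
        if self.specialized:
            if 0 <= key < self.assumed_key_range:  # guard for the assumption
                current = self.array_state[key]
                if current is None or value > current:
                    self.array_state[key] = value
                return
            self._deoptimize()  # assumption violated: fall back
        current = self.dict_state.get(key)
        if current is None or value > current:
            self.dict_state[key] = value

    def result(self):
        if self.specialized:
            return {k: v for k, v in enumerate(self.array_state) if v is not None}
        return dict(self.dict_state)
```

The guard is a single range check on the fast path, and results collected before the fallback are preserved because the state is migrated rather than discarded.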
Evaluation
Evaluation: System Comparison
Grizzly outperforms state-of-the-art SPEs by up to 10x.
Evaluation: Workloads
Code generation is beneficial for a wide range of workloads.
Evaluation: Adaptive Optimizations
Adaptive optimizations are crucial to reach peak performance.
Summary
www.nebula.stream
@NebulaStream
Grizzly:
● Query compilation for stream processing.
● Task-based parallelization that takes ordering requirements into account.
● Adaptive optimization to react to changing data characteristics.
[Figure-only slides: Query Compilation; System Architecture]
