Dynamic filtering for
Presto join optimisation
Roman Zeyde
Presto Conference Israel 2019
Agenda
Existing join optimization techniques
Dynamic filtering description
Implementation details
Performance analysis
Existing join optimization techniques
Happen during planning phase:
• Join reordering
• Join distribution type (distributed vs. broadcast)
Depend on the cost-based optimizer (which needs column statistics):
• The optimizer should be enabled via session parameters
• Statistics can be collected using the ANALYZE statement
Example: join reordering
SELECT * FROM items JOIN sales ON sales.item_id = items.id;
Prefer keeping the "smaller" table on the right-hand side of the join:
[Figure: two alternative plans for Join (item_id=id), with Scan sales and Scan items as its inputs in opposite orders]
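Why the right-hand side matters can be shown with a toy hash join. This is a minimal sketch, not Presto code: it assumes the right-hand (build) side is loaded fully into an in-memory hash table while the left-hand (probe) side is streamed. The helper `hash_join` and the sample rows are hypothetical.

```python
from collections import defaultdict

def hash_join(probe_rows, build_rows, probe_key, build_key):
    """Join two lists of dicts; the build side is buffered in a hash table."""
    table = defaultdict(list)
    buffered = 0
    for row in build_rows:               # memory grows with the build (right-hand) side
        table[row[build_key]].append(row)
        buffered += 1
    joined = []
    for row in probe_rows:               # the probe (left-hand) side is streamed
        for match in table.get(row[probe_key], []):
            joined.append({**row, **match})
    return joined, buffered

items = [{"id": 1, "price": 1500}, {"id": 2, "price": 20}]
sales = [{"item_id": 1, "qty": 3}, {"item_id": 1, "qty": 1},
         {"item_id": 2, "qty": 7}, {"item_id": 1, "qty": 2}]

# Smaller table (items) on the right-hand side: only 2 rows are buffered.
rows_a, buffered_a = hash_join(sales, items, "item_id", "id")
# Larger table (sales) on the right-hand side: all 4 sales rows are buffered.
rows_b, buffered_b = hash_join(items, sales, "id", "item_id")

print(len(rows_a), buffered_a)
print(len(rows_b), buffered_b)
```

Both orders return the same number of joined rows; only the amount of buffered build-side data differs.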
Example: broadcast join
If the right-hand side table is "small", it can be replicated to
all join workers, saving the CPU and network cost of
left-hand side repartitioning:
[Figure: left-hand side and right-hand side feeding the join workers (broadcast join)]
Example: distributed join
Otherwise, both tables are repartitioned using the join key,
allowing joins with larger right-hand side tables:
[Figure: left-hand side and right-hand side feeding the join workers (distributed join)]
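The two distribution types can be compared with a small simulation. This is a sketch under simplifying assumptions: rows are (key, payload) pairs, every row sent to a worker counts as one unit of network cost, and the left-hand side of a broadcast join stays where its splits already are. The function names and the `rows_sent` metric are hypothetical, not Presto internals.

```python
def local_join(left, right):
    """Join (key, payload) pairs held by a single worker."""
    lookup = {}
    for key, payload in right:
        lookup.setdefault(key, []).append(payload)
    return [(key, lp, rp) for key, lp in left for rp in lookup.get(key, [])]

def broadcast_join(left, right, workers):
    # Left side is not repartitioned (round-robin chunks stand in for its splits);
    # the whole right side is copied to every worker.
    chunks = [left[w::workers] for w in range(workers)]
    rows_sent = len(right) * workers
    out = [row for chunk in chunks for row in local_join(chunk, right)]
    return sorted(out), rows_sent

def distributed_join(left, right, workers):
    # Both sides are repartitioned by hash(key), so matching keys meet on one worker.
    def partition(rows):
        parts = [[] for _ in range(workers)]
        for key, payload in rows:
            parts[hash(key) % workers].append((key, payload))
        return parts
    lparts, rparts = partition(left), partition(right)
    rows_sent = len(left) + len(right)
    out = [row for lp, rp in zip(lparts, rparts) for row in local_join(lp, rp)]
    return sorted(out), rows_sent

sales = [(i % 5, f"sale{i}") for i in range(1000)]   # (item_id, payload)
items = [(i, f"item{i}") for i in range(5)]          # (id, payload)

b_rows, b_sent = broadcast_join(sales, items, workers=4)
d_rows, d_sent = distributed_join(sales, items, workers=4)
print(b_sent, d_sent)
```

With a tiny right-hand side, broadcasting moves far fewer rows; as the right-hand side grows, copying it to every worker becomes the more expensive option.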
Dynamic filtering - introduction
Consider the following query:
SELECT * FROM sales JOIN items
ON sales.item_id = items.id
WHERE items.price > 1000;
Assumptions:
• The sales table is large
• The items scan returns only a few rows (due to predicate pushdown)
Most of the scanned sales rows will be discarded during the join
(i.e. the join is highly selective).
How can we optimize this use-case?
[Figure: Join (item_id=id) over Scan sales and Scan items [price>1000]]
Dynamic filtering - description
1. Collect relevant id values during items scan
2. Construct dynamic filter F using the collected ids
3. Apply dynamic predicate pushdown using F to the sales scan
Benefits:
• Connector may optimize the scan given F
• Most sales rows are not touched by Presto
• CPU & network savings for large tables
Requirements:
• F cannot be too large (memory-wise)
• F needs to "back propagate" into the sales scan at runtime
[Figure: Join (item_id=id) over Scan items [price>1000] and Scan sales [item_id∈F], annotated with steps (1), (2) "Construct dynamic filter F" and (3)]
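The three steps above can be sketched in a few lines. This is an illustrative model, not the Presto implementation: `scan` stands in for a connector scan that accepts a pushed-down predicate, and `max_filter_size` is a hypothetical limit representing the requirement that F must not grow too large.

```python
def scan(table, predicate=lambda row: True):
    """Stand-in for a connector scan that can apply a pushed-down predicate."""
    return [row for row in table if predicate(row)]

def join_with_dynamic_filter(sales, items, max_filter_size=1000):
    # (1) Collect the ids that survive the items scan.
    build = scan(items, lambda r: r["price"] > 1000)
    ids = {r["id"] for r in build}
    # (2) Construct F, unless it would be too large to keep in memory.
    F = ids if len(ids) <= max_filter_size else None
    # (3) Push F into the sales scan; without F, fall back to a full scan.
    if F is not None:
        probe = scan(sales, lambda r: r["item_id"] in F)
    else:
        probe = scan(sales)
    by_id = {r["id"]: r for r in build}
    joined = [{**s, **by_id[s["item_id"]]} for s in probe if s["item_id"] in by_id]
    return joined, len(probe)

# Only ids 0 and 100 cost more than 1000.
items = [{"id": i, "price": 1500 if i % 100 == 0 else 20} for i in range(200)]
sales = [{"item_id": i % 200, "qty": 1} for i in range(10_000)]

with_df, scanned_with = join_with_dynamic_filter(sales, items)
without_df, scanned_without = join_with_dynamic_filter(sales, items, max_filter_size=0)
print(scanned_with, scanned_without)
```

The join result is identical in both runs; the difference is how many sales rows leave the scan and reach the join.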
Implementation details - Qubole et al.
Supports both distributed and broadcast joins, but
requires significant changes in Presto:
• Add plan nodes and optimizer rules for dynamic
filter collection and application
• Add a coordinator REST endpoint for collecting dynamic
filters from worker nodes
• Allow connectors to prune partitions during split
generation (when dynamic filter is ready)
More details can be found here:
qubole.com/blog/sql-join-optimizations-qubole-presto
(https://docs.google.com/document/d/1TOlxS8ZAXSIHR5ftHbPsgUkuUA-ky-odwmPdZARJrUQ)
Implementation - Varada
When a broadcast join is used, sales'
ScanFilterAndProject and items' HashBuilder
operators run in the same process:
• Add a "pass-through" operator to collect
build-side ids.
• When ready, push down the resulting
predicate F into the sales page source.
No changes needed in the planner, optimizer or
coordinator!
Implemented as a patch on top of
github.com/prestosql/presto (currently
work-in-progress).
[Figure: operator diagram showing ScanFilterAndProject sales [item_id∈F], ScanFilterAndProject items [price>1000], two Exchange operators, Collect (F:=F∪{id}), HashBuilder [id], LookupJoin [item_id=id] and TaskOutput]
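The "pass-through" idea can be modelled with Python generators. This is a rough sketch and not the Varada patch: the class `DynamicFilter` and the functions `collect`, `hash_builder` and `probe_page_source` are hypothetical stand-ins for the Collect, HashBuilder and page-source roles, and pages are plain lists of dicts.

```python
class DynamicFilter:
    """Set of build-side ids; the probe side may use it only once it is complete."""
    def __init__(self):
        self.ids = set()
        self.ready = False

def collect(pages, dynamic_filter, column="id"):
    """Pass-through operator: forwards each page unchanged while recording its ids."""
    for page in pages:
        dynamic_filter.ids.update(row[column] for row in page)
        yield page
    dynamic_filter.ready = True          # build side exhausted, so F is final

def hash_builder(pages, column="id"):
    """Consume build-side pages into a lookup table keyed by the join column."""
    table = {}
    for page in pages:
        for row in page:
            table.setdefault(row[column], []).append(row)
    return table

def probe_page_source(pages, dynamic_filter, column="item_id"):
    """Probe-side scan: drops rows outside F once F is ready, else reads everything."""
    for page in pages:
        if dynamic_filter.ready:
            page = [row for row in page if row[column] in dynamic_filter.ids]
        yield page

items_pages = [[{"id": 7, "price": 1200}], [{"id": 9, "price": 5000}]]
sales_pages = [[{"item_id": 7}, {"item_id": 1}], [{"item_id": 2}, {"item_id": 9}]]

F = DynamicFilter()
lookup = hash_builder(collect(items_pages, F))       # build side runs first
probe_rows = [row for page in probe_page_source(sales_pages, F) for row in page]
joined = [{**row, **match} for row in probe_rows for match in lookup[row["item_id"]]]
print(joined)
```

Because the collecting step only observes pages on their way to the hash builder, nothing upstream of it has to change, which mirrors the slide's point that the planner, optimizer and coordinator stay untouched.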
Performance analysis - benchmark
Consider the following query (based on TPC-DS sf10000 dataset):
SELECT ss_item_sk FROM store_sales JOIN customer
ON ss_customer_sk = c_customer_sk
WHERE c_customer_id = 'AAAAAAAAMCOOKLCA';
• store_sales contains 27.7B rows
• customer contains 65M rows
• Query result contains 334 rows
Performance analysis - results
Tested on a Varada cluster (with CBO enabled):
                             Regular join   Dynamic filtering   Improvement
Execution time               25 sec         0.9 sec             27x faster
CPU time                     57.4 min       7.8 sec             440x lower
Peak total memory            261 MB         2.2 MB              118x lower
Data read (from connector)   258 GB         3.3 kB              78Mx lower
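The improvement factors can be re-derived from the measured values in the results. The sketch below assumes decimal units (1 GB = 10^9 bytes, 1 kB = 10^3 bytes) and converts CPU time to seconds; the dictionary keys are hypothetical labels.

```python
# Measured values from the results, normalised to common units.
regular = {"execution_s": 25.0, "cpu_s": 57.4 * 60, "peak_mem_mb": 261.0, "read_bytes": 258e9}
dynamic = {"execution_s": 0.9, "cpu_s": 7.8, "peak_mem_mb": 2.2, "read_bytes": 3.3e3}

# Improvement factor = regular join value / dynamic filtering value.
ratios = {name: regular[name] / dynamic[name] for name in regular}

for name, ratio in ratios.items():
    print(f"{name}: {ratio:,.1f}x")
```

The computed ratios agree with the rounded factors reported in the results (roughly 27x, 440x, 118x and 78 million x).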
Up next in Presto improvements
• Distributed Joins - extend dynamic filtering
• Aggregation Pushdown
• Coordinator HA
Thank you!

Dynamic filtering for presto join optimisation

  • 1. Dynamic filtering for Presto join optimisation Roman Zeyde Presto Conference Israel 2019
  • 2. Agenda Existing join optimization techniques Dynamic filtering description Implementation details Performance analysis
  • 3. Existing join optimization techniques Happen during planning phase: • Join reordering • Join distribution type (distributed vs. broadcast) Depend on cost-based optimizer (need column statistics) • Should be enabled via session parameters • Can be estimated using ANALYZE statement
  • 4. Example: join reordering SELECT * FROM items JOIN sales ON sales.item_id = items.id; Prefer keeping the "smaller" table on the right-hand side of the join: Join (item_id=id) Join (item_id=id) Scan sales Scan items Scan items Scan sales
  • 5. Example: broadcast join If the right-hand side table is "small", it can be replicated to all join workers - saving the CPU and network cost of left- hand side repartitioning: Join worker Join worker Join workerLeft-hand side Right-hand side
  • 6. Join worker Join worker Example: distributed join Otherwise, both tables are repartitioned using the join key, allowing joins with larger right-hand side tables: Join workerLeft-hand side Right-hand side
  • 7. Dynamic filtering - introduction Consider the following query: SELECT * FROM sales JOIN items ON sales.item_id = items.id WHERE items.price > 1000; Assumptions: ● sales table is large ● items scan results in a few rows (due to predicate pushdown) Most of the scanned sales rows will be discarded during the join (i.e. high selectivity). How can we optimize this use-case? Join (item_id=id) Scan sales Scan items [price>1000]
  • 8. Dynamic filtering - description. 1. Collect relevant id values during items scan 2. Construct dynamic filter F using the collected ids 3. Apply dynamic predicate pushdown using F to sales scan. Benefits: • Connector may optimize the scan given F • Most sales rows are not touched by Presto • CPU & network savings for large tables. Requirements: • F cannot be too large (memory-wise) • F needs to "back propagate" into the sales scan at runtime. [Plan diagram: Join (item_id=id) over Scan items [price>1000] and Scan sales [item_id∈F], annotated with steps (1), (2) "Construct dynamic filter F" and (3)]
  • 9. Implementation details - Qubole et al. Supports both distributed and broadcast joins, but requires significant changes in Presto: • Add plan nodes and optimizer rules for dynamic filter collection and application • New coordinator REST endpoint for dynamic filter collection from worker nodes. • Allow connectors to prune partitions during split generation (when dynamic filter is ready) More details can be found here: qubole.com/blog/sql-join-optimizations-qubole-presto (https://docs.google.com/document/d/1TOlxS8ZAXSIHR5ftHbPsgUkuUA-ky-odwmPdZARJrUQ)
  • 10. Implementation - Varada. When broadcast join is used, sales' ScanFilterAndProject and items' HashBuilder operators run in the same process: • Add a "pass-through" operator to collect build-side ids. • When ready, push down the resulting predicate F into the sales page source. No changes needed in the planner, optimizer and coordinator! Implemented as a patch on top of github.com/prestosql/presto (currently work-in-progress). [Diagram operators: ScanFilterAndProject sales [item_id∈F]; ScanFilterAndProject items [price>1000]; Exchange; Exchange; Collect F:=F∪{id}; LookupJoin [item_id=id]; HashBuilder [id]; TaskOutput]
  • 11. Performance analysis - benchmark Consider the following query (based on TPC-DS sf10000 dataset): SELECT ss_item_sk FROM store_sales JOIN customer ON ss_customer_sk = c_customer_sk WHERE c_customer_id = 'AAAAAAAAMCOOKLCA'; • store_sales contains 27.7B rows • customer contains 65M rows • Query result contains 334 rows
  • 12. Performance analysis - results. Tested on Varada cluster (with CBO enabled), regular join vs. dynamic filtering: • Execution time: 25 sec vs. 0.9 sec (x27 faster) • CPU time: 57.4 min vs. 7.8 sec (x440 lower) • Peak total memory: 261 MB vs. 2.2 MB (x118 lower) • Data read (from connector): 258 GB vs. 3.3 kB (x78M lower)
  • 13. Up next in Presto improvements • Distributed Joins - extend dynamic filtering • Aggregation Pushdown • Coordinator HA
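
The three steps on slide 8 can be illustrated with a minimal Python sketch. It is a toy model of the idea only, not Presto code: the function names, the in-memory row format and the `max_size` cutoff (standing in for the "F cannot be too large" requirement) are all assumptions made for illustration.

```python
# Toy model of dynamic filtering for the query on slide 7:
#   SELECT * FROM sales JOIN items ON sales.item_id = items.id
#   WHERE items.price > 1000
# All names and the max_size cutoff are illustrative, not Presto APIs.

def scan_items(items, min_price):
    """Build-side scan with the static predicate price > min_price."""
    return [row for row in items if row["price"] > min_price]

def build_dynamic_filter(build_rows, key, max_size):
    """Steps 1-2: collect join-key values into F; return None if F gets too large."""
    values = set()
    for row in build_rows:
        values.add(row[key])
        if len(values) > max_size:
            return None  # fall back to a regular, unfiltered probe scan
    return values

def scan_sales(sales, dynamic_filter):
    """Step 3: probe-side scan that skips rows whose item_id is not in F."""
    if dynamic_filter is None:
        return list(sales)
    return [row for row in sales if row["item_id"] in dynamic_filter]

def hash_join(probe_rows, build_rows):
    """Plain hash join: build a table on items.id, probe with sales.item_id."""
    table = {}
    for b in build_rows:
        table.setdefault(b["id"], []).append(b)
    return [{**p, **b} for p in probe_rows for b in table.get(p["item_id"], [])]

items = [{"id": 1, "price": 5}, {"id": 2, "price": 2000}, {"id": 3, "price": 40}]
sales = [{"sale_id": n, "item_id": 1 + n % 3} for n in range(9)]

build = scan_items(items, 1000)
f = build_dynamic_filter(build, "id", max_size=1000)
filtered = scan_sales(sales, f)        # only rows that can match reach the join
unfiltered = scan_sales(sales, None)   # regular join input, for comparison
```

In this model the join output is the same with or without F; the filter only reduces how many sales rows are read and passed to the join, which is where the CPU and network savings would come from.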

Editor's notes

  1. CBO is supported today by the Presto Hive connector (using Hive statistics).
  2. Since hash-join requires reading the right-hand side table into memory, we would like to estimate the expected sizes of the join inputs and reorder the join accordingly. It can be done manually, or automatically (using CBO) via connector-provided statistics.
  3. Broadcast join optimization saves the network cost of LHS repartitioning at the expense of RHS replication. It can be set manually, or via CBO (by enumerating the possible join types and choosing the one with the lowest cost).
  4. If we knew the item IDs during the planning, we could use predicate pushdown to propagate them into the connector.
  5. So instead, we need to construct the predicate at runtime (before starting the LHS scan). We reuse the existing predicate pushdown mechanism, which allows us to skip most of the LHS table (this can be done efficiently in our case).
  6. The coordination problem is much simpler in this case.
  7. Note: the join key values are not known during planning, so regular predicate pushdown doesn't work. Only a single customer matches on the RHS, so the join is highly selective.
  8. Same query, same hardware - without / with dynamic filtering. These results show that dynamic filtering may significantly improve the performance of highly-selective queries, by making relatively small changes in Presto.
  9. We are planning to continue the work on dynamic filtering, as well as adding support for aggregation pushdown and coordinator high-availability.
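
Note 3 describes choosing the join distribution type by enumerating the options and picking the cheapest. The sketch below illustrates that trade-off with a deliberately simplified cost model that counts only rows sent over the network. It is an assumption for illustration and not Presto's actual cost model, which also has to account for CPU and for the memory needed to hold the RHS on each worker; the function names and the worker count are hypothetical.

```python
# Simplified network-cost comparison between broadcast and distributed joins.
# Assumption: cost = number of rows sent over the network (not Presto's CBO).

def broadcast_cost(lhs_rows, rhs_rows, workers):
    """RHS is replicated to every join worker; LHS is not repartitioned."""
    return rhs_rows * workers

def distributed_cost(lhs_rows, rhs_rows, workers):
    """Both sides are repartitioned on the join key, so each row is sent once."""
    return lhs_rows + rhs_rows

def choose_join_distribution(lhs_rows, rhs_rows, workers):
    """Enumerate both join types and return the name of the cheaper one."""
    costs = {
        "broadcast": broadcast_cost(lhs_rows, rhs_rows, workers),
        "distributed": distributed_cost(lhs_rows, rhs_rows, workers),
    }
    return min(costs, key=costs.get)

# Large LHS, tiny RHS (e.g. after a selective predicate): replicate the RHS.
small_rhs = choose_join_distribution(lhs_rows=1_000_000, rhs_rows=10, workers=10)

# Two inputs of similar size: replicating the RHS to 10 workers costs more.
similar = choose_join_distribution(lhs_rows=1_000, rhs_rows=1_000, workers=10)
```

Under this model broadcast wins whenever the RHS row count times the number of workers is smaller than the combined size of both inputs, which matches the slides' advice to broadcast only "small" right-hand side tables.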