Presto, an open source distributed SQL engine, is widely recognized for its low-latency queries, high concurrency, and native ability to query multiple data sources.
7. Starburst Enterprise Presto
Performance:
▪ From petabytes to exabytes: query data from disparate sources using SQL, with high concurrency
▪ Control your price/performance with the latest cost-based optimizer
▪ Caching available for frequently accessed data
Connectivity:
▪ 30+ supported enterprise connectors
▪ High performance parallel connectors for Oracle, Teradata, Snowflake and more
Security:
▪ Kerberos & LDAP integration
▪ Global Security for fine-grained Access Control
▪ Data encryption
▪ Data masking
▪ Query auditing
Management:
▪ Configuration
▪ Autoscaling
▪ High availability
▪ Monitoring
▪ Deploy anywhere
Support:
▪ The largest team of Presto experts in the world
▪ Fully-tested, stable releases, curated by the Presto creators
▪ Hot fixes & security patches
▪ 24x7 support, 365 days a year: we’ve got your back
9. Why are we excited about Delta?
▪ ACID properties over data lake
▪ Open source table format
▪ Stored as Parquet files
▪ Object storage support
▪ Schema evolution
▪ Time travel feature
▪ Metadata & statistics
▪ Data skipping & z-ordering
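To make the last bullet more concrete, here is a minimal sketch of the idea behind z-ordering: the bits of two column values are interleaved into a single sort key, so rows that are close in both columns tend to land in the same files and per-file min/max ranges stay narrow. The function names and the 16-bit width are illustrative assumptions; this is not Delta Lake's actual implementation.

```python
def z_order_key(x: int, y: int, bits: int = 16) -> int:
    """Interleave the low `bits` bits of x and y into one Morton (z-order) key."""
    key = 0
    for i in range(bits):
        key |= ((x >> i) & 1) << (2 * i)        # bits of x go to even positions
        key |= ((y >> i) & 1) << (2 * i + 1)    # bits of y go to odd positions
    return key


def z_order_sort(rows):
    """Sort (x, y) pairs by their z-order key (hypothetical helper)."""
    return sorted(rows, key=lambda r: z_order_key(r[0], r[1]))
```

Sorting by this key before writing files is what lets a later range filter on either column skip most files.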
10. Native Presto Delta Lake Reader
▪ Supports data skipping
▪ Optimizes query using file statistics
▪ Supports reading the Delta transaction log
▪ Native connector written from scratch
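The points above can be sketched in a few lines. This is an illustration only, not the Starburst connector's code: it assumes each commit is available as a string of newline-delimited JSON actions (the layout of a Delta `_delta_log` commit file), where `add` actions may carry a JSON-encoded `stats` string with `minValues` and `maxValues`. The function names `replay_log` and `files_to_scan` are hypothetical.

```python
import json


def replay_log(commits):
    """Replay commits in order and return {path: stats} for the live files."""
    live = {}
    for commit in commits:
        for line in commit.splitlines():
            if not line.strip():
                continue
            action = json.loads(line)
            if "add" in action:
                add = action["add"]
                # `stats` is itself a JSON-encoded string and may be absent.
                stats = json.loads(add["stats"]) if add.get("stats") else None
                live[add["path"]] = stats
            elif "remove" in action:
                live.pop(action["remove"]["path"], None)
    return live


def files_to_scan(live, column, lo, hi):
    """Keep only files whose [min, max] range for `column` overlaps [lo, hi]."""
    selected = []
    for path, stats in live.items():
        if stats is None:
            selected.append(path)  # no statistics: the file cannot be skipped
            continue
        fmin = stats.get("minValues", {}).get(column)
        fmax = stats.get("maxValues", {}).get(column)
        if fmin is None or fmax is None or (fmin <= hi and fmax >= lo):
            selected.append(path)
    return selected
```

Replaying only a prefix of the commits, for example `replay_log(commits[:n])`, gives the table as of an earlier version, which is the idea behind time travel.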
11. Native Delta Lake Reader Performance
Standard TPC-H benchmark:
▪ 2x average speedup across 22 queries
▪ 6x best query speedup
Feedback from customers:
▪ “What we have here is game changing for our industry. Especially now that the native Delta reader works as fast as it does. We have people lining up to now use this data”
▪ “We have queries that were running in 10 minutes that are now running in 47 seconds”
Try now: https://docs.starburstdata.com/latest/connector/starburst-delta-lake.html
13. Starburst Platform
Data Scientists, Data Analysts, Finance, Marketers
The Data Consumption Layer
Existing analytics tools
Data Masking, Global Security, Column + Row-level permissions, Query Auditing, Fine-grained access control, Data Encryption
Data Lakes, Relational Databases, NoSQL Stores, Publish/Subscribe
Azure Event Hub
14. Different Technologies In Your Toolbelt
ETL
SQL
Streaming Ingestion
Machine Learning
Delta Lake
Management
High Concurrency SQL
BI Reporting/Analytics
Federated Queries
Your Storage
18. Starburst & Delta Lake – Use Case
Using a combination of Databricks and Starburst
Presto to bring a full data ingestion and analytical
environment to life
19. Starburst & Delta Lake – Use Case
● Real-time ingestion of event data
into Delta tables
● Customer and inventory data
ingested every hour
● Modified customer information
merged into Delta Lake table
● Data marts created using streaming
and batch data
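The slides do not show the merge step itself. As a rough illustration of what "merged into Delta Lake table" means, here is a pure-Python sketch of upsert semantics: rows whose key already exists are updated, new keys are inserted. The function name and the `customer_id` key are hypothetical; in practice this step would run as a merge operation on the Databricks side rather than in application code.

```python
def merge_customers(target, updates, key="customer_id"):
    """Upsert `updates` into `target`; both are lists of dicts.

    Returns a new list and leaves the inputs unchanged.
    """
    merged = {row[key]: dict(row) for row in target}
    for row in updates:
        if row[key] in merged:
            merged[row[key]].update(row)   # matched: update the existing row
        else:
            merged[row[key]] = dict(row)   # not matched: insert a new row
    return list(merged.values())
```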
20. Starburst & Delta Lake – Use Case
● Single point of access to numerous
data sources
● Query Delta Lake and federate with
legacy databases as well as many
NoSQL data stores
● Enforce table, column and row level
policies to ensure maximum data
security
● Mask column data for different
groups and users
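The masking bullet can be pictured with a small sketch. It assumes a hypothetical policy that maps a column name to the set of groups allowed to see the raw value; everyone else gets a masked value. In Starburst this is configured through the platform's security features, not written by hand, so treat the names and the policy structure below as illustrative only.

```python
def mask_value(value: str, visible: int = 4) -> str:
    """Replace all but the last `visible` characters with '*'."""
    if len(value) <= visible:
        return "*" * len(value)
    return "*" * (len(value) - visible) + value[-visible:]


def apply_masking(row, user_groups, policy):
    """Return a copy of `row`, masking columns the user's groups may not see.

    `policy` maps column name -> set of groups allowed to see the raw value.
    Columns that are not in the policy are returned unchanged.
    """
    groups = set(user_groups)
    result = {}
    for column, value in row.items():
        allowed = policy.get(column)
        if allowed is None or groups & allowed:
            result[column] = value
        else:
            result[column] = mask_value(str(value))
    return result
```

For example, with a policy of `{"ssn": {"finance"}}`, a user in the `marketing` group would see only the last four characters of the `ssn` column, while a user in `finance` would see the full value.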
21. Starburst & Delta Lake – Use Case
BI Reporting Tools
SQL Query Tools
DEMO TIME!
• Connect using a variety of BI and SQL
tools including Looker, Tableau, Power
BI and DBeaver
• JDBC, ODBC and many libraries
including Python, R and Java