
Delivering rapid-fire Analytics with Snowflake and Tableau

Until recently, advancements in data warehousing and analytics were largely incremental. Small innovations in database design would herald a new data warehouse every 2-3 years, which would quickly become overwhelmed by rapidly increasing data volumes. Knowledge workers struggled to access those databases with development-intensive BI tools designed for reporting rather than exploration and sharing. Both databases and BI tools were strained in locally hosted environments that were inflexible to growth or change.

Snowflake and Tableau represent a fundamentally different approach. Snowflake’s multi-cluster shared data architecture was designed for the cloud and to handle orders-of-magnitude larger data volumes at high speed. Tableau was made to foster an interactive approach to analytics, freeing knowledge workers to use the speed of Snowflake to their greatest advantage.



  1. Welcome
  2. Delivering rapid-fire analytics with Snowflake and Tableau. #data19. Harald Erb, Sales Engineer, Central Europe
  3. Agenda
      • What is Snowflake?
      • Demo #1: Fresh Data for Tableau via Snowpipe
      • Demo #2: Monitor Account Utilization in Snowflake
      • Take away & Call to Action
  4. What is Snowflake?
  5. THE SNOWFLAKE TIMELINE: SQL Data Warehouse built for the cloud. © 2019 Snowflake Computing Inc. All Rights Reserved
      • Founded 2012 by industry veterans
      • Over 2,200 active customers
      • Raised over $950M in venture funding from leading investors
      • First customers 2014, general availability 2015
      • Gartner and Forrester “Leader”
      • Queries processed in Snowflake per day: > 60,000,000
      • Rows in largest single table: 68,000,000,000,000
      • Largest number of tables in a single DB: 200,000
      • Most data for a single customer: > 40 PB
      • Most users for a single customer: > 10,000
      • Named on the slide: Benoit Dageville, Thierry Cruanes
  6. KNOWN CHALLENGES…
      • Complexity: Manage both infrastructure and data
      • Limited Scalability: Can’t support all data, users and workloads
      • Diversity: Unable to consolidate siloed datasets
      • Inadequate Elasticity: Stuck with rigid, inflexible architectures
      • Rigid Cost: Forced to keep the lights on 24/7
  7. NEW ARCHITECTURE FOR DATA WAREHOUSING: Multi-Cluster, Shared Data, in the Cloud
      • Traditional architectures, Shared-Disk (SMP): Cluster of nodes with a single shared disk. Limited by disk size and I/O throughput (traditional DWs based on RDBMS)
      • Traditional architectures, Shared-Nothing (MPP): Cluster of nodes, each of which has its own disk, with data distributed across the nodes. Not elastic because data must be redistributed when the cluster is resized (most MPP DWs, Hadoop)
      • Snowflake, Multi-Cluster, Shared Data: Multiple clusters, shared data. Compute power and storage scale independently of each other
  8. MULTI CLUSTER, SHARED DATA ARCHITECTURE: Instant, automatic scalability & elasticity
      • Cloud Storage Layer: Centralized storage
      • Compute Layer: Multiple warehouses without resource contention
      • Compute Layer: Resize a warehouse instantly (scale up/down)
      • Compute Layer: A warehouse scales out automatically and elastically
  9. REAL-WORLD USE CASE (diagram labels, in slide order)
      • Continuous Loading (4 TB/day) from S3, < 5 min SLA
      • Virtual Warehouse Medium
      • ETL & Maintenance
      • Virtual Warehouse Large
      • Virtual Warehouse 2X-Large
      • Reporting (Segmented)
      • Interactive Dashboard: 50% < 1 s, 85% < 2 s, 95% < 5 s
      • Virtual Warehouse Auto Scale – X-Large x 5
      • Prod DB: 3+ PB of raw data; 1.5 PB data stored in database (8x compression ratio); 25M micro partitions
  10. THE SNOWFLAKE DIFFERENCE
      • Simplicity: Fully managed with a pay-as-you-go model. Works on any data
      • Concurrency: Multiple groups access data simultaneously with no performance degradation
      • 200x Performance: Multi petabyte-scale, up to 200x faster performance and 1/10th the cost
  11. NEW FEATURES RELEASED (sources: snowflake.com/about/press-and-news, data.solita.fi/a-curated-list-of-new-snowflake-features-released-at-snowflake-summit-2019)
      • Snowflake Data Pipelines: Auto-Ingest; Streams and Tasks; Snowflake Connector for Kafka
      • Core Data Warehouse: New web-based SQL editor through the acquisition of tech company Numeracy; Materialized Views; JavaScript Stored Procedures, hierarchical SQL; External Tables, Hive Metastore integration, credential-less external stages
      • Multi-cloud strategy: Snowflake on Google Cloud is set to launch in preview in Fall 2019. Snowflake announced Database Replication and Database Failover: if a disaster occurs in one region or on one cloud service, businesses can immediately access and control Snowflake data they have replicated in a different region or cloud service
      • Secure Data Sharing: Snowflake Data Exchange
  12. Demo #1: Fresh Data for Tableau via Snowpipe
  13. DEMO SCENARIO
  14. DATA SCHEMA: Snowflake Web UI – SQL Editor; 3rd Party SQL Editor (DBeaver); 3 years of historical data, ~184,000,000 rows
  15. DATA ARCHITECTURE
      • Diagram labels: Data Sources; Extract, Load & Transform Tools (ELT); Extract, Transform & Load Tools (ETL); Database Migration Services; Snowflake DW; Data Flow Tools; Tables, CSV, JSON, XML, Avro, Parquet; Virtual Warehouses; Corporate Applications; Databases; Cloud Services; Web; Devices; Azure Blob; Amazon S3; Snowpipe; Data Lake feeds; Data Lake; Live Query; creativecommons.tankerkoenig.de
      • Data Feed Options with Snowflake:
      • Snowpipe processed 'Messages' or files; structured or semi-structured
      • Snowpipe designed for continuous ingest, typically < 1 min latency
      • Potential downstream ELT, e.g. hourly
      • Time-travel can provide static vs dynamic views
      • (Future: downstream pipe processing, direct streaming connectivity)
  16. LAST DATA LOAD = JUNE 7
  17. LOADING ADDITIONAL FILES INTO AWS S3
  18. USING AWS NOTIFICATIONS FOR SNOWPIPE
  19. CREATING A SNOWPIPE TO LOAD DATA FROM S3
  20. NEW DATA AUTOMATICALLY LOADED FROM S3
  21. FRESH DATA READY FOR TABLEAU!
  22. Demo #2: Monitor Account Utilization in Snowflake
  23. PAY FOR WHAT YOU USE… DOWN TO THE SEC. Workload charts (morning, noon, night) for four workloads: ETL and Processing; Reporting; Ad-hoc Analytics; Data Scientist. Snowflake Web UI – Account Billing & Usage
  24. CONNECT TO SNOWFLAKE DIRECTLY AND ANALYZE/FORECAST ACCOUNT UTILIZATION. Scott Smith’s blog, incl. download of sample workbook: tableau.com/about/blog/2019/5/monitor-understand-snowflake-account-usage
  25. Take away & Call to Action
  26. LEARN MORE: BEST PRACTICES. E-Paper download: resources.snowflake.com/ebooks/best-practices-for-using-tableau-with-snowflake. E-Book content:
      • Creating efficient Tableau workbooks
      • Connecting to Snowflake
      • Working with semi-structured data
      • Working with Snowflake Time Travel
      • Working with Snowflake Data Sharing
      • Implementing role-based security
      • Using custom aggregations
      • Scaling Snowflake warehouses
      • Caching
      • Other performance considerations
      • Measuring performance
  27. TRY: TABLEAU & SNOWFLAKE QUICK START. This Quick Start deploys Tableau Server in the Amazon Web Services (AWS) Cloud and configures it to work with Snowflake in about 30 minutes. More information: aws.amazon.com/quickstart/architecture/tableau-snowflake/
  28. SESSION TAKE AWAY: What you don’t have to worry about when working with the Snowflake Cloud Data Warehouse
      • Installing, provisioning and maintaining hardware and software: Snowflake is a cloud-built DW as a service. Just create an account and load some data. You can then connect from Tableau and start querying.
      • Working out the capacity of your DW: Snowflake is a fully elastic platform, so it can scale to handle all of your data and all of your users. Just size your compute (virtual warehouses) up and down on the fly to handle peaks and lulls in your data usage. Turn your warehouses completely off to save money when they are not in use.
      • Learning new tools and a new query language: Snowflake is a fully ANSI SQL-compliant DW, so existing skills and tools, such as Tableau, will connect easily. Snowflake provides connectors for ODBC, JDBC, Python, Spark and Node.js. Even semi-structured data can be accessed via SQL.
      • Optimizing and maintaining your data: Snowflake is a highly scalable, columnar data platform that lets users run analytic queries quickly and easily. You do not need to index data or distribute it across partitions; this is all managed transparently by the platform. Snowflake also provides inherent data protection capabilities, so there is no need to worry about snapshots, backups or other administrative tasks.
  29. Please complete the session survey from the My Evaluations menu in your Tableau Conference Europe 2019 app
  30. Thank you! harald.erb@snowflake.com
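
Demo #1 (slides 17 to 21) sets up a Snowpipe that loads new S3 files automatically. The slides do not include the actual statements, so the following is only a minimal sketch of what such a pipe definition might look like. The Python function merely renders the SQL text and does not connect to Snowflake. The object names (`fuel_prices_pipe`, `fuel_prices`, `fuel_prices_stage`) and the CSV file format are hypothetical placeholders, and the external stage pointing at the S3 bucket is assumed to exist already.

```python
def snowpipe_ddl(pipe: str, table: str, stage: str) -> str:
    """Render a CREATE PIPE statement with auto-ingest enabled.

    All identifiers passed in are hypothetical; the slides do not show
    the names used in the demo. The CSV file format is also an assumption.
    """
    return (
        f"CREATE PIPE {pipe} AUTO_INGEST = TRUE AS\n"
        f"  COPY INTO {table}\n"
        f"  FROM @{stage}\n"
        f"  FILE_FORMAT = (TYPE = 'CSV' SKIP_HEADER = 1);"
    )


if __name__ == "__main__":
    # Print the statement so it can be pasted into a SQL editor.
    print(snowpipe_ddl("fuel_prices_pipe", "fuel_prices", "fuel_prices_stage"))
```

With `AUTO_INGEST = TRUE`, the `notification_channel` column of `SHOW PIPES` lists the queue that the S3 bucket's event notification should target, which corresponds to the AWS notification step on slide 18.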

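Slide 23 stresses that compute is paid for down to the second. A rough sketch of that arithmetic follows, assuming Snowflake's published credit rates per warehouse size (X-Small is 1 credit per hour, doubling with each size) and the 60-second minimum charged each time a warehouse starts or resumes. The function name and the example workloads are illustrative only, the multi-cluster case is simplified to all clusters running for the whole period, and the price per credit is left out because it depends on edition and region.

```python
# Credits per hour for one cluster, assuming Snowflake's published rates:
# X-Small is 1 credit/hour and each size up doubles it.
CREDITS_PER_HOUR = {
    "X-Small": 1,
    "Small": 2,
    "Medium": 4,
    "Large": 8,
    "X-Large": 16,
    "2X-Large": 32,
}

# Each start or resume of a warehouse is billed for at least 60 seconds.
MIN_BILLED_SECONDS = 60


def credits_used(size: str, running_seconds: int, clusters: int = 1) -> float:
    """Estimate credits for one continuous run of a warehouse.

    Simplification: every cluster is assumed to run for the full period.
    """
    if running_seconds <= 0:
        return 0.0  # a warehouse that never ran costs nothing
    billed_seconds = max(running_seconds, MIN_BILLED_SECONDS)
    return CREDITS_PER_HOUR[size] * clusters * billed_seconds / 3600


if __name__ == "__main__":
    # A Medium warehouse running an ETL job for 15 minutes.
    print(credits_used("Medium", 15 * 60))
    # Five X-Large clusters serving dashboards for 30 minutes.
    print(credits_used("X-Large", 30 * 60, clusters=5))
```

Under these assumptions the 15-minute Medium run costs 1.0 credit and the 30-minute, five-cluster X-Large run costs 40.0 credits, which is why suspending idle warehouses matters for the bill.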