
Running cost effective big data workloads with Azure Synapse and Azure Data Lake Storage (Build 2020-INT130)



The presentation discusses how to migrate expensive open-source big data workloads to Azure and leverage the latest compute and storage innovations in Azure Synapse with Azure Data Lake Storage to build powerful, cost-effective analytics solutions. It shows how you can bring your .NET expertise to bear with .NET for Apache Spark, and how the shared metadata experience in Synapse makes it easy to create a table in Spark and query it from T-SQL.




  1. Running cost effective big data workloads with Azure Synapse and Azure Data Lake Storage
     James Baker, Michael Rys, Rukmani Gopalan
  2. Agenda
     1. Modernize your big data workloads
     2. .NET for Apache Spark
     3. Demo
  3. Traditional on-prem analytics pipeline
     Operational databases and business/custom apps feed an enterprise data warehouse through ETL jobs; the warehouse in turn feeds data marts, again via ETL, for reporting, analytics, and data mining.
  4. Modern data warehouse
     Sources: logs (structured), media (unstructured), files (unstructured), business/custom apps (structured).
     Pipeline: ingest (Azure Data Factory), store (Azure Data Lake Storage), prep & train (Azure Databricks), model & serve (Azure SQL Data Warehouse), visualize (Power BI).
  5. Modern data warehouse with Azure Synapse
     The same sources flow into Azure Synapse Analytics, with Azure Data Lake Storage as the store and Power BI on top.
  6. Modern data warehouse with Azure Synapse (continued)
     Analytics runtimes (SQL), a common data estate with shared metadata, and a unified experience through Synapse Studio, all on Azure Data Lake Storage with Power BI on top.
  7. Cost optimization with Azure Data Lake Storage
     • Disaggregated compute and storage with a shared metadata layer
     • Lifecycle management for optimizing TCO
     • Lower compute resources because of high performance
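The lifecycle-management point on slide 7 maps to Azure Storage's lifecycle management feature: a JSON policy attached to the storage account tiers or deletes data as it ages, lowering storage TCO without touching compute. A minimal sketch (the rule name, `raw/` prefix, and day thresholds are illustrative, not from the presentation):

```json
{
  "rules": [
    {
      "name": "tier-and-expire-raw-data",
      "enabled": true,
      "type": "Lifecycle",
      "definition": {
        "filters": {
          "blobTypes": ["blockBlob"],
          "prefixMatch": ["raw/"]
        },
        "actions": {
          "baseBlob": {
            "tierToCool": { "daysAfterModificationGreaterThan": 30 },
            "tierToArchive": { "daysAfterModificationGreaterThan": 90 },
            "delete": { "daysAfterModificationGreaterThan": 365 }
          }
        }
      }
    }
  ]
}
```

Because compute and storage are disaggregated, this policy runs entirely on the storage side; no Spark or SQL pool needs to be provisioned for it.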
  8. .NET for Apache Spark and Azure Synapse
     • First-class C# and F# bindings to Apache Spark, bringing the power of big data analytics to .NET developers
     • Apache Spark 2.4/3.0: DataFrames, Structured Streaming, Delta Lake
     • Performance optimized with Apache Arrow and HW vectorization
     • .NET Standard 2.0, C# and F#, ML.NET
     • First-class integration in Azure Synapse: batch submission and interactive .NET notebooks
     • Learn more at http://dot.net/Spark
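To give a flavor of the C# bindings, here is a minimal .NET for Apache Spark sketch that reads CSV data from Azure Data Lake Storage and runs a simple aggregation; the abfss:// path and column names are hypothetical, not from the demo:

```csharp
using Microsoft.Spark.Sql;
using static Microsoft.Spark.Sql.Functions;

class TweetCounts
{
    static void Main()
    {
        // Entry point into Spark from .NET.
        SparkSession spark = SparkSession
            .Builder()
            .AppName("TweetCounts")
            .GetOrCreate();

        // Read raw CSV files from ADLS Gen2 (hypothetical account/container).
        DataFrame tweets = spark.Read()
            .Option("header", "true")
            .Csv("abfss://demo@myaccount.dfs.core.windows.net/tweets/*.csv");

        // Count how often each user is mentioned, most-mentioned first.
        DataFrame mentions = tweets
            .GroupBy("MentionedUser")
            .Agg(Count(Col("*")).Alias("MentionCount"))
            .OrderBy(Desc("MentionCount"));

        mentions.Show();
        spark.Stop();
    }
}
```

The DataFrame API mirrors the Scala/Python surface, so existing Spark knowledge carries over; the .NET-specific part is mostly the C# naming conventions.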
  9. Demo: .NET for Spark and shared metadata experience in Azure Synapse
     Michael Rys, @MikeDoesBigData
     Pipeline: Twitter CSV files → data prep with .NET for Spark → analysis with an interactive .NET for Spark notebook → seamless analysis with SQL.
     Questions: What has Michael been up to (mentions, topics)? Who was interacting with Michael (@MikeDoesBigData)?
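The shared metadata experience in the demo can be sketched in two steps, assuming a prepared DataFrame named `mentions` (a hypothetical name): persist it as a Parquet-backed Spark table, then query it from the serverless SQL pool, where Spark databases and their Parquet-backed tables surface automatically under the `dbo` schema.

```csharp
// In the Synapse Spark pool (.NET for Apache Spark):
// persist the prepared DataFrame as a Parquet-backed Spark table.
mentions.Write()
    .Mode(SaveMode.Overwrite)
    .Format("parquet")
    .SaveAsTable("mentions");
```

```sql
-- From the serverless SQL pool: the Spark table appears in a SQL database
-- with the same name as the Spark database, no restating of the schema needed.
SELECT TOP 10 *
FROM dbo.mentions
ORDER BY MentionCount DESC;
```

This is the point of the shared metadata layer: the table definition is created once in Spark and is immediately queryable from T-SQL.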
  10. Guidance from experts
      Microsoft Docs: explore overviews, tutorials, code samples, and more.
      • Azure Data Lake Storage: https://docs.microsoft.com/azure/storage/blobs/data-lake-storage-introduction
      • Azure Synapse Analytics: https://docs.microsoft.com/azure/synapse-analytics
      • .NET for Apache Spark: https://dot.net/Spark

Notas del editor

  • Establish the baseline
    The need to provision for max utilization/max consumption
    Architectural brittleness in moving data across physical stores
  • Pay for consumption model
    Compute elasticity
    Data evolves ‘in place’ within ubiquitous storage service
  • Encapsulates the MDW pattern within the Synapse service
    Retain benefits of pay for consumption & ubiquitous store
  • Unified experience leveraging a heterogeneous set of tools/frameworks
    Shared metadata service means that table definitions do not need to be restated as the pipeline flows
