Introduction to PolyBase

First introduced with the Analytics Platform System (APS), PolyBase simplifies management and querying of both relational and non-relational data using T-SQL. It is now available in both Azure SQL Data Warehouse and SQL Server 2016. The major features of PolyBase include the ability to run ad-hoc queries on Hadoop data and the ability to import data from Hadoop and Azure Blob Storage into SQL Server for persistent storage. A major part of the presentation will be a demo on querying and creating data on HDFS (using Azure Blobs). Come see why PolyBase is the “glue” for creating federated data warehouse solutions where you can query data where it sits instead of having to move it all to one data platform.

Introduction to PolyBase

  1. 1. Introduction to PolyBase. James Serra, Big Data Evangelist, Microsoft. JamesSerra3@gmail.com
  2. 2. About Me
     • Microsoft, Big Data Evangelist
     • In IT for 30 years, worked on many BI and DW projects
     • Worked as desktop/web/database developer, DBA, BI and DW architect and developer, MDM architect, PDW/APS developer
     • Been perm employee, contractor, consultant, business owner
     • Presenter at PASS Business Analytics Conference, PASS Summit, Enterprise Data World conference
     • Certifications: MCSE: Data Platform, Business Intelligence; MS: Architecting Microsoft Azure Solutions, Design and Implement Big Data Analytics Solutions, Design and Implement Cloud Data Platform Solutions
     • Blog at JamesSerra.com
     • Former SQL Server MVP
     • Author of book “Reporting with Microsoft SQL Server 2012”
  3. 3. Provides a scalable, T-SQL compatible query processing framework for combining data from both universes
  4. 4. Timeline, 2012 to 2016: PolyBase in SQL Server 2016 (CTP3); PolyBase in SQL DW; PolyBase in SQL Server 2016
  5. 5. PolyBase Query relational and non-relational data with T-SQL
  6. 6. Disaster recovery: Several customers use the pattern APS > Blob Storage > SQL DW (all via PolyBase) for DR, using the cloud service
  7. 7. SELECT TOP 10 * FROM SQLServer S JOIN Hadoop H ON S.[Key] = H.[Key]
  8. 8. SELECT TOP 10 * FROM SQLServer S JOIN Hadoop H ON S.[Key] = H.[Key]
  9. 9. SELECT TOP 10 * FROM SQLServer S JOIN Hadoop H ON S.[Key] = H.[Key]
  10. 10. SELECT TOP 10 * FROM SQLServer S JOIN Blob B ON S.[Key] = B.[Key]
  11. 11. SELECT TOP 10 * FROM SQLServer S JOIN Blob B ON S.[Key] = B.[Key]
  12. 12. SELECT TOP 10 * FROM SQLServer S JOIN Hadoop H ON S.[Key] = H.[Key] JOIN Blob B ON S.[Key] = B.[Key]
  13. 13. https://msdn.microsoft.com/en-us/library/mt143174.aspx
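  14. 14. PolyBase works with (support per source, with push-down in parentheses):
      SQL 2016 (Now): Azure Blob Store: Yes (push-down: N/A); HDInsight: Yes (push-down: No); Cloudera: Yes (push-down: Yes); HortonWorks: Yes (push-down: Yes); Azure Data Lake Store: No (push-down: N/A)
      SQL 2016 (Near future): Azure Blob Store: Yes (push-down: N/A); HDInsight: Yes (push-down: No); Cloudera: Yes (push-down: Yes); HortonWorks: Yes (push-down: Yes); Azure Data Lake Store: No (push-down: N/A)
      Azure SQL DW (Now): Azure Blob Store: Yes (push-down: N/A); HDInsight: Yes (push-down: No); Cloudera: No (push-down: No); HortonWorks: No (push-down: No); Azure Data Lake Store: Yes! (push-down: N/A)
      Azure SQL DW (Near future): Azure Blob Store: Yes (push-down: N/A); HDInsight: Yes (push-down: No); Cloudera: No (push-down: No); HortonWorks: No (push-down: No); Azure Data Lake Store: Yes (push-down: N/A)
      APS (Now): Azure Blob Store: Yes (push-down: N/A); HDInsight: Yes (push-down: Yes (int), No (ext)); Cloudera: Yes (push-down: Yes); HortonWorks: Yes (push-down: Yes); Azure Data Lake Store: No (push-down: N/A)
      APS (Near future): Azure Blob Store: Yes (push-down: N/A); HDInsight: Yes (push-down: Yes/No); Cloudera: Yes (push-down: Yes); HortonWorks: Yes (push-down: Yes); Azure Data Lake Store: No (push-down: N/A)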
  15. 15. https://msdn.microsoft.com/en-us/library/mt607030.aspx Allows you to create a cluster of SQL Server instances to process large data sets from external data sources in a scale-out fashion for better query performance
  16. 16. -- Once per Hadoop user
      CREATE DATABASE SCOPED CREDENTIAL HadoopCredential
      WITH IDENTITY = 'hadoopUserName', SECRET = 'hadoopPassword';

      -- Once per Hadoop cluster per user
      CREATE EXTERNAL DATA SOURCE HadoopCluster
      WITH (TYPE = HADOOP,
            LOCATION = 'hdfs://10.193.26.177:8020',
            RESOURCE_MANAGER_LOCATION = '10.193.26.178:8050',
            CREDENTIAL = HadoopCredential);

      -- Once per file format
      CREATE EXTERNAL FILE FORMAT TextFile
      WITH (FORMAT_TYPE = DELIMITEDTEXT,
            DATA_COMPRESSION = 'org.apache.hadoop.io.compress.GzipCodec',
            FORMAT_OPTIONS (FIELD_TERMINATOR = '|', USE_TYPE_DEFAULT = TRUE));

      -- LOCATION below is the HDFS file path
      CREATE EXTERNAL TABLE [dbo].[Customer] (
          [SensorKey] int NOT NULL,
          /* column name missing in the source */ int NOT NULL,
          [Speed] float NOT NULL
      )
      WITH (LOCATION = '//Sensor_Data//May2014/',
            DATA_SOURCE = HadoopCluster,
            FILE_FORMAT = TextFile);
  17. 17. Resources
      • PolyBase guide: https://msdn.microsoft.com/en-us/library/mt143171.aspx
      • Azure SQL Data Warehouse loading patterns and strategies: http://bit.ly/1XskZL2
  18. 18. Q & A ? James Serra, Big Data Evangelist Email me at: JamesSerra3@gmail.com Follow me at: @JamesSerra Link to me at: www.linkedin.com/in/JamesSerra Visit my blog at: JamesSerra.com (where this slide deck is posted via the “Presentations” link on the top menu)
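
The objects defined on slide 16 can then be used like any other table. The following is a minimal sketch, assuming SQL Server 2016 and that the external table dbo.Customer from slide 16 exists with its SensorKey and Speed columns. The target table name dbo.FastSensorReadings and the speed threshold are hypothetical, chosen only for illustration.

```sql
-- Ad-hoc query: read the Hadoop files through the external table.
SELECT TOP 10 SensorKey, Speed
FROM dbo.Customer
WHERE Speed > 65;

-- Import: copy the matching external rows into a regular SQL Server
-- table for persistent storage.
-- dbo.FastSensorReadings is a hypothetical name, not from the slides.
SELECT SensorKey, Speed
INTO dbo.FastSensorReadings
FROM dbo.Customer
WHERE Speed > 65
-- Optional hint: ask PolyBase to push the filter down to Hadoop.
-- It relies on RESOURCE_MANAGER_LOCATION being set on the data source.
OPTION (FORCE EXTERNALPUSHDOWN);
```

Without the hint, the optimizer decides on its own whether pushing computation to the Hadoop cluster is worthwhile.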

Editor's notes

  • Fluff, but point is I bring real work experience to the session
  • We are planning to release a preview of this functionality early next year as part of the SQL Server vNext CTPs; exact release dates are still in flux.
    By the preview early next year, PolyBase will support Teradata, Oracle, SQL Server, MongoDB, Hadoop and Azure Blob Storage (not MySQL!). We will continue to add more sources until GA.

    http://demo.sqlmag.com/scaling-success-sql-server-2016/integrating-big-data-and-sql-server-2016

    Among our key BI investments, we are making it much easier to manage relational and non-relational data with PolyBase, which lets you query Hadoop data and SQL Server relational data through a single T-SQL query. One of the challenges we see with Hadoop is that not enough people have Hadoop and MapReduce skills, and this technology reduces the skills needed to manage Hadoop data. It also works across your on-premises environment or SQL Server running in Azure.
  • https://msdn.microsoft.com/en-us/library/mt163689.aspx

    SQL Server 2016 does not support HDInsight yet (but APS does). SQL DW only supports Azure Blob Storage.

    At the time we executed the PoC, PolyBase didn’t support ADL Storage directly, so we needed to go through Blob Storage anyway. This forced us to deal with data type conversions and file encoding again. The product group (PG) is indeed working on ADL Store and PolyBase integration.

    Q: What is the timeline for PolyBase to work on Azure Data Lake Storage? A: Prior to the end of CY’16.
    I know that work is currently underway on this.
    I do not have an exact date for ADLS support, but the work is underway and the aim is to have it done by the end of CY16.

    Currently there are no plans for Spark integration with PolyBase. Yes, we are working on a strategy for integration with Metanautix to add support for additional data sources with PolyBase, but we don’t have any details to share at this time.
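
Since the notes above mention that SQL DW only supports Azure Blob Storage, it may help to see how the slide 16 pattern looks for that source. This is a sketch under stated assumptions rather than the presenter's demo script: the credential and data source names are hypothetical, and the password, container, account and key values are placeholders to be replaced.

```sql
-- A database master key is needed before a scoped credential can be created.
CREATE MASTER KEY ENCRYPTION BY PASSWORD = '<strong password>';

-- AzureStorageCredential is a hypothetical name. For Blob Storage the
-- SECRET is the storage account key and IDENTITY is not used for
-- authentication.
CREATE DATABASE SCOPED CREDENTIAL AzureStorageCredential
WITH IDENTITY = 'user', SECRET = '<storage account key>';

-- AzureStorage is a hypothetical name; <container> and <account> are placeholders.
CREATE EXTERNAL DATA SOURCE AzureStorage
WITH (
    TYPE = HADOOP,
    LOCATION = 'wasbs://<container>@<account>.blob.core.windows.net',
    CREDENTIAL = AzureStorageCredential
);
```

The file format and external table statements from slide 16 apply unchanged, with DATA_SOURCE pointing at the Blob Storage data source and LOCATION giving a path inside the container.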
