Azure Data Lake: integration within business intelligence solutions
Azure SQL DW evolves into Synapse Analytics
1. Azure SQL Data Warehouse
Introduction
Juan Manuel Alvarado Ortiz
MVP Data Platform
Twitter: juanbizzz
2. Modern data warehouse
Stages: EXTRACT, PREPARE, TRANSFORM, SERVE, STORE, VISUALIZE
Sources: on-premises data, cloud data, SaaS data
3. Azure SQL Data Warehouse
• Best price/performance: up to 94% less expensive than competitors
• Developer productivity: use your preferred development tool with SQL Data Warehouse
• Intelligent workload management: priority handling for workloads
• Data flexibility: load different types of data
• Industry-leading security: advanced security and 99.9% service availability
4. Azure SQL Data Warehouse: performance advantage
• Gen2 adaptive caching: uses non-volatile memory solid-state drives (NVMe) to increase the I/O bandwidth available to queries.
• Azure FPGA-accelerated networking enhancements: move data at rates of up to 1 GB/sec per node to improve queries.
• Instant data movement: leverages multi-core parallelism in the underlying SQL Servers to move data efficiently between compute nodes.
• Query optimization: ongoing investments in distributed query optimization.
5. Comprehensive security
Category         Feature                                                 SQL Data Warehouse
Data Protection  Data in transit                                         Yes
Data Protection  Data encryption at rest (service and user managed keys) Yes
Data Protection  Data Discovery and Classification                       Yes
                 Native Row Level Security                               Yes
                 Table and View Security (GRANT / DENY)                  Yes
                 Column Level Security                                   Yes
                 SQL Authentication                                      Yes
                 Native Azure Active Directory                           Yes
                 Integrated Security                                     Yes
                 Multi-Factor Authentication                             Yes
                 Virtual Network (VNET)                                  Yes
                 SQL Firewall (server)                                   Yes
                 Integration with ExpressRoute                           Yes
                 SQL Threat Detection                                    Yes
                 SQL Auditing                                            Yes
                 Vulnerability Assessment                                Yes
6. Workload management: importance of roles
[Figure: scheduler without importance, showing running and queued requests, including CEO requests]
By default, workloads run on a first-in, first-out basis.
7. Workload management: workload importance
[Figure: scheduler with importance turned on, showing running and queued requests, including CEO requests, labeled Low, Normal, and High]
CREATE WORKLOAD CLASSIFIER classifier_name
WITH
(
    WORKLOAD_GROUP = 'name',
    MEMBERNAME = 'security_account'
    [ [ , ] IMPORTANCE = { LOW | BELOW_NORMAL | NORMAL (default) | ABOVE_NORMAL | HIGH } ]
)
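The difference between the two schedulers on slides 6 and 7 can be sketched with a small Python simulation. This is only an illustration of first-in, first-out ordering versus importance-aware ordering, not a description of how the service schedules requests internally. The member names, the `CLASSIFIERS` mapping, and the helper functions are hypothetical and exist only for this sketch.

```python
import heapq

# Importance levels from the CREATE WORKLOAD CLASSIFIER syntax; a higher rank runs first.
RANK = {"LOW": 0, "BELOW_NORMAL": 1, "NORMAL": 2, "ABOVE_NORMAL": 3, "HIGH": 4}

# Hypothetical classifiers: member name -> importance (names are made up for this sketch).
CLASSIFIERS = {"ceo_login": "HIGH", "batch_login": "LOW"}


def classify(member):
    """Return the importance for a member; unclassified members default to NORMAL."""
    return CLASSIFIERS.get(member, "NORMAL")


def fifo_order(queue):
    """Scheduler without importance: requests leave the queue in arrival order."""
    return [request_id for request_id, _member in queue]


def importance_order(queue):
    """Scheduler with importance: highest importance first, arrival order breaks ties."""
    heap = [(-RANK[classify(member)], arrival, request_id)
            for arrival, (request_id, member) in enumerate(queue)]
    heapq.heapify(heap)
    return [heapq.heappop(heap)[2] for _ in range(len(heap))]


# Queued requests as (request id, submitting member), in arrival order.
queued = [(9, "analyst_login"), (10, "batch_login"), (11, "analyst_login"), (12, "ceo_login")]

print(fifo_order(queued))        # [9, 10, 11, 12]
print(importance_order(queued))  # [12, 9, 11, 10]
```

In the sketch, the CEO's request arrives last but runs first once importance is turned on, while requests of equal importance keep their arrival order.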
8. Heterogeneous data: JSON
Read JSON data stored in a string:
• ISJSON: checks whether the string is valid JSON
• JSON_VALUE: extracts a scalar value
• JSON_QUERY: extracts an array or an object from a JSON string
-- Return all rows with valid JSON data
SELECT CustomerId, OrderDetails
FROM CustomerOrders
WHERE ISJSON(OrderDetails) > 0;
CustomerId  OrderDetails
101         N'[{ "StoreId": "AW73565", "Order": { "Number":"SO43659", "Date":"2011-05-31T00:00:00" }, "Item": { "Price":2024.40, "Quantity":1 }}]'
-- Extract values from the first element of the JSON array
SELECT CustomerId,
       Country,
       JSON_VALUE(OrderDetails, '$[0].StoreId') AS StoreId,
       JSON_QUERY(OrderDetails, '$[0].Item') AS ItemDetails
FROM CustomerOrders;
CustomerId  Country  StoreId  ItemDetails
101         Bahrain  AW73565  { "Price":2024.40, "Quantity":1 }
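For readers more at home in a general-purpose language, the three functions can be approximated in Python with the standard `json` module. This is a rough analogue under stated assumptions, not the T-SQL implementation: the helper names `is_json`, `json_value`, and `json_query` are hypothetical, the sample document is a single object rather than an array, only simple dotted paths such as `$.Item.Price` are handled, and scalars come back as Python values rather than strings.

```python
import json


def is_json(text):
    """Rough analogue of ISJSON: 1 if the text parses as a JSON object or array, else 0."""
    try:
        return 1 if isinstance(json.loads(text), (dict, list)) else 0
    except (TypeError, ValueError):
        return 0


def _walk(text, path):
    """Follow a simple '$.a.b' path; return None when a key is missing."""
    node = json.loads(text)
    keys = [key for key in path.lstrip("$").split(".") if key]
    for key in keys:
        if not isinstance(node, dict) or key not in node:
            return None
        node = node[key]
    return node


def json_value(text, path):
    """Rough analogue of JSON_VALUE: the scalar at the path, None for objects, arrays, or misses."""
    node = _walk(text, path)
    return None if isinstance(node, (dict, list)) else node


def json_query(text, path):
    """Rough analogue of JSON_QUERY: the object or array at the path as JSON text, else None."""
    node = _walk(text, path)
    return json.dumps(node) if isinstance(node, (dict, list)) else None


# Sample order document (a single object, for simplicity).
order_details = (
    '{"StoreId": "AW73565", '
    '"Order": {"Number": "SO43659", "Date": "2011-05-31T00:00:00"}, '
    '"Item": {"Price": 2024.40, "Quantity": 1}}'
)

print(is_json(order_details))
print(json_value(order_details, "$.StoreId"))
print(json_query(order_details, "$.Item"))
```

The split between `json_value` and `json_query` mirrors the division on the slide: one returns scalars only, the other returns objects and arrays only.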
11. Azure Synapse Analytics
Azure Synapse is the evolution of Azure SQL Data Warehouse: it brings together big data, data warehousing, and data integration in a single end-to-end service for scalable analytics solutions in the cloud.
12. Big data or relational data
• Big data (data lake): easy to use, fast to explore, quick to get started
• Relational data (data warehouse): strong security, key to privacy, dependable performance
As a result, businesses are forced to maintain two kinds of critical systems, independently, for their analytics solutions.
13. Welcome to Azure Synapse Analytics
Azure brings together the best of both worlds in a single service
• Easy to use, fast to explore, quick to get started
• Strong security, key to privacy, dependable performance
14. Usage scenarios
Big data and advanced analytics
SQL
• Modern data warehousing: "We need to integrate all our data, including big data, with our data warehouse."
• Advanced analytics: "We want to try to predict our customers' behavior and churn."
• Real-time analytics: "We are trying to run real-time analysis of the sensors on our factory equipment."
15. Analytics with Azure Synapse
Sources: on-premises data, cloud data, SaaS data
Storage: Azure Data Lake Storage Gen2 (Common Data Model, enterprise security, optimized for analytics)
Platform: metastore, security, management, monitoring, data integration
Analytics runtimes (form factors): provisioned, on-demand
Languages: SQL, Python, .NET, Java, Scala, R
Experience: Azure Synapse Studio
Power BI, Azure Machine Learning
• Extract data from multiple data sources
• Prepare the data
• Analyze the previously prepared data
• Serve the analyses to multiple users
• Build dashboards and reports for BI
17. Azure Synapse Studio
A single environment where you develop and manage all the components of the analytics solution, including data lakes, machine learning, Power BI, Spark, SQL data warehouse, and data flows.
This brings me to Azure SQL Data Warehouse. Admittedly, in Gen 1 we had work to do, but now with Gen 2 we have a rock-solid product.
Let's discuss the five value propositions shown here. These are the features and characteristics that help us win workloads.
- Superior performance: Azure SQL Data Warehouse helps users achieve their goals quickly and efficiently at the lowest price point.
- Security: to keep our customers' data safe and secure, the solution includes multiple layers of security.
- Workload prioritization: you get granular control over how your workloads run and how resources are allocated to different users, based on the importance of their queries.
- Flexibility: the system adapts quickly to the needs of the user, such as the number of concurrent users or the prioritization of workloads.
- Developer productivity: track, apply, and deploy changes with Azure DevOps in Visual Studio, which provides a familiar user experience.
Azure SQL Data Warehouse was the first cloud DW to offer independent scaling for compute and storage. This matters because it gives you the flexibility to quickly meet the changing needs of your analytics workloads and to control your costs much more effectively. We are the only cloud provider to offer a truly elastic DW today, with the ability to scale up or down and to pause or resume in order to optimize your costs and performance.
And with Azure SQL Data Warehouse, it is super easy to get started: with the click of a button, you can spin up an enterprise-class DW in minutes.
And thanks to the integration with Azure Databricks for data preparation and data enrichment, Microsoft is the only company offering you a truly integrated cloud analytics platform that brings together the best of Spark and relational data warehousing. With its support for Spark and PolyBase, Azure SQL DW is your integrated data management solution, letting you run complex queries inside and outside your DW in a friction-free way.
PERFORMANCE
SECURITY
CONSIDER A DIAGRAM TO MAP THIS AGAINST ALSO
SQL has robust ALM (application lifecycle management) for data warehouse deployment, which lowers the risk of breaking changes and of inefficiencies in team development.
Furthermore, Visual Studio provides a comprehensive load-testing framework and can efficiently orchestrate end-to-end integration testing.