Machine Learning in SQL Server

Machine Learning with SQL Server
www.triggerdb.com
http://blog.triggerdb.com/maxi

Maximiliano Accotto
Owner, TriggerDB Consulting SRL | www.triggerdb.com
Microsoft MVP Data Platform since 2005, member of SQLPass Argentina, and speaker for Microsoft at various events since 2005.
Data Platform specialist with more than 15 years of experience in tuning, security, migrations, development, BI, high availability, Big Data and Machine Learning.
https://twitter.com/maxiaccotto
https://www.linkedin.com/in/maxiaccotto/

About TriggerDB Consulting
Founded in Argentina in 2005, we have since helped organizations around the world manage and train on Microsoft data platforms (SQL Server, Power BI, Azure, Big Data, etc.), with the goal of passing on to our clients the knowledge we acquire day to day.
We are a certified Microsoft Partner in Data Platform, Data Analytics and Power BI.
Contact: www.triggerdb.com
info@triggerdb.com
https://www.facebook.com/triggerdb/

What is machine learning?
It is a branch of artificial intelligence whose goal is to develop
techniques that allow computers to learn from data.

Regular database + app vs. intelligent database + app
Regular: Application + Intelligence | Database
Intelligent: Application | Intelligence + Database

Why ML in SQL Server?
Eliminate data movement
Operationalize ML scripts and models
Enterprise performance and scalability
SQL Transformations
Relational data
Analytics library

Before vs. now (diagram)
Before: SQL Server data tables feed external tools: Attribute Selection Tool (C++/R) and Segmentation Tool (C++/R).
Now: SQL Server holds the data tables and a stored procedure that calls sp_execute_external_script with the R code for attribute selection, segmentation and scoring.

SQL Server Machine Learning Services
• SQL Server 2016
  • R support (version 3.2.2)
  • Microsoft R Server
• SQL Server 2017
  • Native scoring in T-SQL using the PREDICT function (also supported on Linux)
  • EXTERNAL LIBRARY DDL for managing R packages
  • Batch execution for input data
  • R support (version 3.3.3)
  • Python support (version 3.5.2, Anaconda distribution)

Machine Learning Server
• Multi-platform support
  • Windows, Linux, Hadoop, SQL Server
• Microsoft R Server
  • RevoScaleR, MicrosoftML, olapR, sqlrutils packages
  • Web services for operationalization
• Microsoft Machine Learning Server
  • R and Python support
  • revoscalepy, microsoftml Python libraries
  • rxExecBy

ScaleR performance comparison
• US flight data for 20 years
• Linear regression on arrival delay
• Run on a 4-core laptop with 16 GB RAM and a 500 GB SSD

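Data Scientists - Data exploration and model development
1. Script: the data scientist sends a script from any R/Python IDE on the workstation to SQL Server.
2. Execution: the R/Python runtime of Machine Learning Services runs the script inside SQL Server.
3. Results: the results are returned to the workstation.
# R sketch: query, connectionString and formula are placeholders
rxSetComputeContext(RxInSqlServer(connectionString = connectionString))
train <- RxSqlServerData(sqlQuery = query, connectionString = connectionString)
model <- rxLogit(formula, data = train)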

Application Developer - Operationalizing models
1. Stored procedure call: the application calls a stored procedure in SQL Server.
2. Execution: the R/Python runtime of Machine Learning Services runs the code inside SQL Server.
3. Results: the results are returned to the application.
EXEC sp_execute_external_script
@language = N'Python'
, @script = N'[Python code]'
The stored procedure contains R or Python code and executes in-database.

DBA task: enable ML in SQL Server
Enable external scripts (SQL Server 2016 or later)
– EXEC sp_configure 'external scripts enabled', 1;
– RECONFIGURE;

sp_execute_external_script
EXEC sp_execute_external_script
@language = N'R',
@script = N'[code]',
@input_data_1 = N'[SQL input]'
[ , @input_data_1_name = N'InputDataSet' ]
[ , @output_data_1_name = N'OutputDataSet' ]
[ , @params = N'parameter' ]
WITH RESULT SETS (([SQL output]));
@input_data_1_name and @output_data_1_name are optional and default to InputDataSet and OutputDataSet respectively.
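
As a rough illustration of the syntax above, the following Python sketch assembles the call as T-SQL text the way a client application might before sending it to the server. The helper names (tsql_nstring, build_external_script_call), the table dbo.flights and its columns are hypothetical; only the procedure name, its parameters and the InputDataSet/OutputDataSet names come from the slide. It assumes the Python script receives InputDataSet as a pandas DataFrame and returns one in OutputDataSet.

```python
def tsql_nstring(text: str) -> str:
    """Return text as a T-SQL Unicode literal, doubling embedded single quotes."""
    return "N'" + text.replace("'", "''") + "'"


def build_external_script_call(language: str, script: str,
                               input_query: str, result_columns: str) -> str:
    """Compose an EXEC sp_execute_external_script statement as text.

    Hypothetical helper for illustration; result_columns is inserted as-is,
    so it must come from trusted code, never from user input.
    """
    return "\n".join([
        "EXEC sp_execute_external_script",
        "    @language = " + tsql_nstring(language) + ",",
        "    @script = " + tsql_nstring(script) + ",",
        "    @input_data_1 = " + tsql_nstring(input_query),
        "WITH RESULT SETS ((" + result_columns + "));",
    ])


# Script body run by the Python runtime: keep only rows delayed over 15 minutes.
script = "OutputDataSet = InputDataSet[InputDataSet['delay'] > 15]"

statement = build_external_script_call(
    language="Python",
    script=script,
    input_query="SELECT carrier, delay FROM dbo.flights",  # hypothetical table
    result_columns="carrier varchar(10), delay int",
)
print(statement)
```

Doubling single quotes is what keeps the embedded script a valid T-SQL literal. Since the optional name parameters are omitted, the script uses the default InputDataSet and OutputDataSet names.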

Operationalized R
EXEC sp_execute_external_script
@language = N'R',
@script = N'[R code goes here]',
@input_data_1 = N'[SQL input]'
[ , @input_data_1_name = N'InputDataSet' ]
[ , @output_data_1_name = N'OutputDataSet' ]
[ , @params = N'parameter' ]
WITH RESULT SETS (([SQL output]));
If the output is a model or a plot, specify varbinary(max) in WITH RESULT SETS.

Output types
1. Dataset
• Standard resultset of rows and columns
• Data types will vary
2. Plot
• Static images
• Binary
3. Model
• Trained models such as linear regression, naïve bayes, etc.
• Binary
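
To make the "binary" output type concrete, here is a minimal Python sketch of turning a model into bytes and back, which is the form a script would return in a varbinary(max) column. The dictionary is a toy stand-in for a trained model, and pickle is used only as Python's generic serializer; it is an assumption of this example, not something the slides prescribe. Models meant for native scoring need the RevoScaleR/revoscalepy serializers instead.

```python
import pickle

# Toy stand-in for a trained model (hypothetical): y = intercept + slope * x.
model = {"intercept": 2.0, "slope": 0.5}


def predict(m: dict, x: float) -> float:
    """Score one value with the toy linear model."""
    return m["intercept"] + m["slope"] * x


# Serialize to bytes: this is the value a training script would hand back
# to SQL Server for storage in a varbinary(max) column.
model_bytes = pickle.dumps(model)

# A later scoring script receives the same bytes and rebuilds the model.
restored = pickle.loads(model_bytes)

print(type(model_bytes).__name__, len(model_bytes) > 0)
print(predict(restored, 10.0))  # 2.0 + 0.5 * 10.0 = 7.0
```

Only unpickle bytes that your own training code produced, since pickle can execute arbitrary code while loading.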

Real-time predictions using native scoring
• PREDICT function
  • Does not depend on the R or Python runtime
  • Available in SQL Server on both Windows and Linux
• Usage
  • Scoring a single row or a small number of rows
  • Highly concurrent scoring scenarios
  • Predict during INSERT, UPDATE, MERGE statements
• Requirements
  • Models built using RevoScaleR or revoscalepy
  • rxLinMod, rxLogit, rxBTrees, rxDTree, rxDForest
  • Serialized using rxSerializeModel (R) or rx_serialize_model (Python)
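
The shape of a native scoring query can be sketched as follows. The Python helper below only assembles the T-SQL text; the helper name, the dbo.models table with its name and model columns, the dbo.new_flights table and the ArrDelay_Pred output column are all assumptions made for illustration. Only the PREDICT function itself comes from the slide.

```python
def build_predict_query(model_name: str, data_table: str,
                        output_column: str) -> str:
    """Compose a T-SQL batch that scores rows with the native PREDICT function.

    Assumptions (not from the slides): serialized models live in
    dbo.models(name, model varbinary(max)) and the model returns a single
    float column. data_table and output_column are inserted as-is, so they
    must come from trusted code.
    """
    safe_name = model_name.replace("'", "''")  # escape quotes in the literal
    return "\n".join([
        "DECLARE @model varbinary(max) =",
        "    (SELECT model FROM dbo.models WHERE name = N'" + safe_name + "');",
        "SELECT d.*, p." + output_column,
        "FROM PREDICT(MODEL = @model, DATA = " + data_table + " AS d)",
        "WITH (" + output_column + " float) AS p;",
    ])


query = build_predict_query(
    model_name="arrival delay model",  # hypothetical model name
    data_table="dbo.new_flights",      # hypothetical table of rows to score
    output_column="ArrDelay_Pred",     # hypothetical prediction column
)
print(query)
```

The resulting batch runs entirely in T-SQL and does not call sp_execute_external_script, which is why no R or Python runtime is involved at scoring time.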

1,000,000 predictions per second
https://blogs.technet.microsoft.com/dataplatforminsider/2016/10/11/1000000-predictions-per-second/

Useful links
https://microsoft.github.io/sql-ml-tutorials/
https://blogs.msdn.microsoft.com/mlserver/
https://blogs.technet.microsoft.com/machinelearning/

Upcoming events
Online course: SQL Server Performance Tuning (August 2018)
http://www.triggerdb.com/event/curso-a-distancia-performance-y-tuning-sql-server/
Online course: Power BI (September 2018)
http://www.triggerdb.com/event/curso-powerbi-a-distancia/

Thank you very much
Triggerdb.com
info@triggerdb.com
