© Copyright SELA Software & Education Labs Ltd. | 14-18 Baruch Hirsch St Bnei Brak, 51202 Israel | www.selagroup.com
Eyal Ben Ivri
Building Big Data Solutions
on Azure
About me
Eyal Ben Ivri
Big Data & Cloud Architect, Sela Group
Focus On Hadoop Eco-System & Big-Data +
NoSQL Solutions
Modern Data – The Big Picture
IoT
User Data
Media Files
Documents
Machine Data
Log Files
The Light Rail Problem – TLV Railway
Imagine the new light rail maintenance company
IoT – Internet of Trains (and cameras, and cash
registers and carts and rails and more…)
Analyze data in stream and in batch
Dashboards
Alerts
The perfect problem
What We Need
An integrated data solution that will be:
Able to process events from external sources
Able to walk data through different pipelines
Fast and responsive
Big-Data Ready
In Other Words
Consume: BI, Dashboards, Applications
Process: ETL, Aggregations, Computation, Analysis, Querying
Persist: Hadoop, SQL, NoSQL
Ingest: IoT, Structured Data, Unstructured Data
Microsoft Azure Services for IoT and BigData
Devices
Device Connectivity: Event Hubs, Service Bus, External Data Sources
Storage: SQL Database, Table/Blob Storage, DocumentDB, Data Lake Store, External Data Sources
Analytics: Machine Learning, Stream Analytics, HDInsight, Data Factory, Data Lake Analytics
Presentation & Action: App Service, Power BI, Notification Hubs, Mobile Services, BizTalk Services
Event Hub
Messages at scale
Why not throw it into a queue, and have a
listener at the backend?
Scaling limits, due to the architecture of standard Service Bus
queues and topics
Event Hub uses a partition model
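To make the partition idea concrete, here is a minimal sketch in plain Python of how events might be spread over a fixed number of partitions by hashing a key, so that each consumer can read one partition independently. The function names `partition_for` and `route`, the device IDs, and the choice of SHA-256 are all assumptions for illustration; this is not the hashing scheme Event Hubs itself uses.

```python
import hashlib


def partition_for(partition_key: str, partition_count: int) -> int:
    """Map a key to a partition index with a stable hash (illustrative only)."""
    if partition_count < 1:
        raise ValueError("partition_count must be positive")
    digest = hashlib.sha256(partition_key.encode("utf-8")).digest()
    # Use the first 8 bytes of the digest as an integer, then wrap it
    # into the available partitions.
    return int.from_bytes(digest[:8], "big") % partition_count


def route(events, partition_count):
    """Distribute (device_id, payload) pairs into per-partition lists."""
    partitions = [[] for _ in range(partition_count)]
    for device_id, payload in events:
        index = partition_for(device_id, partition_count)
        partitions[index].append((device_id, payload))
    return partitions


if __name__ == "__main__":
    # Hypothetical device IDs and payloads.
    sample = [("train-17", "door open"), ("camera-02", "frame"), ("train-17", "door closed")]
    for index, bucket in enumerate(route(sample, 4)):
        print(index, bucket)
```

Because the same key always lands in the same partition, events from one device keep their relative order within that partition, while different partitions can be consumed in parallel.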
Getting Started
Easy to set up
Two Configurations
Partition Count – depends on the number of consumers (2-32)
Message Retention – between 1 and 7 days
Secured using SAS Policies
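As a rough sketch of what "secured using SAS policies" means in practice, the snippet below builds a shared access signature token of the kind sent in the Authorization header of a REST call to an Event Hub: the resource URI and an expiry time are signed with the policy key using HMAC-SHA256. The namespace, hub name, policy name and key are placeholders, and `generate_sas_token` (including its `now` parameter, added here only to make the output reproducible) is a hypothetical helper, not part of any SDK.

```python
import base64
import hashlib
import hmac
import time
import urllib.parse


def generate_sas_token(resource_uri, policy_name, policy_key, ttl_seconds=3600, now=None):
    """Build a SAS token string for the given resource and policy."""
    issued_at = time.time() if now is None else now
    expiry = int(issued_at + ttl_seconds)
    encoded_uri = urllib.parse.quote_plus(resource_uri)
    # The signed string is the URL-encoded resource URI, a newline, and the expiry.
    string_to_sign = "{}\n{}".format(encoded_uri, expiry).encode("utf-8")
    digest = hmac.new(policy_key.encode("utf-8"), string_to_sign, hashlib.sha256).digest()
    signature = urllib.parse.quote_plus(base64.b64encode(digest).decode("ascii"))
    return "SharedAccessSignature sr={}&sig={}&se={}&skn={}".format(
        encoded_uri, signature, expiry, policy_name
    )


if __name__ == "__main__":
    # Placeholder values; substitute a real namespace, hub, policy and key.
    print(generate_sas_token(
        "https://example-ns.servicebus.windows.net/example-hub",
        "send-policy",
        "not-a-real-key",
    ))
```

A policy that only grants send rights can be handed to devices, while a separate listen policy is kept for the backend consumers.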
IoT with Event Hubs
Device Connectivity & Management
Devices: RTOS, Linux, Windows, Android, iOS
Field Gateway: Protocol Adaptation
Cloud Gateway: Event Hubs
IoT & Data Processing Patterns
Device Connectivity & Management
Devices: RTOS, Linux, Windows, Android, iOS
Field Gateway: Protocol Adaptation
Cloud Gateway: Event Hubs & IoT Hub
Analytics & Operationalized Insights
Batch Analytics & Visualizations: Azure HDInsight, AzureML, Power BI, Azure Data Factory
Hot Path Analytics: Azure Stream Analytics, Azure HDInsight Storm
Hot Path Business Logic: Service Fabric & Actor Framework
TLV Railway
Can now ingest millions of messages each
second
These messages carry data from:
Devices
End-Machines
Servers
Next, we need to use this data to create real-
time alerts when something goes wrong
Azure Stream Analytics
Automatic recovery
Monitoring and alerting
Scale on demand
Managed Cloud Service
Each unit handles 1MB/s
Can scale up to 1GB/s
SQL-like language
temporal windowing semantics
support for reference data
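The throughput figures above lend themselves to a quick sizing estimate. The sketch below takes the 1 MB/s per unit and 1 GB/s ceiling quoted on the slide at face value and computes how many units a given message rate would need. The function name `units_needed`, the message rate and the average message size are assumptions for illustration, and 1 GB is taken as 1024 MB here.

```python
import math

MB_PER_UNIT_PER_SEC = 1.0  # per-unit throughput quoted on the slide
MAX_MB_PER_SEC = 1024.0    # "1GB/s" ceiling quoted on the slide, as 1024 MB


def units_needed(messages_per_sec: int, avg_message_bytes: int) -> int:
    """Estimate the number of units for a given load (rough sizing only)."""
    mb_per_sec = messages_per_sec * avg_message_bytes / (1024 * 1024)
    if mb_per_sec > MAX_MB_PER_SEC:
        raise ValueError("load exceeds the quoted 1 GB/s ceiling")
    # Always provision at least one unit, and round partial units up.
    return max(1, math.ceil(mb_per_sec / MB_PER_UNIT_PER_SEC))


if __name__ == "__main__":
    # Hypothetical load: one million 200-byte messages per second.
    print(units_needed(1_000_000, 200))
```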
Stream Analytics – Main Concepts
Inputs
Can be stream or reference data (metadata)
Stream Data sources can be Event Hub, Blob Storage
(using blobs with timestamps) or IoT Hub (preview)
Serialization types support CSV, JSON, and Avro
Query
A SQL query that selects from the input(s) and
writes results to the output(s)
Output
Can be Blob, SQL, Event Hub (notification), Power BI
(preview), Table storage, Service Bus or DocumentDB
Tumbling Windows
How many trains entered each station every 5 minutes?
SELECT StationId, COUNT(*) AS TrainCount FROM EntryStream
GROUP BY StationId, TumblingWindow(minute, 5)
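As a rough offline illustration of the question on this slide, the sketch below buckets events into fixed five-minute windows in plain Python and counts entries per station. The `(timestamp, station_id)` event shape, the function name `tumbling_counts` and the station names are assumptions for illustration. This is not how Stream Analytics is implemented, and the handling of events that fall exactly on a window boundary is simplified.

```python
from collections import Counter
from datetime import datetime, timedelta

EPOCH = datetime(1970, 1, 1)


def tumbling_counts(events, window=timedelta(minutes=5)):
    """Count events per (window_end, station_id) over fixed, non-overlapping windows."""
    counts = Counter()
    for timestamp, station_id in events:
        # Index of the window this event falls into, counted from the epoch.
        index = (timestamp - EPOCH) // window
        window_end = EPOCH + (index + 1) * window
        counts[(window_end, station_id)] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical entry events.
    sample = [
        (datetime(2016, 1, 1, 8, 1), "Station A"),
        (datetime(2016, 1, 1, 8, 4), "Station A"),
        (datetime(2016, 1, 1, 8, 2), "Station B"),
        (datetime(2016, 1, 1, 8, 6), "Station A"),
    ]
    for (window_end, station), total in sorted(tumbling_counts(sample).items()):
        print(window_end, station, total)
```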
Temporal Windows
Tumbling Window
A series of fixed-sized, non-overlapping and
contiguous time intervals
Hopping Window
Scheduled overlapping windows
Sliding Window
Outputs events only for those points in time when
the content of the window actually changes
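The difference between tumbling and hopping windows is easiest to see by asking which windows a single event belongs to. The sketch below, with the hypothetical helper `hopping_windows`, lists the start times of every window that contains a given timestamp, treating each window as the half-open interval from its start to start plus size. When the hop equals the size it degenerates to a tumbling window and each event belongs to exactly one window. Sliding windows are not covered, and boundary handling is a simplification rather than the service's exact semantics.

```python
from datetime import datetime, timedelta

EPOCH = datetime(1970, 1, 1)


def hopping_windows(timestamp, size, hop):
    """Return start times of all windows [start, start + size) containing timestamp."""
    # Latest window start at or before the timestamp, aligned to the hop.
    start = EPOCH + ((timestamp - EPOCH) // hop) * hop
    starts = []
    # Walk backwards one hop at a time while the window still covers the timestamp.
    while start + size > timestamp:
        starts.append(start)
        start -= hop
    return sorted(starts)


if __name__ == "__main__":
    event_time = datetime(2016, 1, 1, 8, 7)
    # Ten-minute windows starting every five minutes: the event is in two of them.
    print(hopping_windows(event_time, timedelta(minutes=10), timedelta(minutes=5)))
    # Hop equal to size: a tumbling window, so exactly one window.
    print(hopping_windows(event_time, timedelta(minutes=5), timedelta(minutes=5)))
```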
TLV Railway
Can now respond in near-real-time to events as
they happen
Track and maintain malfunctioning equipment
Receive real time data regarding customers
entering and leaving stations
Data can now be processed, so we need a place
to save it, preferably at scale.
DocumentDB and Azure Data
Services
fully managed, scalable, queryable, schema free JSON
document database service for modern applications
transactional processing
rich query
managed as a service
elastic scale
internet accessible http/rest
schema-free data model
arbitrary data formats
DocumentDB features
JSON Documents
SQL support
LINQ Support
REST API Support
JS Support (triggers, UDFs, stored procedures)
Automatic Index
Multiple Document Transactions
Tunable Consistency
DocumentDB Key Concept
Collection
A collection of Documents
Not a table (different entities can go into the same
collection)
Collections = Partitions
Not just logical containers, but physical ones
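To illustrate the point that a collection is not a table, the sketch below keeps documents of different shapes side by side in one list and filters them by property, in plain Python. The documents, the `type` discriminator field and the `query` helper are assumptions for illustration; this does not use the DocumentDB SDK or its SQL dialect.

```python
# One "collection" holding two different kinds of entity.
collection = [
    {"id": "1", "type": "sensorReading", "trainId": "T-17", "speedKmh": 42},
    {"id": "2", "type": "station", "name": "Example Station", "platforms": 2},
    {"id": "3", "type": "sensorReading", "trainId": "T-03", "speedKmh": 55},
]


def query(documents, **equals):
    """Return documents whose properties match all the given values."""
    return [
        doc for doc in documents
        if all(doc.get(key) == value for key, value in equals.items())
    ]


if __name__ == "__main__":
    # Documents without a matching property are simply skipped, no schema needed.
    print(query(collection, type="sensorReading"))
    print(query(collection, type="station"))
```

A common convention with schema-free stores is a discriminator property such as the `type` field used here, so that queries can pick out one kind of entity from a mixed collection.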
Demo
TLV Railway – Part 1
TLV Railway
Can now store its data in a highly scalable store
Great for interactive querying of any data
Messages from sensors
Reference Data
But this data (and other data) needs to move to
other places (SQL, Batch processing, ML). How?
What is Azure Data Factory?
Azure Data Factory is a managed service to produce
trusted information from data stored in the cloud
and on-premises. Easily create, orchestrate and
schedule highly available, fault-tolerant workflows
to move and transform your data at scale.
Evolving Approaches to Analytics
Traditional: Original Data -> Extract -> ETL Tool (SSIS, etc.) -> Transform -> Load Transformed Data -> EDW (SQL Server, Teradata, etc.) -> BI Tools
Modern: Original Data and Streaming Data -> Ingest -> Scale-out Storage & Compute (HDFS, Blob Storage, etc.), Data Lake(s) -> Transform & Load -> Data Marts -> Dashboards, Apps
Data Factory – Main concepts
Data Store
A data source/sink component
(SQL (Azure or on-premises), Storage, DocumentDB and
more)
Data Set
A defined data set that is contained inside a data store
One data store can have many data sets
Compute
A service for computation
HDInsight, Azure Batch, Data Lake Analytics, Azure ML
Data Factory – Main concepts
Pipeline
Set of instructions
“Take data from data set A and move to compute,
then store results in data set B”
Slices
Everything is time sliced
A data set (source) declares the time interval at
which its data is sliced, and the pipeline is
activated whenever a new slice is ready
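The slicing idea can be sketched in a few lines of plain Python: split a time range into fixed intervals and run the pipeline's activity once per completed slice. The names `slices` and `run_pipeline`, the hourly interval and the sample activity are assumptions for illustration; this mimics the concept only and does not use the Data Factory API or its JSON definitions.

```python
from datetime import datetime, timedelta


def slices(start, end, interval):
    """Yield (slice_start, slice_end) pairs for every complete slice in [start, end)."""
    current = start
    while current + interval <= end:
        yield current, current + interval
        current += interval


def run_pipeline(start, end, interval, activity):
    """Run the activity once per slice and collect the results in order."""
    return [activity(slice_start, slice_end) for slice_start, slice_end in slices(start, end, interval)]


if __name__ == "__main__":
    def copy_activity(slice_start, slice_end):
        # Stand-in for "take data from data set A, compute, store in data set B".
        return "processed {} to {}".format(slice_start.isoformat(), slice_end.isoformat())

    for line in run_pipeline(
        datetime(2016, 1, 1, 0), datetime(2016, 1, 1, 3), timedelta(hours=1), copy_activity
    ):
        print(line)
```

Because each run only sees its own slice, a failed slice can be re-run on its own without reprocessing the whole range.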
JSON
TLV Railway
Can now integrate different services and
different data sources
Move data with ease and as little hassle as
possible
What about aggregations and a deeper dive into the
data, for more complex analysis?
HDInsight
Hadoop-as-a-Service
Based on the Hortonworks distribution
A few flavors:
Hadoop (Windows + Linux)
Storm (Windows + Linux)
HBase (Windows + Linux)
Spark (Windows + Linux)
Hadoop vs. Relational DB
Compared on: Data size, Access, Updates, Structure, Integrity, Scaling
Demo
TLV Railway – Part 2
TLV Railway - Summary
Can now perform advanced analytics on top of
large amounts of data, in a variety of formats
(not just structured, boring data)
Can integrate all the loose ends of data coming
in with data generated in “old-school” data
platforms like SQL, collected from
Line-of-Business applications
We’ve covered data ingestion, responding in
real-time, querying, storing and processing
Azure Stack
Hadoop and OSS vs.
Azure IoT and BigData Ecosystem
Azure Ecosystem | OSS
Event Hubs | Kafka
Stream Analytics | Storm
HDInsight | Hadoop
MapReduce | MapReduce
Hive | Hive
Spark | Spark
HBase | HBase
Azure ML | Mahout
Data Factory | Pig
DocumentDB | MongoDB / Couchbase
Data Lake (preview) |
Is “TLV Railway” fake?
London did it first
Summary
Get started today at
http://azure.microsoft.com
Questions

 
Digital Transformation Playbook by Graham Ware
Digital Transformation Playbook by Graham WareDigital Transformation Playbook by Graham Ware
Digital Transformation Playbook by Graham WareGraham Ware
 
20240412-SmartCityIndex-2024-Full-Report.pdf
20240412-SmartCityIndex-2024-Full-Report.pdf20240412-SmartCityIndex-2024-Full-Report.pdf
20240412-SmartCityIndex-2024-Full-Report.pdfkhraisr
 
Jodhpur Park | Call Girls in Kolkata Phone No 8005736733 Elite Escort Service...
Jodhpur Park | Call Girls in Kolkata Phone No 8005736733 Elite Escort Service...Jodhpur Park | Call Girls in Kolkata Phone No 8005736733 Elite Escort Service...
Jodhpur Park | Call Girls in Kolkata Phone No 8005736733 Elite Escort Service...HyderabadDolls
 

Último (20)

Jual obat aborsi Bandung ( 085657271886 ) Cytote pil telat bulan penggugur ka...
Jual obat aborsi Bandung ( 085657271886 ) Cytote pil telat bulan penggugur ka...Jual obat aborsi Bandung ( 085657271886 ) Cytote pil telat bulan penggugur ka...
Jual obat aborsi Bandung ( 085657271886 ) Cytote pil telat bulan penggugur ka...
 
In Riyadh ((+919101817206)) Cytotec kit @ Abortion Pills Saudi Arabia
In Riyadh ((+919101817206)) Cytotec kit @ Abortion Pills Saudi ArabiaIn Riyadh ((+919101817206)) Cytotec kit @ Abortion Pills Saudi Arabia
In Riyadh ((+919101817206)) Cytotec kit @ Abortion Pills Saudi Arabia
 
Vadodara 💋 Call Girl 7737669865 Call Girls in Vadodara Escort service book now
Vadodara 💋 Call Girl 7737669865 Call Girls in Vadodara Escort service book nowVadodara 💋 Call Girl 7737669865 Call Girls in Vadodara Escort service book now
Vadodara 💋 Call Girl 7737669865 Call Girls in Vadodara Escort service book now
 
Top profile Call Girls In Bihar Sharif [ 7014168258 ] Call Me For Genuine Mod...
Top profile Call Girls In Bihar Sharif [ 7014168258 ] Call Me For Genuine Mod...Top profile Call Girls In Bihar Sharif [ 7014168258 ] Call Me For Genuine Mod...
Top profile Call Girls In Bihar Sharif [ 7014168258 ] Call Me For Genuine Mod...
 
Top profile Call Girls In bhavnagar [ 7014168258 ] Call Me For Genuine Models...
Top profile Call Girls In bhavnagar [ 7014168258 ] Call Me For Genuine Models...Top profile Call Girls In bhavnagar [ 7014168258 ] Call Me For Genuine Models...
Top profile Call Girls In bhavnagar [ 7014168258 ] Call Me For Genuine Models...
 
Ranking and Scoring Exercises for Research
Ranking and Scoring Exercises for ResearchRanking and Scoring Exercises for Research
Ranking and Scoring Exercises for Research
 
Sonagachi * best call girls in Kolkata | ₹,9500 Pay Cash 8005736733 Free Home...
Sonagachi * best call girls in Kolkata | ₹,9500 Pay Cash 8005736733 Free Home...Sonagachi * best call girls in Kolkata | ₹,9500 Pay Cash 8005736733 Free Home...
Sonagachi * best call girls in Kolkata | ₹,9500 Pay Cash 8005736733 Free Home...
 
Reconciling Conflicting Data Curation Actions: Transparency Through Argument...
Reconciling Conflicting Data Curation Actions:  Transparency Through Argument...Reconciling Conflicting Data Curation Actions:  Transparency Through Argument...
Reconciling Conflicting Data Curation Actions: Transparency Through Argument...
 
Top profile Call Girls In Indore [ 7014168258 ] Call Me For Genuine Models We...
Top profile Call Girls In Indore [ 7014168258 ] Call Me For Genuine Models We...Top profile Call Girls In Indore [ 7014168258 ] Call Me For Genuine Models We...
Top profile Call Girls In Indore [ 7014168258 ] Call Me For Genuine Models We...
 
Statistics notes ,it includes mean to index numbers
Statistics notes ,it includes mean to index numbersStatistics notes ,it includes mean to index numbers
Statistics notes ,it includes mean to index numbers
 
Gomti Nagar & best call girls in Lucknow | 9548273370 Independent Escorts & D...
Gomti Nagar & best call girls in Lucknow | 9548273370 Independent Escorts & D...Gomti Nagar & best call girls in Lucknow | 9548273370 Independent Escorts & D...
Gomti Nagar & best call girls in Lucknow | 9548273370 Independent Escorts & D...
 
Top profile Call Girls In Hapur [ 7014168258 ] Call Me For Genuine Models We ...
Top profile Call Girls In Hapur [ 7014168258 ] Call Me For Genuine Models We ...Top profile Call Girls In Hapur [ 7014168258 ] Call Me For Genuine Models We ...
Top profile Call Girls In Hapur [ 7014168258 ] Call Me For Genuine Models We ...
 
Computer science Sql cheat sheet.pdf.pdf
Computer science Sql cheat sheet.pdf.pdfComputer science Sql cheat sheet.pdf.pdf
Computer science Sql cheat sheet.pdf.pdf
 
怎样办理圣地亚哥州立大学毕业证(SDSU毕业证书)成绩单学校原版复制
怎样办理圣地亚哥州立大学毕业证(SDSU毕业证书)成绩单学校原版复制怎样办理圣地亚哥州立大学毕业证(SDSU毕业证书)成绩单学校原版复制
怎样办理圣地亚哥州立大学毕业证(SDSU毕业证书)成绩单学校原版复制
 
RESEARCH-FINAL-DEFENSE-PPT-TEMPLATE.pptx
RESEARCH-FINAL-DEFENSE-PPT-TEMPLATE.pptxRESEARCH-FINAL-DEFENSE-PPT-TEMPLATE.pptx
RESEARCH-FINAL-DEFENSE-PPT-TEMPLATE.pptx
 
SAC 25 Final National, Regional & Local Angel Group Investing Insights 2024 0...
SAC 25 Final National, Regional & Local Angel Group Investing Insights 2024 0...SAC 25 Final National, Regional & Local Angel Group Investing Insights 2024 0...
SAC 25 Final National, Regional & Local Angel Group Investing Insights 2024 0...
 
Top profile Call Girls In Chandrapur [ 7014168258 ] Call Me For Genuine Model...
Top profile Call Girls In Chandrapur [ 7014168258 ] Call Me For Genuine Model...Top profile Call Girls In Chandrapur [ 7014168258 ] Call Me For Genuine Model...
Top profile Call Girls In Chandrapur [ 7014168258 ] Call Me For Genuine Model...
 
Digital Transformation Playbook by Graham Ware
Digital Transformation Playbook by Graham WareDigital Transformation Playbook by Graham Ware
Digital Transformation Playbook by Graham Ware
 
20240412-SmartCityIndex-2024-Full-Report.pdf
20240412-SmartCityIndex-2024-Full-Report.pdf20240412-SmartCityIndex-2024-Full-Report.pdf
20240412-SmartCityIndex-2024-Full-Report.pdf
 
Jodhpur Park | Call Girls in Kolkata Phone No 8005736733 Elite Escort Service...
Jodhpur Park | Call Girls in Kolkata Phone No 8005736733 Elite Escort Service...Jodhpur Park | Call Girls in Kolkata Phone No 8005736733 Elite Escort Service...
Jodhpur Park | Call Girls in Kolkata Phone No 8005736733 Elite Escort Service...
 

Building Big Data Solutions on Azure

  • 1. © Copyright SELA Software & Education Labs Ltd. | 14-18 Baruch Hirsch St Bnei Brak, 51202 Israel | www.selagroup.com Eyal Ben Ivri Building Big Data Solutions on Azure
  • 2. About me Eyal Ben Ivri Big Data & Cloud Architect, Sela Group Focus On Hadoop Eco-System & Big-Data + NoSQL Solutions
  • 3. Modern Data – The Big Picture IoT User Data Media Files Documents Machine Data Log Files
  • 4.
  • 5. The Light Rail problem – TLV Railway Imagine the new light Rail maintenance company IoT – Internet of Trains (and cameras, and cash registers and carts and rails and more…) Analyze data in stream and in batch Dashboards Alerts The perfect problem
  • 6. What We Need An integrated data solution that will be: Able to process events from external sources Able to walk data through different pipelines Fast and responsive Big-Data Ready
  • 7. In Other Words Consume BI Dashboards Applications Process ETL Aggregations Computation Analysis Querying Persist Hadoop SQL NoSQL Ingest IoT Structured Data Un-Structured Data
  • 8. Microsoft Azure Services for IoT and BigData. Devices → Device Connectivity (Event Hubs, Service Bus, External Data Sources) → Storage (SQL Database, Table/Blob Storage, DocumentDB, Data Lake Store, External Data Sources) → Analytics (Machine Learning, Stream Analytics, HDInsight, Data Factory, Data Lake Analytics) → Presentation & Action (App Service, Power BI, Notification Hubs, Mobile Services, BizTalk Services)
  • 10. Event Hub Messages at scale Why not throw it into a queue, and have a listener at the backend? Scaling limits, because of the architecture of queues and topics of a standard Service Bus Event Hub uses a partition model
  • 11. Getting Started Easy to set up Two Configurations Partition Count – Depends on the number of consumers (2-32) Message Retention (days) – between 1 and 7 days Secured using SAS Policies
  • 12. Field Gateway Device Connectivity & Management IoT with Event Hubs Devices RTOS,Linux,Windows,Android,iOS Cloud Gateway Event Hubs Field Gateway Protocol Adaptation
  • 13. Field Gateway Device Connectivity & Management Analytics & Operationalized Insights IoT & Data Processing Patterns Devices RTOS,Linux,Windows,Android,iOS Protocol Adaptation Batch Analytics & Visualizations Azure HDInsight, AzureML, Power BI, Azure Data Factory Hot Path Analytics Azure Stream Analytics, Azure HDInsight Storm Hot Path Business Logic Service Fabric & Actor Framework Cloud Gateway Event Hubs & IoT Hub Field Gateway Protocol Adaptation
  • 14. TLV Railway Can now ingest millions of messages each second These messages carry data from: Devices End-Machines Servers Next, we need to use this data to create real-time alerts when something goes wrong
  • 15. Azure Stream Analytics Automatic recovery Monitoring and alerting Scale on demand Managed Cloud Service Each unit handles 1MB/s Can scale up to 1GB/s SQL like language temporal windowing semantics support for reference data
  • 16. Stream Analytics – Main Concepts Inputs Can be stream or reference data (metadata) Stream data sources can be Event Hub, Blob Storage (using blobs with timestamps) or IoT Hub (preview) Supported serialization types are CSV, JSON, and Avro Query A SQL query that selects from input(s) and writes results to output(s) Output Can be Blob, SQL, Event Hub (notification), Power BI (preview), Table storage, Service Bus or DocumentDB
  • 17. Tumbling Windows How many trains entered each station every 5 minutes? SELECT TrainId, COUNT(*) FROM EntryStream GROUP BY TrainId, TumblingWindow(minute,5)
  • 18. Temporal Windows Tumbling Window A series of fixed-sized, non-overlapping and contiguous time intervals Hopping Window Scheduled overlapping windows Sliding Window Outputs events only for those points in time when the content of the window actually changes
  • 19. TLV Railway Can now respond in near-real-time to events as they happen Track and maintain malfunctioning equipment Receive real time data regarding customers entering and leaving stations Data can now be processed, so we need a place to save it, preferably at scale.
  • 20. DocumentDB and Azure Data Services fully managed, scalable, queryable, schema free JSON document database service for modern applications transactional processing rich query managed as a service elastic scale internet accessible http/rest schema-free data model arbitrary data formats
  • 21. DocumentDB features JSON Documents SQL support Linq Support REST API Support JS Support (triggers, UDFs, stored procedures) Automatic Index Multiple Document Transactions Tunable Consistency
  • 22. DocumentDB Key Concept Collection A collection of Documents Not a table (different entities can go into the same collection) Collections = Partitions Not just logical containers, but physical ones
  • 24. TLV Railway Can now store its data in a highly scalable store Great for interactive querying of any data Messages from sensors Reference Data But this data (and other data) needs to move to other places (SQL, Batch processing, ML). How?
  • 25. What is Azure Data Factory? Azure Data Factory is a managed service to produce trusted information from data stored in the cloud and on-premises. Easily create, orchestrate and schedule highly available, fault-tolerant workflows to move and transform your data at scale.
  • 26. Evolving Approaches to Analytics ETL Tool (SSIS, etc) EDW (SQL Svr, Teradata, etc) Extract Original Data Load Transformed Data Transform BI Tools Ingest Original Data Scale-out Storage & Compute (HDFS, Blob Storage, etc) Transform & Load Data Marts Data Lake(s) Dashboards Apps Streaming data
  • 27. Data Factory – Main concepts Data Store A data source/sink component (SQL (Azure or on-premises), Storage, DocumentDB and more) Data Set A defined data set that is contained inside a data store One data store can have many data sets Compute A service for computation HDInsight, Azure Batch, Data Lake Analytics, Azure ML
  • 28. Data Factory – Main concepts Pipeline Set of instructions “Take data from data set A and move to compute, then store results in data set B” Slices Everything is time sliced A data set (source) can declare on what time intervals the data can be sliced, and the pipeline will be activated when a new slice is ready JSON
  • 29.
  • 30. Microsoft Azure Services for IoT and BigData. Devices → Device Connectivity (Event Hubs, Service Bus, External Data Sources) → Storage (SQL Database, Table/Blob Storage, DocumentDB, Data Lake Store, External Data Sources) → Analytics (Machine Learning, Stream Analytics, HDInsight, Data Factory, Data Lake Analytics) → Presentation & Action (App Service, Power BI, Notification Hubs, Mobile Services, BizTalk Services)
  • 32. TLV Railway Can now integrate different services and different data sources Move data with ease and as little hassle as possible What about aggregations, deeper dive into data, for more complex analysis?
  • 33.
  • 34. HDInsight Hadoop-as-a-Service Based on the Hortonworks distribution Few flavors: Hadoop (Windows + Linux) Storm (Windows + Linux) HBase (Windows + Linux) Spark (Windows + Linux)
  • 37. TLV Railway - Summary Can now perform advanced analytics on top of large amounts of data, in a variety of formats (not just structured, boring data) Can integrate all the loose ends of data coming in, with data generated in “Old-School” data platforms like SQL that is collected from Line-of-Business applications We’ve covered data ingestion, responding in real-time, querying, storing and processing Azure Stack
  • 38. Hadoop and OSS vs. Azure IoT and BigData Ecosystem. Azure Ecosystem → OSS: Event Hubs → Kafka; Stream Analytics → Storm; HDInsight → Hadoop; Map Reduce → Map Reduce; Hive → Hive; Spark → Spark; HBase → HBase; Azure ML → Mahout; Data Factory → Pig; DocumentDB → MongoDB / Couchbase
  • 41. London did it first
  • 42. Summary Get started today at http://azure.microsoft.com
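
The tumbling-window idea from slides 17 and 18 (fixed-size, non-overlapping, contiguous intervals) can be illustrated outside Stream Analytics with a small Python sketch. It is only an approximation of the concept, not the service's implementation: the function name, the `(key, timestamp)` event shape, and the choice to align windows to the Unix epoch are assumptions made for this example, and boundary handling may differ from the service's.

```python
from collections import Counter
from datetime import datetime, timedelta

# Hypothetical helper: counts events per key in fixed 5-minute buckets.
def tumbling_window_counts(events, size=timedelta(minutes=5)):
    """Count (key, timestamp) events per key and per tumbling window.

    Windows are fixed-size, non-overlapping and contiguous. In this sketch
    they are aligned to the Unix epoch and include their start instant.
    """
    epoch = datetime(1970, 1, 1)
    counts = Counter()
    for key, ts in events:
        # Floor the timestamp to the start of the window it falls in.
        window_index = (ts - epoch) // size
        window_start = epoch + window_index * size
        counts[(key, window_start)] += 1
    return counts


# Usage: two trains reporting entry events (made-up sample data).
sample = [
    ("T1", datetime(2016, 1, 1, 8, 1)),
    ("T1", datetime(2016, 1, 1, 8, 4)),
    ("T1", datetime(2016, 1, 1, 8, 6)),
    ("T2", datetime(2016, 1, 1, 8, 2)),
]
result = tumbling_window_counts(sample)
```

Each event lands in exactly one window, which is what distinguishes a tumbling window from the hopping and sliding variants on slide 18, where one event can contribute to several overlapping windows.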

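Slide 10 contrasts a single queue with Event Hub's partition model. A rough way to picture that model is a set of independent append-only logs, with a partition key deciding which log an event goes to. The sketch below is a toy in-memory model under that assumption; the class name, the SHA-256 based key hashing, and the offset-based `read` are all hypothetical and do not reflect how the Event Hubs service or its SDK actually works.

```python
import hashlib

class PartitionedLog:
    """Toy in-memory model of a partitioned event log (not the Event Hubs API)."""

    def __init__(self, partition_count):
        # One independent, ordered list per partition.
        self.partitions = [[] for _ in range(partition_count)]

    def partition_for(self, key):
        # Assumed hashing scheme: a stable hash of the key, modulo partition count.
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big") % len(self.partitions)

    def send(self, key, body):
        # Events with the same key always land in the same partition,
        # so their relative order is preserved within that partition.
        index = self.partition_for(key)
        self.partitions[index].append(body)
        return index

    def read(self, index, offset=0):
        # Each consumer reads one partition from its own offset,
        # independently of consumers on other partitions.
        return self.partitions[index][offset:]


# Usage: two events from the same (made-up) device key.
log = PartitionedLog(partition_count=4)
first = log.send("train-7", "door opened")
second = log.send("train-7", "door closed")
```

Because readers work on separate partitions, adding partitions lets more consumers read in parallel, which is the scaling property the slide attributes to the partition model.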
Editor's notes

  1. Key goal of slide: IoT, as you know, is a hot area these days, and there are a number of players that claim to be active in this space. They tend to focus on specific elements you see in this diagram. Microsoft has the most comprehensive portfolio of cloud services that customers need to develop and deploy end-to-end IoT solutions. Customers are adopting these services and are successfully deploying their solutions today (reference Rockwell, ThyssenKrupp). Talk track [Short Version for Sam’s Leadership Session]: As we think about Azure IoT services, Microsoft has the most comprehensive portfolio of cloud services that customers need to develop and deploy end-to-end IoT solutions, ranging from devices that produce data, to connecting them to cloud storage, to driving analytics that yield valuable business insights and allow enterprises to take action. Talk track [Long Version Chris’ Breakout Session]: As we think about Azure IoT services, there is a collection of capabilities involved. First there are Producers. These can be basic sensors, small form factor devices, traditional computer systems, or even complex assets made up of a number of data sources. Next we have the Connect Devices capabilities on the ingress level within and around Azure. The primary destination is Service Bus & Event Hubs, but this relies on client agent technology either at the edge device level or within a field or cloud gateway. We also have capabilities for other external data sources to provide data. As data is ingressed to Azure, there are various Storage options, and a number of destinations can be engaged. Traditional database technology, table or blob, or even more complex destinations like DocumentDB are possible. External or third-party technologies can also be used. This is where the flexibility and agility of a platform shows its strength. This is where analysts like Gartner are forming opinions about just how robust our platform can be.
As this data is processed in Azure, there are a number of capabilities that can be utilized. Machine Learning, HDInsight and Stream Analytics are examples of tools that can analyze the data in various ways. Finally, the concept of Take Actions uses Azure services. Data may populate a LOB portal, be pushed to apps, or be presented in analytics and productivity tools. These are all ways that the data gets out of these architecture points to allow organizations to use analysis to change / transform their business. Through all of these areas, there is the possibility of utilizing existing investments either within your Azure environment or elsewhere.