SlideShare una empresa de Scribd logo
1 de 58
Descargar para leer sin conexión
Microsoft cloud big data strategy
About Me
 Microsoft, Big Data Evangelist
 In IT for 30 years, worked on many BI and DW projects
 Worked as desktop/web/database developer, DBA, BI and DW architect and developer, MDM
architect, PDW/APS developer
 Been perm employee, contractor, consultant, business owner
 Presenter at PASS Business Analytics Conference, PASS Summit, Enterprise Data World conference
 Certifications: MCSE: Data Platform, Business Intelligence; MS: Architecting Microsoft Azure
Solutions, Design and Implement Big Data Analytics Solutions, Design and Implement Cloud Data
Platform Solutions
 Blog at JamesSerra.com
 Former SQL Server MVP
 Author of book “Reporting with Microsoft SQL Server 2012”
Agenda
 Big data defined
 Microsoft big data solution
 Azure data lake
Microsoft cloud big data strategy
Big Data is changing
traditional data
warehousing
… data warehousing has reached the
most significant tipping point since
its inception. The biggest, possibly
most elaborate data management
system in IT is changing.
– Gartner, “The State of Data Warehousing”*
* Donald Feinberg, Mark Beyer, Merv Adrian, Roxane Edjlali (Gartner), The State of Data Warehousing in 2012 (Stamford, CT.: Gartner, 2012)
Data sources
ETL
Data warehouse
BI and analytics
Big Data has new data characteristics
Data complexity: variety and velocity
Petabytes
Big Data is driving transformative changes
Traditional Big Data
Relational data
with highly modeled schema
All data
with schema agility
Specialized HW Commodity HW
Data
characteristics
Costs
Culture
Operational reporting
Focus on rear-view analysis
Experimentation leading
to intelligent action
With machine learning, graph, a/b testing
Big Data introduces new culture of experimentation
Understand customer patterns to
uncover cross-sell opportunities
Historical campaign
effectiveness
Generate year-end financial
reports
Financial monitoring with real-time
recommendations to increase revenue
Generate year-end financial
reports
Real-time product offers and
promotions based on behavior
Collect historical data on
equipment performance
Real-time monitoring to
identify proactive maintenance
Shipping features without
understanding success
Building successful features
correlating user action with
product experience
Action
Value
From data to decisions and actions
However, there are challenges to Big Data…
Obtaining skills
and capabilities
Determining how
to get value
Integrating with
existing IT investments
*Gartner: Survey Analysis – Hadoop Adoption Drivers and Challenges (Stamford, CT.: Gartner, 2015)
But, Microsoft has done it before
We needed to better leverage data and analytics to do
more experimentation
So we:
• Designed a data lake for everyone to put their data into
• Built tools approachable by any developer
• Created machine learning tools for collaborating
across large experiment models
Result:
• Across Microsoft, ten thousand developers doing
experimentation leading to better insights
• Leading to growth in our Microsoft businesses:
• Office productivity revenue (45%YoY)*
• Intelligent Cloud (100% YoY)*
• Bing search share doubles
2010 2011 2012 2013 2014 2015
Growth of data @ Microsoft
Windows
SMSG
Live
Bing
CRM/Dynamics
Xbox Live
Office365
Malware Protection Microsoft Stores
Commerce Risk
Skype
LCA
Exchange
Yammer
PetabytesExabytes
* Microsoft. FY16 Q4 Results, URL: http://www.microsoft.com/en-us/Investor/earnings/FY-2016-Q4/press-release-webcast
Microsoft is now taking
everything we’ve
learned on this journey
and bringing it to our
customers
Technology. Cost. Culture.
Microsoft cloud big data strategy
Big Data as a cornerstone of Cortana Intelligence
Action
People
Automated
Systems
Apps
Web
Mobile
Bots
Intelligence
Dashboards &
Visualizations
Cortana
Bot
Framework
Cognitive
Services
Power BI
Information
Management
Event Hubs
Data Catalog
Data Factory
Machine Learning
and Analytics
HDInsight
(Hadoop and
Spark)
Stream Analytics
Intelligence
Data Lake
Analytics
Machine
Learning
Big Data Stores
Data Lake Store
Data
Sources
Apps
Sensors
and
devices
Data
SQL Data
Warehouse
CONTROL EASE OF USE
Azure Data Lake
Analytics
Azure Data Lake Store
Azure Storage
Any Hadoop technology
Workload optimized,
managed clusters
Specific apps in a multi-
tenant form factor
Azure Marketplace
HDP | CDH | MapR
Azure Data Lake
Analytics
IaaS Hadoop Managed Hadoop Big Data as-a-service
Azure HDInsight
BIGDATA
STORAGE
BIGDATA
ANALYTICS
Bringing Big Data to everybody
Accelerate the pace of innovation through a state-of-the-art cloud platform
UserAdoption
Microsoft Big Data Portfolio
SQL Server Stretch
Business intelligence
Machine learning analytics
Insights
Azure SQL Database
SQL Server 2016
SQL Server 2016 Fast Track
Azure SQL DW
Azure Data Lake
DocumentDB
HDInsight
Hadoop
Analytics Platform System
Sequential Scale Out + AcrossScale Up
Key
Relational Non-relational
On-premisesCloud
Microsoft has solutions covering
and connecting all four
quadrants – that’s why SQL
Server is one of the most utilized
databases in the world
16
Azure
HDInsight
A Cloud Spark and
Hadoop service for the
Enterprise
Reliable with an industry leading SLA
Enterprise-grade security and monitoring
Productive platform for developers and
scientists
Cost effective cloud scale
Integration with leading ISV applications
Easy for administrators to manage
63% lower TCO than deploy your own
Hadoop on-premises*
*IDC study “The Business Value and TCO Advantage of Apache Hadoop in the Cloud with Microsoft Azure HDInsight”
Hortonworks Data Platform (HDP) 2.5
Simply put, Hortonworks ties all the open source products together (22)
(under the covers of HDInsight)
Azure
Data Lake Store
A No limits Data Lake that
powers Big Data Analytics
Petabyte size files and Trillions of objects
Scalable throughput for massively parallel
analytics
HDFS for the cloud
Always encrypted, role-based security &
auditing
Enterprise-grade support
Azure
Data Lake Analytics
A No limits Analytics Job
Service to power intelligent
action
Start in seconds, scale instantly, pay per job
Develop massively parallel programs with
simplicity
Debug and optimize your big data programs
with ease
Virtualize your analytics
Enterprise-grade security, auditing and
support
Azure Data Lake
YARN
U-SQL
Analytics HDInsight
Hive R Server
HDFS
Store
Store and analyze data of any kind and size
Develop faster, debug and optimize smarter
Interactively explore patterns in your data
No learning curve
Managed and supported
Dynamically scales to match your business
priorities
Enterprise-grade security
Built on YARN, designed for the cloud
Azure SQL Data Warehouse
A relational data warehouse-as-a-service, fully managed by Microsoft.
Industries first elastic cloud data warehouse with enterprise-grade capabilities.
Integrated with on-premises and cloud assets.
Simple compute & storage billing
Pay for what you need
High performance without rewriting
applications
Low cost for latent data
Infrastructure, management and
support provided
Scales to petabytes of data with MPP processing
Resize compute nodes < 1 minute
Faster time to insight than other SMP offering
Designed for “on-demand” workload
Integrated with Azure platform and
other Microsoft services
Enables hybrid solutions
Built on SQL Server experience &
technology
PolyBase
Query relational and non-relational data with T-SQL
By preview early this year PolyBase will support Teradata, Oracle,
SQL Server, MongoDB, Hadoop and Azure blob storage
Publish-subscribe data
distribution
Managed PaaS (Platform
as a Service) solution
Scales with your needs to
millions of events per
second
Provides a durable buffer
between event publishers
and event consumers
Azure Event Hubs
Azure Stream Analytics
Process real-time data in Azure
Consumes millions of real-time events from Event Hub collected from devices, sensors, infrastructure,
and applications
Performs time-sensitive analysis using SQL-like language against multiple real-time streams and
reference data
Outputs to persistent stores, dashboards or back to devices
Point of
Service Devices
Self Checkout
Stations
Kiosks
Smart
Phones
Slates/
Tablets
PCs/
Laptops
Servers
Digital
Signs
Diagnostic
EquipmentRemote Medical
Monitors
Logic
Controllers
Specialized
DevicesThin
Clients
Handhelds
Security
POS
Terminals
Automation
Devices
Vending
Machines
Kinect
ATM
Azure Machine Learning
Get started with just a browser
Requires no provisioning; simply log
on to your Azure subscription or try
it for free off azure.com/ml
Experience the power of choice
Choose from hundreds of algorithms
and packages from R and Python or
drop in your own custom code
Take advantage of business-tested
algorithms from Xbox and Bing
Deploy solutions in minutes
With the click of a button, deploy
the finished model as a web service
that can connect to any data,
anywhere
Connect to the world
Brand and monetize solutions on
our global Machine Learning
Marketplace
https://datamarket.azure.com/
Beyond business intelligence – machine intelligence
Microsoft Azure
Machine Learning Studio
Modeling environment (shown)
Microsoft Azure
Machine Learning API service
Model in production as a web service
Microsoft Azure
Machine Learning Marketplace
APIs and solutions for broad use
Enable enterprise-wide self-service data source registration and discovery
A metadata repository that allow users to register, enrich,
understand, discover, and consume data sources
Delivers differentiated value though
‒ Data source discovery; rather than data discovery
‒ Support for data from any source; Structured and
unstructured, on premises and in the cloud
‒ Publishing, discovery and consumption through any tool
‒ Annotation crowdsourcing: empowering any user to
capture and share their knowledge.
This, while allowing IT to maintain control and oversight
Azure Data Factory
Connect to relational or non-
relational data that is on-
premises or in the cloud
Orchestrate data movement &
data processing
Publish to Power BI users as a
searchable data view
Operationalize (schedule,
manage, debug) workflows
Lifecycle management,
monitoring
Orchestrate trusted information production in Azure
Microsoft Confidential – Under Strict NDA
C#
MapReduce
Hive
Pig
Stored Procedures
Azure Machine Learning
Discovery & exploration –
Custom visualizations—
R integration -
Microsoft cloud big data strategy
146.03K145.84K145.96K146.06K 40.08K38.84K39.99K40.33K
Microsoft cloud big data strategy
www.botframework.com
Microsoft
Cognitive
Services
Give your apps
a human side
Cognitive Services API Collection
Azure Analysis Services
Azure Analysis Services is based on the proven analytics engine that has helped
organizations turn complex data into a trusted, single source of truth for years.
Built for
hybrid data
Access and model
data on-premises,
in the cloud, or both
Interactive
visualization
Quick, highly interactive
self-service data discovery
with support of major
data visualization tools
Proven
technology
Powerful, proven tabular
models built from SQL Server
2016 Analysis Services
Cloud
powered
Easy to deploy, scale, and
manage as a platform-as-
a-service solution
SQL Server
R Services
Linux
Hadoop Teradata
Windows
CommercialCommunity
R ServerR Open
Fully managed database service
built on a native JSON data model
Application controlled schema with
massive scale-out enables iterative
development and evolving data models
Automatic indexing enables robust
querying over schema-free data
Integrated transactional JavaScript
processing + tunable consistency enable
high performance application
experiences
Azure DocumentDB
SQL Server on Linux
(Preview today, GA in
mid-2017)
Red Hat - Microsoft
Partnership
(Nov 2015)
Microsoft joins Eclipse
Foundation (Mar 2016).
HD Insight PaaS on
Linux GA (Sep 2015)
C:Usersmarkhill>
root@localhost: #
bash
Azure Marketplace 60% of all images in
Azure Marketplace
are based on
Linux/OSS
In partnership with the Linux
Foundation, Microsoft releases the
Microsoft Certified Solutions Associate
(MCSA) Linux on Azure certification.
493,141,677 ?????? Microsoft Open Source Hub
Ross Gardler: President Apache Software
Foundation
Wim Coekaerts: Oracle’s Mr Linux
1 out of 4 VMs on Azure runs
Linux, and getting larger every
day
• 28.9% of All VMs are Linux
• >50% of new VMs
Microsoft cloud big data strategy
Azure Data Lake
Big Data made easy
Analytics on any data,
any size
Easier and more
productive for all users Enterprise-ready
Azure Data Lake
Big Data made easy
Analytics on any data,
any size
Easier and more
productive for all users Enterprise-ready
Petabyte size files and
Trillions of objects • Store data in it’s native format
• PB sized files, 200x larger than
anyone else
• Scalable throughput for
massively parallel analytics
• No need to redesign
application or reparation data
at higher scale
TBs
EBs
Store
Any type
of analytics
• Batch, interactive, streaming,
machine learning
• Allows for exploratory analytics
over data
• Analyze with Hadoop and
Microsoft solutions
Cortana Intelligence Suite
YARN
U-SQL
Analytics HDInsight
HDFS
Store
Hive R Server
Start in seconds, Scale
instantly, Pay per job
with Analytics
• Process big data jobs in 30
seconds
• No infrastructure to worry
about (no servers, no VMs, no
clusters)
• Instantly scale analytic units up
or down (processing power)
• Architected for cloud scale and
performance
• Frees you up to focus only on
your business logic
Azure Data Lake
Big Data made easy
Analytics on any data,
any size
Easier and more
productive for all users Enterprise-ready
Easy for administrators
to spin up quickly
• Deploy big data projects
in minutes
• No hardware to install,
tune, configure or deploy
• No infrastructure or
software to manage
• Scale to tens to thousands
of machines instantly
Debug and Optimize
your Big Data
programs with ease
• Deep integration with
Visual Studio, Visual Studio
Code, Eclipse, & IntelliJ
• Easy for novices to write
simple queries
• Integrated with U-SQL,
Hive, Storm, and Spark
• Actively offers recommendations
to improve performance and
reduce cost
• Playback visually displays job run
Develop massively
parallel programs with
simplicity
• U-SQL: a simple
and powerful language that’s
familiar and easily extensible
• Unifies the declarative
nature of SQL with expressive
power of C#
• Leverage existing libraries in
.NET languages, R and Python
• Massively parallelize code on
diverse workloads (ETL, ML,
image tagging, facial detection)
Query data where it lives
Easily query data in multiple Azure data stores without moving it to a single store
Benefits
• Avoid moving large amounts of data across the
network between stores (federated query/logical data
warehouse)
• Single view of data irrespective of physical location
• Minimize data proliferation issues caused by
maintaining multiple copies
• Single query language for all data
• Each data store maintains its own sovereignty
• Design choices based on the need
• Push SQL expressions to remote SQL sources
• Filters
• Joins
U-SQL
Query
Query
Azure
Storage Blobs
Azure SQL
in VMs
Azure
SQL DB
Azure Data
Lake Analytics
Azure
SQL Data Warehouse
Azure
Data Lake Storage
Easy for data scientists
with familiar R language
R Server for HDInsight
• Largest portable R parallel
analytics library
• Terabyte-scale machine
learning—1,000x larger than
in open source R
• Up to 100x faster performance
using Spark and optimized
vector/math libraries
• Enterprise-grade security
and support
*Applies to HDInsight only
Azure Data Lake
Big Data made easy
Analytics on any data,
any size
Easier and more
productive for all users Enterprise-ready
Highest availability
guarantee in the industry
for peace of mind
• Managed, monitored and
supported by Microsoft
• Enterprise-leading SLA—
99.9% uptime
• No IT resources needed for
upgrades and patching
• Microsoft monitors your
deployment so you don’t
have to
99.9% SLA
Azure Regions
38 Regions Worldwide, 32 Generally Available
 100+ datacenters
 Top 3 networks in the world
 2.5x AWS, 7x Google DC Regions
 G Series – Largest VM in World, 32 cores, 448GB Ram, SSD…
Always encrypted,
Role-based security
& Auditing
• Always encrypted; in motion
using SSL, and at rest using keys
in Azure Key Vault
• Single sign-on, multi-factor
authentication and seamless
integration of on-premises
identities with Active Directory
• Fine-grained POSIX-based ACLs
for role-based access controls
• Auditing every access /
configuration change
Lower total cost
of ownership
• No hardware
• Hadoop support included with
Azure support
• Pay only for what you use
• Independently scale storage
and compute
• No need to hire specialized
operations team
• 63% lower total cost of
ownership than on-premises*
*IDC study “The Business Value and TCO Advantage of Apache Hadoop in the Cloud
with Microsoft Azure HDInsight”
Microsoft cloud big data strategy
Recognized by
top analysts
Forrester Wave for Big Data
Hadoop Cloud
• Named industry leader by
Forrester with the most
comprehensive, scalable, and
integrated platforms*
• Recognized for its cloud-first
strategy that is paying off*
*The Forrester WaveTM: Big Data Hadoop Cloud Solutions, Q2 2016.
Q & A ?
James Serra, Big Data Evangelist
Email me at: JamesSerra3@gmail.com
Follow me at: @JamesSerra
Link to me at: www.linkedin.com/in/JamesSerra
Visit my blog at: JamesSerra.com (where this slide deck is posted under the “Presentations” tab)

Más contenido relacionado

La actualidad más candente

Time to Talk about Data Mesh
Time to Talk about Data MeshTime to Talk about Data Mesh
Time to Talk about Data MeshLibbySchulze
 
Building the Data Lake with Azure Data Factory and Data Lake Analytics
Building the Data Lake with Azure Data Factory and Data Lake AnalyticsBuilding the Data Lake with Azure Data Factory and Data Lake Analytics
Building the Data Lake with Azure Data Factory and Data Lake AnalyticsKhalid Salama
 
ExpertsLive NL 2022 - Microsoft Purview - What's in it for my organization?
ExpertsLive NL 2022 - Microsoft Purview - What's in it for my organization?ExpertsLive NL 2022 - Microsoft Purview - What's in it for my organization?
ExpertsLive NL 2022 - Microsoft Purview - What's in it for my organization?Albert Hoitingh
 
Modernizing to a Cloud Data Architecture
Modernizing to a Cloud Data ArchitectureModernizing to a Cloud Data Architecture
Modernizing to a Cloud Data ArchitectureDatabricks
 
Data Mesh for Dinner
Data Mesh for DinnerData Mesh for Dinner
Data Mesh for DinnerKent Graziano
 
Standing on the Shoulders of Open-Source Giants: The Serverless Realtime Lake...
Standing on the Shoulders of Open-Source Giants: The Serverless Realtime Lake...Standing on the Shoulders of Open-Source Giants: The Serverless Realtime Lake...
Standing on the Shoulders of Open-Source Giants: The Serverless Realtime Lake...HostedbyConfluent
 
Pipelines and Data Flows: Introduction to Data Integration in Azure Synapse A...
Pipelines and Data Flows: Introduction to Data Integration in Azure Synapse A...Pipelines and Data Flows: Introduction to Data Integration in Azure Synapse A...
Pipelines and Data Flows: Introduction to Data Integration in Azure Synapse A...Cathrine Wilhelmsen
 
Azure Synapse Analytics Overview (r2)
Azure Synapse Analytics Overview (r2)Azure Synapse Analytics Overview (r2)
Azure Synapse Analytics Overview (r2)James Serra
 
Data Lakehouse Symposium | Day 4
Data Lakehouse Symposium | Day 4Data Lakehouse Symposium | Day 4
Data Lakehouse Symposium | Day 4Databricks
 
Snowflake Company Presentation
Snowflake Company PresentationSnowflake Company Presentation
Snowflake Company PresentationAndrewJiang18
 
Data Quality Best Practices
Data Quality Best PracticesData Quality Best Practices
Data Quality Best PracticesDATAVERSITY
 
[DSC Europe 22] Lakehouse architecture with Delta Lake and Databricks - Draga...
[DSC Europe 22] Lakehouse architecture with Delta Lake and Databricks - Draga...[DSC Europe 22] Lakehouse architecture with Delta Lake and Databricks - Draga...
[DSC Europe 22] Lakehouse architecture with Delta Lake and Databricks - Draga...DataScienceConferenc1
 
Building Lakehouses on Delta Lake with SQL Analytics Primer
Building Lakehouses on Delta Lake with SQL Analytics PrimerBuilding Lakehouses on Delta Lake with SQL Analytics Primer
Building Lakehouses on Delta Lake with SQL Analytics PrimerDatabricks
 
Data Lake Overview
Data Lake OverviewData Lake Overview
Data Lake OverviewJames Serra
 
DAS Slides: Master Data Management – Aligning Data, Process, and Governance
DAS Slides: Master Data Management – Aligning Data, Process, and GovernanceDAS Slides: Master Data Management – Aligning Data, Process, and Governance
DAS Slides: Master Data Management – Aligning Data, Process, and GovernanceDATAVERSITY
 
Microsoft Data Platform - What's included
Microsoft Data Platform - What's includedMicrosoft Data Platform - What's included
Microsoft Data Platform - What's includedJames Serra
 
Data Architecture Best Practices for Advanced Analytics
Data Architecture Best Practices for Advanced AnalyticsData Architecture Best Practices for Advanced Analytics
Data Architecture Best Practices for Advanced AnalyticsDATAVERSITY
 

La actualidad más candente (20)

Time to Talk about Data Mesh
Time to Talk about Data MeshTime to Talk about Data Mesh
Time to Talk about Data Mesh
 
Building the Data Lake with Azure Data Factory and Data Lake Analytics
Building the Data Lake with Azure Data Factory and Data Lake AnalyticsBuilding the Data Lake with Azure Data Factory and Data Lake Analytics
Building the Data Lake with Azure Data Factory and Data Lake Analytics
 
ExpertsLive NL 2022 - Microsoft Purview - What's in it for my organization?
ExpertsLive NL 2022 - Microsoft Purview - What's in it for my organization?ExpertsLive NL 2022 - Microsoft Purview - What's in it for my organization?
ExpertsLive NL 2022 - Microsoft Purview - What's in it for my organization?
 
Modernizing to a Cloud Data Architecture
Modernizing to a Cloud Data ArchitectureModernizing to a Cloud Data Architecture
Modernizing to a Cloud Data Architecture
 
Data Mesh for Dinner
Data Mesh for DinnerData Mesh for Dinner
Data Mesh for Dinner
 
Standing on the Shoulders of Open-Source Giants: The Serverless Realtime Lake...
Standing on the Shoulders of Open-Source Giants: The Serverless Realtime Lake...Standing on the Shoulders of Open-Source Giants: The Serverless Realtime Lake...
Standing on the Shoulders of Open-Source Giants: The Serverless Realtime Lake...
 
Pipelines and Data Flows: Introduction to Data Integration in Azure Synapse A...
Pipelines and Data Flows: Introduction to Data Integration in Azure Synapse A...Pipelines and Data Flows: Introduction to Data Integration in Azure Synapse A...
Pipelines and Data Flows: Introduction to Data Integration in Azure Synapse A...
 
Azure Synapse Analytics Overview (r2)
Azure Synapse Analytics Overview (r2)Azure Synapse Analytics Overview (r2)
Azure Synapse Analytics Overview (r2)
 
Data Lakehouse Symposium | Day 4
Data Lakehouse Symposium | Day 4Data Lakehouse Symposium | Day 4
Data Lakehouse Symposium | Day 4
 
Snowflake Company Presentation
Snowflake Company PresentationSnowflake Company Presentation
Snowflake Company Presentation
 
Data Quality Best Practices
Data Quality Best PracticesData Quality Best Practices
Data Quality Best Practices
 
[DSC Europe 22] Lakehouse architecture with Delta Lake and Databricks - Draga...
[DSC Europe 22] Lakehouse architecture with Delta Lake and Databricks - Draga...[DSC Europe 22] Lakehouse architecture with Delta Lake and Databricks - Draga...
[DSC Europe 22] Lakehouse architecture with Delta Lake and Databricks - Draga...
 
Building Lakehouses on Delta Lake with SQL Analytics Primer
Building Lakehouses on Delta Lake with SQL Analytics PrimerBuilding Lakehouses on Delta Lake with SQL Analytics Primer
Building Lakehouses on Delta Lake with SQL Analytics Primer
 
Data Lake Overview
Data Lake OverviewData Lake Overview
Data Lake Overview
 
DAS Slides: Master Data Management – Aligning Data, Process, and Governance
DAS Slides: Master Data Management – Aligning Data, Process, and GovernanceDAS Slides: Master Data Management – Aligning Data, Process, and Governance
DAS Slides: Master Data Management – Aligning Data, Process, and Governance
 
Snowflake Datawarehouse Architecturing
Snowflake Datawarehouse ArchitecturingSnowflake Datawarehouse Architecturing
Snowflake Datawarehouse Architecturing
 
Elastic Data Warehousing
Elastic Data WarehousingElastic Data Warehousing
Elastic Data Warehousing
 
Microsoft Data Platform - What's included
Microsoft Data Platform - What's includedMicrosoft Data Platform - What's included
Microsoft Data Platform - What's included
 
Data Architecture Best Practices for Advanced Analytics
Data Architecture Best Practices for Advanced AnalyticsData Architecture Best Practices for Advanced Analytics
Data Architecture Best Practices for Advanced Analytics
 
Webinar Data Mesh - Part 3
Webinar Data Mesh - Part 3Webinar Data Mesh - Part 3
Webinar Data Mesh - Part 3
 

Destacado

SQL Server 2017 Deep Dive - @Ignite 2017
SQL Server 2017 Deep Dive - @Ignite 2017SQL Server 2017 Deep Dive - @Ignite 2017
SQL Server 2017 Deep Dive - @Ignite 2017Travis Wright
 
Microsoft Ignite 2017 - SQL Server on Kubernetes, Swarm, and Open Shift
Microsoft Ignite 2017 - SQL Server on Kubernetes, Swarm, and Open ShiftMicrosoft Ignite 2017 - SQL Server on Kubernetes, Swarm, and Open Shift
Microsoft Ignite 2017 - SQL Server on Kubernetes, Swarm, and Open ShiftTravis Wright
 
What’s new in SQL Server 2017
What’s new in SQL Server 2017What’s new in SQL Server 2017
What’s new in SQL Server 2017James Serra
 
Choosing technologies for a big data solution in the cloud
Choosing technologies for a big data solution in the cloudChoosing technologies for a big data solution in the cloud
Choosing technologies for a big data solution in the cloudJames Serra
 
SQL Server 2017 on Linux Introduction
SQL Server 2017 on Linux IntroductionSQL Server 2017 on Linux Introduction
SQL Server 2017 on Linux IntroductionTravis Wright
 
Machine learning services with SQL Server 2017
Machine learning services with SQL Server 2017Machine learning services with SQL Server 2017
Machine learning services with SQL Server 2017Mark Tabladillo
 
Introduction to PolyBase
Introduction to PolyBaseIntroduction to PolyBase
Introduction to PolyBaseJames Serra
 
Microsoft Technologies for Data Science 201612
Microsoft Technologies for Data Science 201612Microsoft Technologies for Data Science 201612
Microsoft Technologies for Data Science 201612Mark Tabladillo
 
Bootcamp 2017 - SQL Server on Linux
Bootcamp 2017 - SQL Server on LinuxBootcamp 2017 - SQL Server on Linux
Bootcamp 2017 - SQL Server on LinuxMaximiliano Accotto
 

Destacado (9)

SQL Server 2017 Deep Dive - @Ignite 2017
SQL Server 2017 Deep Dive - @Ignite 2017SQL Server 2017 Deep Dive - @Ignite 2017
SQL Server 2017 Deep Dive - @Ignite 2017
 
Microsoft Ignite 2017 - SQL Server on Kubernetes, Swarm, and Open Shift
Microsoft Ignite 2017 - SQL Server on Kubernetes, Swarm, and Open ShiftMicrosoft Ignite 2017 - SQL Server on Kubernetes, Swarm, and Open Shift
Microsoft Ignite 2017 - SQL Server on Kubernetes, Swarm, and Open Shift
 
What’s new in SQL Server 2017
What’s new in SQL Server 2017What’s new in SQL Server 2017
What’s new in SQL Server 2017
 
Choosing technologies for a big data solution in the cloud
Choosing technologies for a big data solution in the cloudChoosing technologies for a big data solution in the cloud
Choosing technologies for a big data solution in the cloud
 
SQL Server 2017 on Linux Introduction
SQL Server 2017 on Linux IntroductionSQL Server 2017 on Linux Introduction
SQL Server 2017 on Linux Introduction
 
Machine learning services with SQL Server 2017
Machine learning services with SQL Server 2017Machine learning services with SQL Server 2017
Machine learning services with SQL Server 2017
 
Introduction to PolyBase
Introduction to PolyBaseIntroduction to PolyBase
Introduction to PolyBase
 
Microsoft Technologies for Data Science 201612
Microsoft Technologies for Data Science 201612Microsoft Technologies for Data Science 201612
Microsoft Technologies for Data Science 201612
 
Bootcamp 2017 - SQL Server on Linux
Bootcamp 2017 - SQL Server on LinuxBootcamp 2017 - SQL Server on Linux
Bootcamp 2017 - SQL Server on Linux
 

Similar a Microsoft cloud big data strategy

How does Microsoft solve Big Data?
How does Microsoft solve Big Data?How does Microsoft solve Big Data?
How does Microsoft solve Big Data?James Serra
 
Sudhir Rawat, Sr Techonology Evangelist at Microsoft SQL Business Intelligenc...
Sudhir Rawat, Sr Techonology Evangelist at Microsoft SQL Business Intelligenc...Sudhir Rawat, Sr Techonology Evangelist at Microsoft SQL Business Intelligenc...
Sudhir Rawat, Sr Techonology Evangelist at Microsoft SQL Business Intelligenc...Dataconomy Media
 
Opportunity: Data, Analytic & Azure
Opportunity: Data, Analytic & Azure Opportunity: Data, Analytic & Azure
Opportunity: Data, Analytic & Azure Abhimanyu Singhal
 
Modern Business Intelligence and Advanced Analytics
Modern Business Intelligence and Advanced AnalyticsModern Business Intelligence and Advanced Analytics
Modern Business Intelligence and Advanced AnalyticsCollective Intelligence Inc.
 
Azure databricks c sharp corner toronto feb 2019 heather grandy
Azure databricks c sharp corner toronto feb 2019 heather grandyAzure databricks c sharp corner toronto feb 2019 heather grandy
Azure databricks c sharp corner toronto feb 2019 heather grandyNilesh Shah
 
Differentiate Big Data vs Data Warehouse use cases for a cloud solution
Differentiate Big Data vs Data Warehouse use cases for a cloud solutionDifferentiate Big Data vs Data Warehouse use cases for a cloud solution
Differentiate Big Data vs Data Warehouse use cases for a cloud solutionJames Serra
 
Azure Synapse 101 Webinar Presentation
Azure Synapse 101 Webinar PresentationAzure Synapse 101 Webinar Presentation
Azure Synapse 101 Webinar PresentationMatthew W. Bowers
 
Bringing the Power of Big Data Computation to Salesforce
Bringing the Power of Big Data Computation to SalesforceBringing the Power of Big Data Computation to Salesforce
Bringing the Power of Big Data Computation to SalesforceSalesforce Developers
 
Benefits of the Azure Cloud
Benefits of the Azure CloudBenefits of the Azure Cloud
Benefits of the Azure CloudCaserta
 
SPS Vancouver 2018 - What is CDM and CDS
SPS Vancouver 2018 - What is CDM and CDSSPS Vancouver 2018 - What is CDM and CDS
SPS Vancouver 2018 - What is CDM and CDSNicolas Georgeault
 
Derfor skal du bruge en DataLake
Derfor skal du bruge en DataLakeDerfor skal du bruge en DataLake
Derfor skal du bruge en DataLakeMicrosoft
 
SendGrid Improves Email Delivery with Hybrid Data Warehousing
SendGrid Improves Email Delivery with Hybrid Data WarehousingSendGrid Improves Email Delivery with Hybrid Data Warehousing
SendGrid Improves Email Delivery with Hybrid Data WarehousingAmazon Web Services
 
Benefits of the Azure cloud
Benefits of the Azure cloudBenefits of the Azure cloud
Benefits of the Azure cloudJames Serra
 
Capture the Cloud with Azure
Capture the Cloud with AzureCapture the Cloud with Azure
Capture the Cloud with AzureShahed Chowdhuri
 
Meetup Toulouse Microsoft Azure : Bâtir une solution IoT
Meetup Toulouse Microsoft Azure : Bâtir une solution IoTMeetup Toulouse Microsoft Azure : Bâtir une solution IoT
Meetup Toulouse Microsoft Azure : Bâtir une solution IoTAlex Danvy
 
Cloud Scale Analytics Pitch Deck
Cloud Scale Analytics Pitch DeckCloud Scale Analytics Pitch Deck
Cloud Scale Analytics Pitch DeckNicholas Vossburg
 
OpenSistemas Corporate Presentation
OpenSistemas Corporate PresentationOpenSistemas Corporate Presentation
OpenSistemas Corporate PresentationOpenSistemas
 
zData BI & Advanced Analytics Platform + 8 Week Pilot Programs
zData BI & Advanced Analytics Platform + 8 Week Pilot ProgramszData BI & Advanced Analytics Platform + 8 Week Pilot Programs
zData BI & Advanced Analytics Platform + 8 Week Pilot ProgramszData Inc.
 
Big Data on Azure Tutorial
Big Data on Azure TutorialBig Data on Azure Tutorial
Big Data on Azure Tutorialrustd
 

Similar a Microsoft cloud big data strategy (20)

How does Microsoft solve Big Data?
How does Microsoft solve Big Data?How does Microsoft solve Big Data?
How does Microsoft solve Big Data?
 
Sudhir Rawat, Sr Techonology Evangelist at Microsoft SQL Business Intelligenc...
Sudhir Rawat, Sr Techonology Evangelist at Microsoft SQL Business Intelligenc...Sudhir Rawat, Sr Techonology Evangelist at Microsoft SQL Business Intelligenc...
Sudhir Rawat, Sr Techonology Evangelist at Microsoft SQL Business Intelligenc...
 
Opportunity: Data, Analytic & Azure
Opportunity: Data, Analytic & Azure Opportunity: Data, Analytic & Azure
Opportunity: Data, Analytic & Azure
 
Big Data in Azure
Big Data in AzureBig Data in Azure
Big Data in Azure
 
Modern Business Intelligence and Advanced Analytics
Modern Business Intelligence and Advanced AnalyticsModern Business Intelligence and Advanced Analytics
Modern Business Intelligence and Advanced Analytics
 
Azure databricks c sharp corner toronto feb 2019 heather grandy
Azure databricks c sharp corner toronto feb 2019 heather grandyAzure databricks c sharp corner toronto feb 2019 heather grandy
Azure databricks c sharp corner toronto feb 2019 heather grandy
 
Differentiate Big Data vs Data Warehouse use cases for a cloud solution
Differentiate Big Data vs Data Warehouse use cases for a cloud solutionDifferentiate Big Data vs Data Warehouse use cases for a cloud solution
Differentiate Big Data vs Data Warehouse use cases for a cloud solution
 
Azure Synapse 101 Webinar Presentation
Azure Synapse 101 Webinar PresentationAzure Synapse 101 Webinar Presentation
Azure Synapse 101 Webinar Presentation
 
Bringing the Power of Big Data Computation to Salesforce
Bringing the Power of Big Data Computation to SalesforceBringing the Power of Big Data Computation to Salesforce
Bringing the Power of Big Data Computation to Salesforce
 
Benefits of the Azure Cloud
Benefits of the Azure CloudBenefits of the Azure Cloud
Benefits of the Azure Cloud
 
SPS Vancouver 2018 - What is CDM and CDS
SPS Vancouver 2018 - What is CDM and CDSSPS Vancouver 2018 - What is CDM and CDS
SPS Vancouver 2018 - What is CDM and CDS
 
Derfor skal du bruge en DataLake
Derfor skal du bruge en DataLakeDerfor skal du bruge en DataLake
Derfor skal du bruge en DataLake
 
SendGrid Improves Email Delivery with Hybrid Data Warehousing
SendGrid Improves Email Delivery with Hybrid Data WarehousingSendGrid Improves Email Delivery with Hybrid Data Warehousing
SendGrid Improves Email Delivery with Hybrid Data Warehousing
 
Benefits of the Azure cloud
Benefits of the Azure cloudBenefits of the Azure cloud
Benefits of the Azure cloud
 
Capture the Cloud with Azure
Capture the Cloud with AzureCapture the Cloud with Azure
Capture the Cloud with Azure
 
Meetup Toulouse Microsoft Azure : Bâtir une solution IoT
Meetup Toulouse Microsoft Azure : Bâtir une solution IoTMeetup Toulouse Microsoft Azure : Bâtir une solution IoT
Meetup Toulouse Microsoft Azure : Bâtir une solution IoT
 
Cloud Scale Analytics Pitch Deck
Cloud Scale Analytics Pitch DeckCloud Scale Analytics Pitch Deck
Cloud Scale Analytics Pitch Deck
 
OpenSistemas Corporate Presentation
OpenSistemas Corporate PresentationOpenSistemas Corporate Presentation
OpenSistemas Corporate Presentation
 
zData BI & Advanced Analytics Platform + 8 Week Pilot Programs
zData BI & Advanced Analytics Platform + 8 Week Pilot ProgramszData BI & Advanced Analytics Platform + 8 Week Pilot Programs
zData BI & Advanced Analytics Platform + 8 Week Pilot Programs
 
Big Data on Azure Tutorial
Big Data on Azure TutorialBig Data on Azure Tutorial
Big Data on Azure Tutorial
 

Más de James Serra

Data Lakehouse, Data Mesh, and Data Fabric (r2)
Data Lakehouse, Data Mesh, and Data Fabric (r2)Data Lakehouse, Data Mesh, and Data Fabric (r2)
Data Lakehouse, Data Mesh, and Data Fabric (r2)James Serra
 
Data Warehousing Trends, Best Practices, and Future Outlook
Data Warehousing Trends, Best Practices, and Future OutlookData Warehousing Trends, Best Practices, and Future Outlook
Data Warehousing Trends, Best Practices, and Future OutlookJames Serra
 
Azure Synapse Analytics Overview (r1)
Azure Synapse Analytics Overview (r1)Azure Synapse Analytics Overview (r1)
Azure Synapse Analytics Overview (r1)James Serra
 
Power BI Overview, Deployment and Governance
Power BI Overview, Deployment and GovernancePower BI Overview, Deployment and Governance
Power BI Overview, Deployment and GovernanceJames Serra
 
Power BI Overview
Power BI OverviewPower BI Overview
Power BI OverviewJames Serra
 
Machine Learning and AI
Machine Learning and AIMachine Learning and AI
Machine Learning and AIJames Serra
 
Azure data platform overview
Azure data platform overviewAzure data platform overview
Azure data platform overviewJames Serra
 
AI for an intelligent cloud and intelligent edge: Discover, deploy, and manag...
AI for an intelligent cloud and intelligent edge: Discover, deploy, and manag...AI for an intelligent cloud and intelligent edge: Discover, deploy, and manag...
AI for an intelligent cloud and intelligent edge: Discover, deploy, and manag...James Serra
 
Power BI for Big Data and the New Look of Big Data Solutions
Power BI for Big Data and the New Look of Big Data SolutionsPower BI for Big Data and the New Look of Big Data Solutions
Power BI for Big Data and the New Look of Big Data SolutionsJames Serra
 
How to build your career
How to build your careerHow to build your career
How to build your careerJames Serra
 
Is the traditional data warehouse dead?
Is the traditional data warehouse dead?Is the traditional data warehouse dead?
Is the traditional data warehouse dead?James Serra
 
Introduction to Azure Databricks
Introduction to Azure DatabricksIntroduction to Azure Databricks
Introduction to Azure DatabricksJames Serra
 
Azure SQL Database Managed Instance
Azure SQL Database Managed InstanceAzure SQL Database Managed Instance
Azure SQL Database Managed InstanceJames Serra
 
Learning to present and becoming good at it
Learning to present and becoming good at itLearning to present and becoming good at it
Learning to present and becoming good at itJames Serra
 
What's new in SQL Server 2016
What's new in SQL Server 2016What's new in SQL Server 2016
What's new in SQL Server 2016James Serra
 
Introducing DocumentDB
Introducing DocumentDB Introducing DocumentDB
Introducing DocumentDB James Serra
 
Overview on Azure Machine Learning
Overview on Azure Machine LearningOverview on Azure Machine Learning
Overview on Azure Machine LearningJames Serra
 
Big data architectures and the data lake
Big data architectures and the data lakeBig data architectures and the data lake
Big data architectures and the data lakeJames Serra
 
Introduction to Microsoft’s Hadoop solution (HDInsight)
Introduction to Microsoft’s Hadoop solution (HDInsight)Introduction to Microsoft’s Hadoop solution (HDInsight)
Introduction to Microsoft’s Hadoop solution (HDInsight)James Serra
 
HA/DR options with SQL Server in Azure and hybrid
HA/DR options with SQL Server in Azure and hybridHA/DR options with SQL Server in Azure and hybrid
HA/DR options with SQL Server in Azure and hybridJames Serra
 

Más de James Serra (20)

Data Lakehouse, Data Mesh, and Data Fabric (r2)
Data Lakehouse, Data Mesh, and Data Fabric (r2)Data Lakehouse, Data Mesh, and Data Fabric (r2)
Data Lakehouse, Data Mesh, and Data Fabric (r2)
 
Data Warehousing Trends, Best Practices, and Future Outlook
Data Warehousing Trends, Best Practices, and Future OutlookData Warehousing Trends, Best Practices, and Future Outlook
Data Warehousing Trends, Best Practices, and Future Outlook
 
Azure Synapse Analytics Overview (r1)
Azure Synapse Analytics Overview (r1)Azure Synapse Analytics Overview (r1)
Azure Synapse Analytics Overview (r1)
 
Power BI Overview, Deployment and Governance
Power BI Overview, Deployment and GovernancePower BI Overview, Deployment and Governance
Power BI Overview, Deployment and Governance
 
Power BI Overview
Power BI OverviewPower BI Overview
Power BI Overview
 
Machine Learning and AI
Machine Learning and AIMachine Learning and AI
Machine Learning and AI
 
Azure data platform overview
Azure data platform overviewAzure data platform overview
Azure data platform overview
 
AI for an intelligent cloud and intelligent edge: Discover, deploy, and manag...
AI for an intelligent cloud and intelligent edge: Discover, deploy, and manag...AI for an intelligent cloud and intelligent edge: Discover, deploy, and manag...
AI for an intelligent cloud and intelligent edge: Discover, deploy, and manag...
 
Power BI for Big Data and the New Look of Big Data Solutions
Power BI for Big Data and the New Look of Big Data SolutionsPower BI for Big Data and the New Look of Big Data Solutions
Power BI for Big Data and the New Look of Big Data Solutions
 
How to build your career
How to build your careerHow to build your career
How to build your career
 
Is the traditional data warehouse dead?
Is the traditional data warehouse dead?Is the traditional data warehouse dead?
Is the traditional data warehouse dead?
 
Introduction to Azure Databricks
Introduction to Azure DatabricksIntroduction to Azure Databricks
Introduction to Azure Databricks
 
Azure SQL Database Managed Instance
Azure SQL Database Managed InstanceAzure SQL Database Managed Instance
Azure SQL Database Managed Instance
 
Learning to present and becoming good at it
Learning to present and becoming good at itLearning to present and becoming good at it
Learning to present and becoming good at it
 
What's new in SQL Server 2016
What's new in SQL Server 2016What's new in SQL Server 2016
What's new in SQL Server 2016
 
Introducing DocumentDB
Introducing DocumentDB Introducing DocumentDB
Introducing DocumentDB
 
Overview on Azure Machine Learning
Overview on Azure Machine LearningOverview on Azure Machine Learning
Overview on Azure Machine Learning
 
Big data architectures and the data lake
Big data architectures and the data lakeBig data architectures and the data lake
Big data architectures and the data lake
 
Introduction to Microsoft’s Hadoop solution (HDInsight)
Introduction to Microsoft’s Hadoop solution (HDInsight)Introduction to Microsoft’s Hadoop solution (HDInsight)
Introduction to Microsoft’s Hadoop solution (HDInsight)
 
HA/DR options with SQL Server in Azure and hybrid
HA/DR options with SQL Server in Azure and hybridHA/DR options with SQL Server in Azure and hybrid
HA/DR options with SQL Server in Azure and hybrid
 

Último

KubeConEU24-Monitoring Kubernetes and Cloud Spend with OpenCost
KubeConEU24-Monitoring Kubernetes and Cloud Spend with OpenCostKubeConEU24-Monitoring Kubernetes and Cloud Spend with OpenCost
KubeConEU24-Monitoring Kubernetes and Cloud Spend with OpenCostMatt Ray
 
UiPath Platform: The Backend Engine Powering Your Automation - Session 1
UiPath Platform: The Backend Engine Powering Your Automation - Session 1UiPath Platform: The Backend Engine Powering Your Automation - Session 1
UiPath Platform: The Backend Engine Powering Your Automation - Session 1DianaGray10
 
Connector Corner: Extending LLM automation use cases with UiPath GenAI connec...
Connector Corner: Extending LLM automation use cases with UiPath GenAI connec...Connector Corner: Extending LLM automation use cases with UiPath GenAI connec...
Connector Corner: Extending LLM automation use cases with UiPath GenAI connec...DianaGray10
 
NIST Cybersecurity Framework (CSF) 2.0 Workshop
NIST Cybersecurity Framework (CSF) 2.0 WorkshopNIST Cybersecurity Framework (CSF) 2.0 Workshop
NIST Cybersecurity Framework (CSF) 2.0 WorkshopBachir Benyammi
 
Empowering Africa's Next Generation: The AI Leadership Blueprint
Empowering Africa's Next Generation: The AI Leadership BlueprintEmpowering Africa's Next Generation: The AI Leadership Blueprint
Empowering Africa's Next Generation: The AI Leadership BlueprintMahmoud Rabie
 
The Data Metaverse: Unpacking the Roles, Use Cases, and Tech Trends in Data a...
The Data Metaverse: Unpacking the Roles, Use Cases, and Tech Trends in Data a...The Data Metaverse: Unpacking the Roles, Use Cases, and Tech Trends in Data a...
The Data Metaverse: Unpacking the Roles, Use Cases, and Tech Trends in Data a...Aggregage
 
Building AI-Driven Apps Using Semantic Kernel.pptx
Building AI-Driven Apps Using Semantic Kernel.pptxBuilding AI-Driven Apps Using Semantic Kernel.pptx
Building AI-Driven Apps Using Semantic Kernel.pptxUdaiappa Ramachandran
 
Linked Data in Production: Moving Beyond Ontologies
Linked Data in Production: Moving Beyond OntologiesLinked Data in Production: Moving Beyond Ontologies
Linked Data in Production: Moving Beyond OntologiesDavid Newbury
 
Crea il tuo assistente AI con lo Stregatto (open source python framework)
Crea il tuo assistente AI con lo Stregatto (open source python framework)Crea il tuo assistente AI con lo Stregatto (open source python framework)
Crea il tuo assistente AI con lo Stregatto (open source python framework)Commit University
 
20230202 - Introduction to tis-py
20230202 - Introduction to tis-py20230202 - Introduction to tis-py
20230202 - Introduction to tis-pyJamie (Taka) Wang
 
Nanopower In Semiconductor Industry.pdf
Nanopower  In Semiconductor Industry.pdfNanopower  In Semiconductor Industry.pdf
Nanopower In Semiconductor Industry.pdfPedro Manuel
 
Meet the new FSP 3000 M-Flex800™
Meet the new FSP 3000 M-Flex800™Meet the new FSP 3000 M-Flex800™
Meet the new FSP 3000 M-Flex800™Adtran
 
How Accurate are Carbon Emissions Projections?
How Accurate are Carbon Emissions Projections?How Accurate are Carbon Emissions Projections?
How Accurate are Carbon Emissions Projections?IES VE
 
OpenShift Commons Paris - Choose Your Own Observability Adventure
OpenShift Commons Paris - Choose Your Own Observability AdventureOpenShift Commons Paris - Choose Your Own Observability Adventure
OpenShift Commons Paris - Choose Your Own Observability AdventureEric D. Schabell
 
Basic Building Blocks of Internet of Things.
Basic Building Blocks of Internet of Things.Basic Building Blocks of Internet of Things.
Basic Building Blocks of Internet of Things.YounusS2
 
Apres-Cyber - The Data Dilemma: Bridging Offensive Operations and Machine Lea...
Apres-Cyber - The Data Dilemma: Bridging Offensive Operations and Machine Lea...Apres-Cyber - The Data Dilemma: Bridging Offensive Operations and Machine Lea...
Apres-Cyber - The Data Dilemma: Bridging Offensive Operations and Machine Lea...Will Schroeder
 
Bird eye's view on Camunda open source ecosystem
Bird eye's view on Camunda open source ecosystemBird eye's view on Camunda open source ecosystem
Bird eye's view on Camunda open source ecosystemAsko Soukka
 
Salesforce Miami User Group Event - 1st Quarter 2024
Salesforce Miami User Group Event - 1st Quarter 2024Salesforce Miami User Group Event - 1st Quarter 2024
Salesforce Miami User Group Event - 1st Quarter 2024SkyPlanner
 
activity_diagram_combine_v4_20190827.pdfactivity_diagram_combine_v4_20190827.pdf
activity_diagram_combine_v4_20190827.pdfactivity_diagram_combine_v4_20190827.pdfactivity_diagram_combine_v4_20190827.pdfactivity_diagram_combine_v4_20190827.pdf
activity_diagram_combine_v4_20190827.pdfactivity_diagram_combine_v4_20190827.pdfJamie (Taka) Wang
 

Último (20)

KubeConEU24-Monitoring Kubernetes and Cloud Spend with OpenCost
KubeConEU24-Monitoring Kubernetes and Cloud Spend with OpenCostKubeConEU24-Monitoring Kubernetes and Cloud Spend with OpenCost
KubeConEU24-Monitoring Kubernetes and Cloud Spend with OpenCost
 
UiPath Platform: The Backend Engine Powering Your Automation - Session 1
UiPath Platform: The Backend Engine Powering Your Automation - Session 1UiPath Platform: The Backend Engine Powering Your Automation - Session 1
UiPath Platform: The Backend Engine Powering Your Automation - Session 1
 
Connector Corner: Extending LLM automation use cases with UiPath GenAI connec...
Connector Corner: Extending LLM automation use cases with UiPath GenAI connec...Connector Corner: Extending LLM automation use cases with UiPath GenAI connec...
Connector Corner: Extending LLM automation use cases with UiPath GenAI connec...
 
NIST Cybersecurity Framework (CSF) 2.0 Workshop
NIST Cybersecurity Framework (CSF) 2.0 WorkshopNIST Cybersecurity Framework (CSF) 2.0 Workshop
NIST Cybersecurity Framework (CSF) 2.0 Workshop
 
20230104 - machine vision
20230104 - machine vision20230104 - machine vision
20230104 - machine vision
 
Empowering Africa's Next Generation: The AI Leadership Blueprint
Empowering Africa's Next Generation: The AI Leadership BlueprintEmpowering Africa's Next Generation: The AI Leadership Blueprint
Empowering Africa's Next Generation: The AI Leadership Blueprint
 
The Data Metaverse: Unpacking the Roles, Use Cases, and Tech Trends in Data a...
The Data Metaverse: Unpacking the Roles, Use Cases, and Tech Trends in Data a...The Data Metaverse: Unpacking the Roles, Use Cases, and Tech Trends in Data a...
The Data Metaverse: Unpacking the Roles, Use Cases, and Tech Trends in Data a...
 
Building AI-Driven Apps Using Semantic Kernel.pptx
Building AI-Driven Apps Using Semantic Kernel.pptxBuilding AI-Driven Apps Using Semantic Kernel.pptx
Building AI-Driven Apps Using Semantic Kernel.pptx
 
Linked Data in Production: Moving Beyond Ontologies
Linked Data in Production: Moving Beyond OntologiesLinked Data in Production: Moving Beyond Ontologies
Linked Data in Production: Moving Beyond Ontologies
 
Crea il tuo assistente AI con lo Stregatto (open source python framework)
Crea il tuo assistente AI con lo Stregatto (open source python framework)Crea il tuo assistente AI con lo Stregatto (open source python framework)
Crea il tuo assistente AI con lo Stregatto (open source python framework)
 
20230202 - Introduction to tis-py
20230202 - Introduction to tis-py20230202 - Introduction to tis-py
20230202 - Introduction to tis-py
 
Nanopower In Semiconductor Industry.pdf
Nanopower  In Semiconductor Industry.pdfNanopower  In Semiconductor Industry.pdf
Nanopower In Semiconductor Industry.pdf
 
Meet the new FSP 3000 M-Flex800™
Meet the new FSP 3000 M-Flex800™Meet the new FSP 3000 M-Flex800™
Meet the new FSP 3000 M-Flex800™
 
How Accurate are Carbon Emissions Projections?
How Accurate are Carbon Emissions Projections?How Accurate are Carbon Emissions Projections?
How Accurate are Carbon Emissions Projections?
 
OpenShift Commons Paris - Choose Your Own Observability Adventure
OpenShift Commons Paris - Choose Your Own Observability AdventureOpenShift Commons Paris - Choose Your Own Observability Adventure
OpenShift Commons Paris - Choose Your Own Observability Adventure
 
Basic Building Blocks of Internet of Things.
Basic Building Blocks of Internet of Things.Basic Building Blocks of Internet of Things.
Basic Building Blocks of Internet of Things.
 
Apres-Cyber - The Data Dilemma: Bridging Offensive Operations and Machine Lea...
Apres-Cyber - The Data Dilemma: Bridging Offensive Operations and Machine Lea...Apres-Cyber - The Data Dilemma: Bridging Offensive Operations and Machine Lea...
Apres-Cyber - The Data Dilemma: Bridging Offensive Operations and Machine Lea...
 
Bird eye's view on Camunda open source ecosystem
Bird eye's view on Camunda open source ecosystemBird eye's view on Camunda open source ecosystem
Bird eye's view on Camunda open source ecosystem
 
Salesforce Miami User Group Event - 1st Quarter 2024
Salesforce Miami User Group Event - 1st Quarter 2024Salesforce Miami User Group Event - 1st Quarter 2024
Salesforce Miami User Group Event - 1st Quarter 2024
 
activity_diagram_combine_v4_20190827.pdfactivity_diagram_combine_v4_20190827.pdf
activity_diagram_combine_v4_20190827.pdfactivity_diagram_combine_v4_20190827.pdfactivity_diagram_combine_v4_20190827.pdfactivity_diagram_combine_v4_20190827.pdf
activity_diagram_combine_v4_20190827.pdfactivity_diagram_combine_v4_20190827.pdf
 

Microsoft cloud big data strategy

  • 2. About Me  Microsoft, Big Data Evangelist  In IT for 30 years, worked on many BI and DW projects  Worked as desktop/web/database developer, DBA, BI and DW architect and developer, MDM architect, PDW/APS developer  Been perm employee, contractor, consultant, business owner  Presenter at PASS Business Analytics Conference, PASS Summit, Enterprise Data World conference  Certifications: MCSE: Data Platform, Business Intelligence; MS: Architecting Microsoft Azure Solutions, Design and Implement Big Data Analytics Solutions, Design and Implement Cloud Data Platform Solutions  Blog at JamesSerra.com  Former SQL Server MVP  Author of book “Reporting with Microsoft SQL Server 2012”
  • 3. Agenda  Big data defined  Microsoft big data solution  Azure data lake
  • 5. Big Data is changing traditional data warehousing … data warehousing has reached the most significant tipping point since its inception. The biggest, possibly most elaborate data management system in IT is changing. – Gartner, “The State of Data Warehousing”* * Donald Feinberg, Mark Beyer, Merv Adrian, Roxane Edjlali (Gartner), The State of Data Warehousing in 2012 (Stamford, CT.: Gartner, 2012) Data sources ETL Data warehouse BI and analytics
  • 6. Big Data has new data characteristics Data complexity: variety and velocity Petabytes
  • 7. Big Data is driving transformative changes Traditional Big Data Relational data with highly modeled schema All data with schema agility Specialized HW Commodity HW Data characteristics Costs Culture Operational reporting Focus on rear-view analysis Experimentation leading to intelligent action With machine learning, graph, a/b testing
  • 8. Big Data introduces new culture of experimentation Understand customer patterns to uncover cross-sell opportunities Historical campaign effectiveness Generate year-end financial reports Financial monitoring with real-time recommendations to increase revenue Generate year-end financial reports Real-time product offers and promotions based on behavior Collect historical data on equipment performance Real-time monitoring to identify proactive maintenance Shipping features without understanding success Building successful features correlating user action with product experience
  • 9. Action Value From data to decisions and actions
  • 10. However, there are challenges to Big Data… Obtaining skills and capabilities Determining how to get value Integrating with existing IT investments *Gartner: Survey Analysis – Hadoop Adoption Drivers and Challenges (Stamford, CT.: Gartner, 2015)
  • 11. But, Microsoft has done it before We needed to better leverage data and analytics to do more experimentation So we: • Designed a data lake for everyone to put their data into • Built tools approachable by any developer • Created machine learning tools for collaborating across large experiment models Result: • Across Microsoft, ten thousand developers doing experimentation leading to better insights • Leading to growth in our Microsoft businesses: • Office productivity revenue (45%YoY)* • Intelligent Cloud (100% YoY)* • Bing search share doubles 2010 2011 2012 2013 2014 2015 Growth of data @ Microsoft Windows SMSG Live Bing CRM/Dynamics Xbox Live Office365 Malware Protection Microsoft Stores Commerce Risk Skype LCA Exchange Yammer PetabytesExabytes * Microsoft. FY16 Q4 Results, URL: http://www.microsoft.com/en-us/Investor/earnings/FY-2016-Q4/press-release-webcast
  • 12. Microsoft is now taking everything we’ve learned on this journey and bringing it to our customers Technology. Cost. Culture.
  • 14. Big Data as a cornerstone of Cortana Intelligence Action People Automated Systems Apps Web Mobile Bots Intelligence Dashboards & Visualizations Cortana Bot Framework Cognitive Services Power BI Information Management Event Hubs Data Catalog Data Factory Machine Learning and Analytics HDInsight (Hadoop and Spark) Stream Analytics Intelligence Data Lake Analytics Machine Learning Big Data Stores Data Lake Store Data Sources Apps Sensors and devices Data SQL Data Warehouse
  • 15. CONTROL EASE OF USE Azure Data Lake Analytics Azure Data Lake Store Azure Storage Any Hadoop technology Workload optimized, managed clusters Specific apps in a multi- tenant form factor Azure Marketplace HDP | CDH | MapR Azure Data Lake Analytics IaaS Hadoop Managed Hadoop Big Data as-a-service Azure HDInsight BIGDATA STORAGE BIGDATA ANALYTICS Bringing Big Data to everybody Accelerate the pace of innovation through a state-of-the-art cloud platform UserAdoption
  • 16. Microsoft Big Data Portfolio SQL Server Stretch Business intelligence Machine learning analytics Insights Azure SQL Database SQL Server 2016 SQL Server 2016 Fast Track Azure SQL DW Azure Data Lake DocumentDB HDInsight Hadoop Analytics Platform System Sequential Scale Out + AcrossScale Up Key Relational Non-relational On-premisesCloud Microsoft has solutions covering and connecting all four quadrants – that’s why SQL Server is one of the most utilized databases in the world 16
  • 17. Azure HDInsight A Cloud Spark and Hadoop service for the Enterprise Reliable with an industry leading SLA Enterprise-grade security and monitoring Productive platform for developers and scientists Cost effective cloud scale Integration with leading ISV applications Easy for administrators to manage 63% lower TCO than deploy your own Hadoop on-premises* *IDC study “The Business Value and TCO Advantage of Apache Hadoop in the Cloud with Microsoft Azure HDInsight”
  • 18. Hortonworks Data Platform (HDP) 2.5 Simply put, Hortonworks ties all the open source products together (22) (under the covers of HDInsight)
  • 19. Azure Data Lake Store A No limits Data Lake that powers Big Data Analytics Petabyte size files and Trillions of objects Scalable throughput for massively parallel analytics HDFS for the cloud Always encrypted, role-based security & auditing Enterprise-grade support
  • 20. Azure Data Lake Analytics A No limits Analytics Job Service to power intelligent action Start in seconds, scale instantly, pay per job Develop massively parallel programs with simplicity Debug and optimize your big data programs with ease Virtualize your analytics Enterprise-grade security, auditing and support
  • 21. Azure Data Lake YARN U-SQL Analytics HDInsight Hive R Server HDFS Store Store and analyze data of any kind and size Develop faster, debug and optimize smarter Interactively explore patterns in your data No learning curve Managed and supported Dynamically scales to match your business priorities Enterprise-grade security Built on YARN, designed for the cloud
  • 22. Azure SQL Data Warehouse A relational data warehouse-as-a-service, fully managed by Microsoft. Industries first elastic cloud data warehouse with enterprise-grade capabilities. Integrated with on-premises and cloud assets. Simple compute & storage billing Pay for what you need High performance without rewriting applications Low cost for latent data Infrastructure, management and support provided Scales to petabytes of data with MPP processing Resize compute nodes < 1 minute Faster time to insight than other SMP offering Designed for “on-demand” workload Integrated with Azure platform and other Microsoft services Enables hybrid solutions Built on SQL Server experience & technology
  • 23. PolyBase Query relational and non-relational data with T-SQL By preview early this year PolyBase will support Teradata, Oracle, SQL Server, MongoDB, Hadoop and Azure blob storage
  • 24. Publish-subscribe data distribution Managed PaaS (Platform as a Service) solution Scales with your needs to millions of events per second Provides a durable buffer between event publishers and event consumers Azure Event Hubs
  • 25. Azure Stream Analytics Process real-time data in Azure Consumes millions of real-time events from Event Hub collected from devices, sensors, infrastructure, and applications Performs time-sensitive analysis using SQL-like language against multiple real-time streams and reference data Outputs to persistent stores, dashboards or back to devices Point of Service Devices Self Checkout Stations Kiosks Smart Phones Slates/ Tablets PCs/ Laptops Servers Digital Signs Diagnostic EquipmentRemote Medical Monitors Logic Controllers Specialized DevicesThin Clients Handhelds Security POS Terminals Automation Devices Vending Machines Kinect ATM
  • 26. Azure Machine Learning Get started with just a browser Requires no provisioning; simply log on to your Azure subscription or try it for free off azure.com/ml Experience the power of choice Choose from hundreds of algorithms and packages from R and Python or drop in your own custom code Take advantage of business-tested algorithms from Xbox and Bing Deploy solutions in minutes With the click of a button, deploy the finished model as a web service that can connect to any data, anywhere Connect to the world Brand and monetize solutions on our global Machine Learning Marketplace https://datamarket.azure.com/ Beyond business intelligence – machine intelligence Microsoft Azure Machine Learning Studio Modeling environment (shown) Microsoft Azure Machine Learning API service Model in production as a web service Microsoft Azure Machine Learning Marketplace APIs and solutions for broad use
  • 27. Enable enterprise-wide self-service data source registration and discovery A metadata repository that allow users to register, enrich, understand, discover, and consume data sources Delivers differentiated value though ‒ Data source discovery; rather than data discovery ‒ Support for data from any source; Structured and unstructured, on premises and in the cloud ‒ Publishing, discovery and consumption through any tool ‒ Annotation crowdsourcing: empowering any user to capture and share their knowledge. This, while allowing IT to maintain control and oversight
  • 28. Azure Data Factory Connect to relational or non- relational data that is on- premises or in the cloud Orchestrate data movement & data processing Publish to Power BI users as a searchable data view Operationalize (schedule, manage, debug) workflows Lifecycle management, monitoring Orchestrate trusted information production in Azure Microsoft Confidential – Under Strict NDA C# MapReduce Hive Pig Stored Procedures Azure Machine Learning
  • 29. Discovery & exploration – Custom visualizations— R integration -
  • 34. Microsoft Cognitive Services Give your apps a human side Cognitive Services API Collection
  • 35. Azure Analysis Services Azure Analysis Services is based on the proven analytics engine that has helped organizations turn complex data into a trusted, single source of truth for years. Built for hybrid data Access and model data on-premises, in the cloud, or both Interactive visualization Quick, highly interactive self-service data discovery with support of major data visualization tools Proven technology Powerful, proven tabular models built from SQL Server 2016 Analysis Services Cloud powered Easy to deploy, scale, and manage as a platform-as- a-service solution
  • 36. SQL Server R Services Linux Hadoop Teradata Windows CommercialCommunity R ServerR Open
  • 37. Fully managed database service built on a native JSON data model Application controlled schema with massive scale-out enables iterative development and evolving data models Automatic indexing enables robust querying over schema-free data Integrated transactional JavaScript processing + tunable consistency enable high performance application experiences Azure DocumentDB
  • 38. SQL Server on Linux (Preview today, GA in mid-2017) Red Hat - Microsoft Partnership (Nov 2015) Microsoft joins Eclipse Foundation (Mar 2016). HD Insight PaaS on Linux GA (Sep 2015) C:Usersmarkhill> root@localhost: # bash Azure Marketplace 60% of all images in Azure Marketplace are based on Linux/OSS In partnership with the Linux Foundation, Microsoft releases the Microsoft Certified Solutions Associate (MCSA) Linux on Azure certification. 493,141,677 ?????? Microsoft Open Source Hub Ross Gardler: President Apache Software Foundation Wim Coekaerts: Oracle’s Mr Linux 1 out of 4 VMs on Azure runs Linux, and getting larger every day • 28.9% of All VMs are Linux • >50% of new VMs
  • 40. Azure Data Lake Big Data made easy Analytics on any data, any size Easier and more productive for all users Enterprise-ready
  • 41. Azure Data Lake Big Data made easy Analytics on any data, any size Easier and more productive for all users Enterprise-ready
  • 42. Petabyte size files and Trillions of objects • Store data in it’s native format • PB sized files, 200x larger than anyone else • Scalable throughput for massively parallel analytics • No need to redesign application or reparation data at higher scale TBs EBs Store
  • 43. Any type of analytics • Batch, interactive, streaming, machine learning • Allows for exploratory analytics over data • Analyze with Hadoop and Microsoft solutions Cortana Intelligence Suite YARN U-SQL Analytics HDInsight HDFS Store Hive R Server
  • 44. Start in seconds, Scale instantly, Pay per job with Analytics • Process big data jobs in 30 seconds • No infrastructure to worry about (no servers, no VMs, no clusters) • Instantly scale analytic units up or down (processing power) • Architected for cloud scale and performance • Frees you up to focus only on your business logic
  • 45. Azure Data Lake Big Data made easy Analytics on any data, any size Easier and more productive for all users Enterprise-ready
  • 46. Easy for administrators to spin up quickly • Deploy big data projects in minutes • No hardware to install, tune, configure or deploy • No infrastructure or software to manage • Scale to tens to thousands of machines instantly
  • 47. Debug and Optimize your Big Data programs with ease • Deep integration with Visual Studio, Visual Studio Code, Eclipse, & IntelliJ • Easy for novices to write simple queries • Integrated with U-SQL, Hive, Storm, and Spark • Actively offers recommendations to improve performance and reduce cost • Playback visually displays job run
  • 48. Develop massively parallel programs with simplicity • U-SQL: a simple and powerful language that’s familiar and easily extensible • Unifies the declarative nature of SQL with expressive power of C# • Leverage existing libraries in .NET languages, R and Python • Massively parallelize code on diverse workloads (ETL, ML, image tagging, facial detection)
  • 49. Query data where it lives Easily query data in multiple Azure data stores without moving it to a single store Benefits • Avoid moving large amounts of data across the network between stores (federated query/logical data warehouse) • Single view of data irrespective of physical location • Minimize data proliferation issues caused by maintaining multiple copies • Single query language for all data • Each data store maintains its own sovereignty • Design choices based on the need • Push SQL expressions to remote SQL sources • Filters • Joins U-SQL Query Query Azure Storage Blobs Azure SQL in VMs Azure SQL DB Azure Data Lake Analytics Azure SQL Data Warehouse Azure Data Lake Storage
  • 50. Easy for data scientists with familiar R language R Server for HDInsight • Largest portable R parallel analytics library • Terabyte-scale machine learning—1,000x larger than in open source R • Up to 100x faster performance using Spark and optimized vector/math libraries • Enterprise-grade security and support *Applies to HDInsight only
  • 51. Azure Data Lake Big Data made easy Analytics on any data, any size Easier and more productive for all users Enterprise-ready
  • 52. Highest availability guarantee in the industry for peace of mind • Managed, monitored and supported by Microsoft • Enterprise-leading SLA— 99.9% uptime • No IT resources needed for upgrades and patching • Microsoft monitors your deployment so you don’t have to 99.9% SLA
  • 53. Azure Regions 38 Regions Worldwide, 32 Generally Available  100+ datacenters  Top 3 networks in the world  2.5x AWS, 7x Google DC Regions  G Series – Largest VM in World, 32 cores, 448GB Ram, SSD…
  • 54. Always encrypted, Role-based security & Auditing • Always encrypted; in motion using SSL, and at rest using keys in Azure Key Vault • Single sign-on, multi-factor authentication and seamless integration of on-premises identities with Active Directory • Fine-grained POSIX-based ACLs for role-based access controls • Auditing every access / configuration change
  • 55. Lower total cost of ownership • No hardware • Hadoop support included with Azure support • Pay only for what you use • Independently scale storage and compute • No need to hire specialized operations team • 63% lower total cost of ownership than on-premises* *IDC study “The Business Value and TCO Advantage of Apache Hadoop in the Cloud with Microsoft Azure HDInsight”
  • 57. Recognized by top analysts Forrester Wave for Big Data Hadoop Cloud • Named industry leader by Forrester with the most comprehensive, scalable, and integrated platforms* • Recognized for its cloud-first strategy that is paying off* *The Forrester WaveTM: Big Data Hadoop Cloud Solutions, Q2 2016.
  • 58. Q & A ? James Serra, Big Data Evangelist Email me at: JamesSerra3@gmail.com Follow me at: @JamesSerra Link to me at: www.linkedin.com/in/JamesSerra Visit my blog at: JamesSerra.com (where this slide deck is posted under the “Presentations” tab)

Notas del editor

  1. Think of big data as all data, no matter what the volume, velocity, or variety. The simple truth is a traditional on-prem data warehouse will not handle big data. So what is Microsoft’s strategy for building a big data solution? And why is it best to have this solution in the cloud? That is what this presentation will cover. Be prepared to discover all the various Microsoft technologies and products from collecting data, transforming it, storing it, to visualizing it. My goal is to help you not only understand each product but understand how they all fit together, so you can be the hero who builds your companies big data solution.
  2. Fluff, but point is I bring real work experience to the session
  3. Key goal of slide: To convey what every IT person knows: The data warehouse and what’s it for. Then we set-up the Gartner quote to say that there is a tipping point. End the slide with a question: Why is it at a tipping point?   Slide talk track: What is the “traditional” data warehouse? IT professionals know this well. A data warehouse or an enterprise data warehouse is a database that was designed specifically for data analysis. It is the single source of truth or the central repository for all data in the company. This means disparate data in the company coming from your transactional systems, your ERP, CRM or Line of Business applications would all be extracted, transformed, and cleansed and put into the warehouse. It was built so that the people who is accessing the warehouse using BI tools will be accessing data that has been provisioned by IT and represent accurate data sanctioned by the company. However, this traditional data warehouse is reaching an inflection point. Gartner in their analysis of the state of data warehousing noted that it is reaching the most significant tipping point since it’s inception. The question is why? What is going on?
  4. Data is now the key strategic business asset. Every device, every customer, every activity – everything that’s happening in the world around us - is producing incredibly rich data that can help us create new experiences, new efficiencies, new business models and even new inventions. Leveraging this data can be the differentiator for your business. For example, IDC estimates companies that are leaders in using data assets to their advantage will capture $1.6 trillion more in business value than those that lag behind.   While data is pervasive, actionable intelligence from data is elusive. Our customers want to transform data to intelligent action and reinvent their business processes. To do this they need to more easily analyze massive amounts of data – so they can move from seeing “what happened” and understanding “why it happened” to predicting “what will happen” and ultimately, knowing “what should I do”. Only then can they create the intelligent enterprise.
  5. Result: Used across Microsoft in Office, Xbox Live, Azure, Windows, Bing and Skype Supports ten thousand developers running experimentations Manages exabytes of data https://www.microsoft.com/en-us/Investor/earnings/FY-2016-Q4/press-release-webcast
  6. Everything: technology, cost, culture
  7. Our portfolio of products provides customers with the power to deploy the solution that suits their business needs. Your choice of platform, whether on-premises, hybrid or private or public cloud, doesn’t limit you now or in the future. Migrating or expanding becomes an easy process and doesn’t require excessive downtime or introduce potential threats to your business success. With Microsoft, you can seamlessly scale up to larger processing and storage capabilities, or scale out by adding additional servers in parallel arrangement. T: SQL Server is a trusted market leader, and it’s the cornerstone of our data warehouse offering.
  8. Reliable Open Source analytics with an Industry leading SLA HDInsight allows you to easily spin up enterprise-grade open source cluster types guaranteed with the industry’s best 99.9% SLA and 24/7 support. We guarantee this SLA for the entire big data solution, not just the VM instances. HDInsight is architected for full redundancy and high availability including head node replication, data geo-replication, and built-in standby NameNode making HDInsight resilient to critical failures not addressed in standard Hadoop implementations. Azure also offers cluster monitoring and 24x7 enterprise support backed by Microsoft and Hortonworks with 37 combined committers for Hadoop core, more than all other managed cloud providers combined to support your deployment and the ability to fix and commit code back to Hadoop. Enterprise Grade Security & Monitoring HDInsight protects your data assets and easily extends your on-premise security and governance controls to the cloud. We feature single sign-on (SSO), multi-factor authentication and seamless management of millions of identities through Azure Active Directory. You can authorize users and groups with fine-grained access control policies over all your enterprise data with Apache Ranger. HDInsight meets HIPAA, PCI, SOC compliance, ensuring your enterprise data assets are always protected with the highest security and regulatory compliance. To ensure the highest level of business continuity, HDInsight extends capabilities for alerting, monitoring, defining pre-emptive actions, and enhanced workload protection through native integration with Azure Operations Management Suite (OMS). Most Productive platform for developers and scientists HDInsight offers developers tailored experiences through rich productivity suites for Hadoop & Spark with integrated development environments using Visual Studio, Eclipse, and IntelliJ supporting Scala, Python, R, Java, and .Net. HDInsight gives data scientists the ability to create narratives that combine code, statistical equations, and visualizations that tell a story about the data through integration to the two most popular notebooks: Jupyter and Zeppelin. HDInsight is also the only managed cloud Hadoop solution with integration to Microsoft R Server. Multi-threaded math libraries and transparent parallelization in R Server means handling up to 1000x more data and up to 50x faster speeds than open source R—helping you train more accurate models for better predictions than previously possible. Cost effective cloud scale HDInsight has decoupled compute and storage, enabling you to cost-effectively scale workloads up or down, independent of storage. Local storage can still be used for caching and fast I/O. Spark and interactive Hive users can choose SSD memory for interactive performance; while Kafka users can retain all streaming data in premium managed disks. You only pay for the compute and storage you use and are given the ability to choose any Azure VM types that enables the best utilization of resources. A recent study showed HDInsight delivering 63% lower TCO than deploying Hadoop on premises over 5 years.* Integration with leading Productivity Applications In the broader ecosystem for Hadoop, there is a thriving market of independent software vendors (ISVs) who provide value added solutions. Through a unique design where every cluster is extended with edge nodes and script action, HDInsight lets customers spin up Hadoop and Spark clusters pre-integrated and pre-tuned with any ISV application out-of-the-box. Datameer, Cask, AtScale, StreamSets are few such applications, which are very popular on the HDInsight platform today. Easy for administrators to manage With HDInsight, administrators can deploy Hadoop in the cloud without buying new hardware or incurring other up-front costs. There’s also no time-consuming installation or set up. There is also no need to patch the operating system or upgrade the Hadoop versions. Azure does it for you. Launch your first cluster in minutes.
  9. Petabyte size files and Trillions of objects: With Azure Data Lake Store your organization can analyze all of its data in a single place with no artificial constraints. Your Data Lake Store can store trillions of files where a single file can be greater than a petabyte in size which is 200x larger than other cloud stores. This makes Data Lake Store ideal for storing any type of data including massive datasets like high-resolution video, genomic and seismic datasets, medical data, and data from a wide variety of industries. Scalable throughput  for massively parallel analytics: Without redesigning your application or repartitioning your data at higher scale, Data Lake Store scales throughput to support any size of analytic workload. It provides massive throughput to run analytic jobs with 1,000+ concurrent executors that read and write hundreds of terabytes of data efficiently. HDFS for the Cloud: Microsoft Azure Data Lake Store supports any application that uses the open Apache Hadoop Distributed File System (HDFS) standard. By supporting HDFS, you can easily migrate your existing Hadoop and Spark data to the cloud without recreating your HDFS directory structure. Always encrypted, Role-based security & Auditing: Data Lake Store protects your data assets and extends your on-premises security and governance controls to the cloud easily.  Data is always encrypted; in motion using SSL, and at rest using service or user managed HSM-backed keys in Azure Key Vault.  Capabilities such as single sign-on (SSO), multi-factor authentication and seamless management of millions of identities is built-in through Azure Active Directory. You can authorize users and groups with fine-grained POSIX-based ACLs for all data in the Store enabling role-based access controls. Finally, you can meet security and regulatory compliance needs by auditing every access or configuration change to the system. Enterprise-grade Support: We guarantee a 99.9% enterprise-grade SLA and 24/7 support for your big data solution.
  10. Start in seconds, Scale instantly, Pay per job: Our on-demand service will have you processing Big Data jobs within 30 seconds. There is no infrastructure to worry about because there are no servers, VMs, or clusters to wait for, manage or tune. You can instantly scale the analytic units (processing power) from one to thousands for each job. You only pay for the processing used per job. Develop massively parallel programs with simplicity: U-SQL is a simple, expressive, and extensible language that allows you to write code once and automatically have it be parallelized for the scale you need. You can process petabytes of data for diverse workload categories such as ETL, machine learning, cognitive science, machine translation, imaging processing, and sentiment analysis by using U-SQL and leveraging existing libraries written in .NET languages, R, or Python.. Debug and Optimize your Big Data programs with ease: Debugging failures in cloud distributed programs are now as easy as debugging a program in your personal environment. Our execution environment actively analyzes your programs as they run and offers recommendations to improve performance and reduce cost. For example, if you requested 1000 AUs for your program and only 50 AUs were needed, the system would recommend that you only use 50 AUs resulting in a 20x cost savings. Virtualize your analytics: The power to act on all your data with optimized data virtualization of your relational sources such as Azure SQL Server on VMs, Azure SQL Database, and Azure SQL Data Warehouse. Queries are automatically optimized by moving processing close to the source data, without data movement, thereby maximizing performance and minimizing latency. Enterprise-grade Security, Auditing and Support: Extend your on-premises security and governance controls to the cloud for meeting your security and regulatory compliance needs. Capabilities such as single sign-on (SSO), multi-factor authentication and seamless management of millions of identities is built-in through Azure Active Directory. Role Based Access control, and the ability to audit all processing and management operations are on by default. We guarantee a 99.9% enterprise-grade SLA and 24/7 support for your big data solution.
  11. 22
  12. We are planning to release a preview of this functionality early next year as part of SQL Server V.Next CTPs, exact release dates are still in flux. By preview early next year PolyBase will support Teradata, Oracle, SQL Server, MongoDB, Hadoop and Azure blob storage (not MySQL!). We will continue to add more sources until GA. http://demo.sqlmag.com/scaling-success-sql-server-2016/integrating-big-data-and-sql-server-2016 When it comes to key BI investments we are making it much easier to manage relational and non-relational data with Polybase technology that allows you to query Hadoop data and SQL Server relational data through single T-SQL query. One of the challenges we see with Hadoop is there are not enough people out there with Hadoop and Map Reduce skillset and this technology simplifies the skillset needed to manage Hadoop data. This can also work across your on-premises environment or SQL Server running in Azure.
  13. Comparison of IoT Hub and Event Hubs: https://azure.microsoft.com/en-us/documentation/articles/iot-hub-compare-event-hubs/
  14. Azure Stream Analytics is a cost effective event processing engine that helps uncover real-time insights from devices, sensors, infrastructure, applications, and data. It will enable various opportunities including Internet of Things (IoT) scenarios such as real-time fleet management or gaining insights from devices like mobile phones and connected cars. Deployed in the Azure cloud, Stream Analytics has elastic scale where resources are efficiently allocated and paid for as requested. Developers are given a rapid development experience where they describe their desired transformations in SQL and the system abstracts the complexities of the parallelization, distributed computing, and error handling from them. Looking forward into H2 FY15, Stream Analytics will become generally available after previewing at TechEd EMEA 2014.
  15. Microsoft’s Big Data vision in the cloud is to enable organizations to solve large, complex problems end-to-end, from storing and managing TBs of data without investing in hardware and software, to seamless integration with the 1 billion users of Excel. As part of this vision, Microsoft offers Azure Machine Learning, designed to democratize the complex task of advanced analytics. Advanced analytics is using products like Azure Machine Learning to find new and actionable insights that traditional approaches to business intelligence are unlikely to discover. An easy way to think about this is thinking about a dashboard. Today when confined by only BI tools without a connection to machine learning, it is solely the job of the human looking at the spreadsheet to gain insights and react to the data. But a human can only consume so many variables. A computer, on the other hand, can consume a great deal more variables to provide much deeper insight on the data. Humans can then react to the data to make decisions that drive competitive advantage, as well as program the computer further to recognize important patterns in the future. This is why we say beyond business intelligence – machine intelligence. The accessibility of our solution starts with set up. Previously you needed to provision your workspace on-premises for machine learning, also thinking about server space and a host of other considerations. Today you can get started with just a browser. With only an Azure subscription, you can take advantage of the full functionality of Azure Machine Learning within minutes. Taking a test drive is even easier, click Get Started off azure.com/ml and with simply a Microsoft ID you’re off to the races. Another limit with other machine learning solutions are siloed environments that only allow for one programming language or make changing from one algorithm to another time consuming and complex. With Azure ML, you can experience the power of choice. That choice expands to language, with both Python and R being first class citizens of Azure ML, or algorithm. You can choose from hundreds of algorithms, including business-tested ones running our Microsoft businesses today. And swapping out algorithms to land on the right one for you is done with a click. Additionally you can drop in custom R and Python code – your “special sauce” – and mix and match that with the other options in the tool. Most revolutionary of all you can deploy solutions in minutes as a web service, which is simply a url which can connect to any data, anywhere – including on-premises or in another cloud environment. The ability to put a model into production almost immediately, as well as revise it easily, is unique to Microsoft and allows companies to stay on top of the changing business landscape more effectively than is offered by any other provider today. We even take that a step further, allowing model developers to connect to the world with our Machine Learning Marketplace, where they can publish finished solutions and APIs with their own brand and business model. Developers can also discover machine learning solutions there without any machine learning skills needed – the data science is inside. Check it out at https://datamarket.azure.com/.
  16. Azure Data Factory is a cloud service for creating, managing, and monitoring the production of trusted information from on-premises and cloud data sources using transformative analytics at scale. Data Factory can be used in solutions to gain insights from operational and service health telemetry data, analyze customer actions to determine an optimal targeted marketing strategy, or predict customer churn from customer profile and service log data. Instead of writing hard-to-manage custom code to wire together a data warehouse with Hadoop, NoSQL, and SaaS, use Data Factory to quickly create and deploy highly available data processing pipelines, significantly cutting your time to solution and your operational costs. Get a single monitoring view of all of your data processing pipelines along with data lineage and service health. Bring together on-premises data like SQL Server and cloud data like Azure SQL Database, Blobs, and Tables with the transformative analytics of HDInsight (Hive, Pig, MapReduce, custom .NET code), and even Azure Machine Learning, to produce trusted information that is easily consumed by BI tools or applications. Looking forward into H2 FY15, Data Factory will become generally available after previewing at TechEd EMEA 2014.
  17. Power BI Desktop is a self-service BI tool designed to allow users to pull data together from multiple different data sources. Transform and clean that data. Model and add custom calculations. And then visually explore and create interactive reports that can be easily published and shared through the Power BI service. In addition – you can now create your own custom visualizations though our open source visualization framework. More information available at powerbi.com/visuals
  18. Power BI dashboards With updates to Power BI customers can now see all their data through a single pane of glass. Live Power BI dashboards show visualizations and KPIs from data that reside both on-premises and in the cloud, providing a consolidated view across their business regardless of where their data lives. You can then explore their data further by drilling through the dashboard into the underlying reports, discovering new insights that they can pin back to the dashboard to monitor performance going forward.
  19. Natural Language Interface - With Power BI we continue to find new ways to simplify how people analyze and gain insight from data, providing industry leading features such as natural language query. Natural language query provides users with an easier way to interact with their data, allowing them to type questions of their data and receive answers in the form of live visualizations. Power BI integration with Cortana allows you to now ask these question directly from Cortana and to have answers from your Power BI data surfaced to you by Cortana. These data driven answers can range from simple numeric values (“revenue for the last quarter”), charts (“revenue over time”), maps (“revenue by region”) or data represented through any of the other Power BI data visualizations. Combined with the Cortana Analytics suite, this opens up amazing new opportunities to use Cortana to enable your business, and your customers' businesses, to get things done in more helpful, proactive, and natural ways.   Quick Insights - providing a new ways to help users find hidden insights in their data. The new Quick Insights feature allows users to automatically scan and detect patterns and trends in the data that they publish to Power BI. Through a partnership with Microsoft Research, the Quick Insights feature uses a growing list of algorithms to automatically discover and visualize correlations, outliers, trends, seasonality, change points in trends, and other factors in your data in seconds.  
  20. Animation set to loop (replace /Build walk in ?), Add session id to top Bot Framework provides everything you need to build and connect intelligent bots that interact naturally wherever your users are talking, from text/sms to Skype, Slack, Office 365 mail and other popular services. Bot Framework consists of three main components: Bot Connector, Bot Builder, and Bot Directory
  21. At Microsoft, we’ve been offering APIs for a very long time across the company.  In delivering Microsoft Cognitive Services API, we started with 4 last year at /build (2015); added 7 more last December, and today (May 2016) we have 22 APIs in our collection.      Cognitive Services are available individually or as a part of the Cortana Intelligence Suite, formerly known as Cortana Analytics, which provides a comprehensive collection of services powered by cutting-edge research into machine learning, perception, analytics and social bots. These APIs are powered by Microsoft Azure. Developers and businesses can use this suite of services and tools to create apps that learn about our world and interact with people and customers in personalized, intelligent ways.  
  22. Key points: Summarize key benefits for Azure Analysis Services Talk track: As already mentioned, Azure Analysis Services is based on the proven analytics engine in SQL Server 2016 Analysis Services, that has helped organizations turn complex data into a trusted, single source of truth for years. This means that BI professionals who are familiar with SQL Server Analysis Services, tabular models can get started quickly and do not need to learn new tools or skills. And with the power of the cloud, BI professionals do not need to manage infrastructure on-premises. They can easily deploy the BI solution and benefit from the scalability of the cloud. Organizations store data in the cloud and on-premises. Azure Analysis Services is built for hybrid data. Data can be access in the cloud, on-premises or a combination of both, enabling a hybrid solution. So - customers do not have to move on-premises data to the cloud. And last but not least. Azure Analysis Services enables interactive data visualization over billions of rows of data and as it supports BI industry standards such as XML/A and MDX, business users can access data using their preferred data visualization tool. Whether it is Power BI, Excel or other major data visualization tools. To summarize, Azure Analysis Services is simple to use – it is easy to get started, you can use your existing skills to create BI semantic models, and your favorite data visualizations tools to analyze your data.
  23. Slide objective Show broad commitment to R by preserving freely available, enhanced editions, Windows and SQL Server editions and R Server editions for leading EDWs, Linux and Hadoop platforms. Differentiate free, open editions from commercial by mentioning availability of commercial 24x7 support, and enhancements to support very large scale data analytics at speed. Talking points Notes
  24. Microsoft Azure DocumentDB is the highly-scalable NoSQL document database-as-a-service that enables query over schema-free data and multi-document transaction processing helps deliver configurable and reliable performance and enables rapid development DocumentDB is the right solution for applications that run in the cloud when predictable throughput, low latency, and flexible query are key. Fully managed PaaS database service backed by the power of Microsoft Azure. Unlike many other NoSQL offers, DocumentDB was built for the cloud to perform and scale in a multi-tenant environment. Cluster administration, replication, and other management functions are handled for the customer automatically. DocumentDB is backed by a 99.95% availability SLA (at GA) to provide consistent, reliable performance. Application controlled schema with massive scale-out enables iterative development and evolving data models. DocumentDB supports a schema-free data model where the application defines the data model. This supports modern application development scenarios where applications are developed iteratively with many versions supported concurrently and data models continuously evolve. Automatic indexing enables robust querying over schema-free data. DocumentDB is the first of its kind to offer SQL over schema-free JSON data and multi-document transactional processing. Integrated transactional JavaScript processing + tunable consistency enable high performance application experiences. DocumentDB supports stored procedures, triggers, and user-defined functions. It also supports tunable consistency with well-defined click stops to enable developers to tune database performance based on the application’s needs. The key scenarios for DocumentDB are the following: Emitting telemetry and logging data Storing/querying event and workflow data Persisting device and app configuration data User generated content Scalable, iterative app development
  25. Petabyte size files and Trillions of objects: With Azure Data Lake Store your organization can analyze all of its data in a single place with no artificial constraints. Your Data Lake Store can store trillions of files where a single file can be greater than a petabyte in size which is 200x larger than other cloud stores. This makes Data Lake Store ideal for storing any type of data including massive datasets like high-resolution video, genomic and seismic datasets, medical data, and data from a wide variety of industries. Scalable throughput  for massively parallel analytics: Without redesigning your application or repartitioning your data at higher scale, Data Lake Store scales throughput to support any size of analytic workload. It provides massive throughput to run analytic jobs with 1,000+ concurrent executors that read and write hundreds of terabytes
  26. Start in seconds, Scale instantly, Pay per job: Our on-demand service will have you processing Big Data jobs within 30 seconds. There is no infrastructure to worry about because there are no servers, VMs, or clusters to wait for, manage or tune. You can instantly scale the analytic units (processing power) from one to thousands for each job. You only pay for the processing used per job.
  27. Debug and Optimize your Big Data programs with ease: Debugging failures in cloud distributed programs are now as easy as debugging a program in your personal environment. Our execution environment actively analyzes your programs as they run and offers recommendations to improve performance and reduce cost. For example, if you requested 1000 AUs for your program and only 50 AUs were needed, the system would recommend that you only use 50 AUs resulting in a 20x cost savings.
  28. Develop massively parallel programs with simplicity: U-SQL is a simple, expressive, and extensible language that allows you to write code once and automatically have it be parallelized for the scale you need. You can process petabytes of data for diverse workload categories such as ETL, machine learning, cognitive science, machine translation, imaging processing, and sentiment analysis by using U-SQL and leveraging existing libraries written in .NET languages, R, or Python..
  29. With Microsoft Azure HDInsight, Microsoft R Server is now available as an option when you create HDInsight clusters in Azure. This new capability provides data scientists, statisticians, and R programmers with on-demand access to scalable, distributed methods of analytics on HDInsight. Clusters can be sized to the projects and tasks at hand and torn down when they're no longer needed. Since they're part of Azure HDInsight, these clusters come with enterprise-level 24/7 support, an SLA of 99.9% uptime, and the flexibility to integrate with other components in the Azure ecosystem. R Server on HDInsight provides the latest capabilities for R-based analytics on datasets of virtually any size loaded to either Azure Blob or Data Lake storage. Since R Server is built on open source R, the R-based applications you build can leverage any of the 8000+ open source R packages, as well as the routines in ScaleR, Microsoft’s big data analytics package that's included with R Server. The edge node of a cluster provides a convenient place to connect to the cluster and to run your R scripts. With an edge node, you have the option of running ScaleR’s parallelized distributed functions across the cores of the edge node server. You also have the option to run them across the nodes of the cluster by using ScaleR’s Hadoop Map Reduce or Spark compute contexts. The models or predictions that result from analyses can be downloaded for use on-premises. They can also be operationalized elsewhere in Azure, such as through an Azure Machine Learning Studio web service.
  30. https://azure.microsoft.com/en-us/regions/
  31. Always encrypted, Role-based security & Auditing: Data Lake Store protects your data assets and extends your on-premises security and governance controls to the cloud easily.  Data is always encrypted; in motion using SSL, and at rest using service or user managed HSM-backed keys in Azure Key Vault.  Capabilities such as single sign-on (SSO), multi-factor authentication and seamless management of millions of identities is built-in through Azure Active Directory. You can authorize users and groups with fine-grained POSIX-based ACLs for all data in the Store enabling role-based access controls. Finally, you can meet security and regulatory compliance needs by auditing every access or configuration change to the system.
  32. 1) Copy source data into the Azure Data Lake Store (twitter data example) 2) Massage/filter the data using Hadoop (or skip using Hadoop and use stored procedures in SQL DW/DB to massage data after step #5) 3) Pass data into Azure ML to build models using Hive query (or pass in directly from Azure Data Lake Store) 4) Azure ML feeds prediction results into the data warehouse 5) Non-relational data in Azure Data Lake Store copied to data warehouse in relational format (optionally use PolyBase with external tables to avoid copying data) 6) Power BI pulls data from data warehouse to build dashboards and reports 7) Azure Data Catalog captures metadata from Azure Data Lake Store and SQL DW/DB 8) Power BI and Excel can pull data from the Azure Data Lake Store via HDInsight 9) To support high concurrency if using SQL DW, or for easier end-user data layer, create an SSAS cube
  33. Get it at //aka.ms/forresterwave