
How does Microsoft solve Big Data?


So you have a handle on what Big Data is and how you can use it to find business value in your data. Now you need an understanding of the Microsoft products that can be used to create a Big Data solution. Microsoft has many pieces of the puzzle, and in this presentation I will show how they fit together. How does Microsoft enhance and add value to Big Data? From collecting data and transforming it to storing and visualizing it, I will show you Microsoft’s solutions for every step of the way.


  1. 1. How does Microsoft solve Big Data? James Serra Big Data Evangelist Microsoft JamesSerra3@gmail.com JamesSerra.com
  2. 2. Other Presentations • Building an Effective Data Warehouse Architecture: Reasons for building a DW and the various approaches and DW concepts (Kimball vs Inmon) • Building a Big Data Solution (Building an Effective Data Warehouse Architecture with Hadoop, the cloud and MPP): Explains what Big Data is, its benefits including use cases, and how Hadoop, the cloud, and MPP fit in • Finding business value in Big Data (What exactly is Big Data and why should I care?): Very similar to “Building a Big Data Solution” but the target audience is business users/CxO instead of architects • How does Microsoft solve Big Data? Covers the Microsoft products that can be used to create a Big Data solution • Modern Data Warehousing with the Microsoft Analytics Platform System: The next step in data warehouse performance is APS, an MPP appliance • Power BI, Azure ML, Azure HDInsight, Azure Data Factory, etc.: Deep dives into the various Microsoft Big Data related products
  3. 3. About Me • Business Intelligence Consultant, in IT for 30 years • Microsoft, Big Data Evangelist • Worked as desktop/web/database developer, DBA, BI and DW architect and developer, MDM architect, PDW/APS developer • Been perm, contractor, consultant, business owner • Presenter at PASS Business Analytics Conference and PASS Summit • MCSE: Data Platform and Business Intelligence • MS: Architecting Microsoft Azure Solutions • Blog at JamesSerra.com • Former SQL Server MVP • Author of book “Reporting with Microsoft SQL Server 2012”
  4. 4. I tried understanding all the Microsoft Big Data products… And ended up passed-out drunk in a Denny’s parking lot Let’s prevent that from happening…
  5. 5. Agenda • Collect + Manage • Transform + Analyze • Visualize + Decide • Access Methods • Product Groupings • Modern Data Warehouse • Sample architectures
  6. 6. Microsoft’s portfolio of products • Windows • Visual Studio • .NET • Azure, HDInsight • Power BI: Power Query, Power Map, PowerPivot, Power View • Azure ML • APS • SQL Server, Azure SQL DB • SCOM • SSAS, SSRS, SSIS • Excel • Report Builder • PerformancePoint • SharePoint • DQS • MDS • Data Lake • SQL DW. Microsoft has all the Lego pieces to build anything you want, but the difficulty is determining how the pieces fit together
  7. 7. The Microsoft data platform • Mobile, Reports, Natural language query, Dashboards, Applications • Streaming, Relational, Internal & external, Non-relational, NoSQL • Orchestration, Machine learning, Modeling, Information management, Complex event processing • Stages: Transform + analyze, Visualize + decide, Collect + manage, Data
  8. 8. Secure, reliable performance Increase speed across all your data workloads Capture any data: structured, unstructured, and streaming Scale your platform quickly to meet changing demands Collect and manage diverse data types with breakthrough speed Collect + manage Transform + analyze Visualize + decide Collect + manage Data
  9. 9. SQL Server options • Azure SQL Database has a max database size of 500GB • Potential total volume size of up to 64 TB
  10. 10. Our customer challenges: 1. Increasing data volumes 2. Real-time business requests 3. New data sources and types 4. Cloud-born data • Data sources • Non-Relational Data
  11. 11. Parallelism • MPP - Massively Parallel Processing: uses many separate CPUs running in parallel to execute a single program; Shared Nothing: each CPU has its own memory and disk (scale-out); segments communicate using a high-speed network between nodes • SMP - Symmetric Multiprocessing: multiple CPUs used to complete individual processes simultaneously; all CPUs share the same memory, disks, and network controllers (scale-up); all SQL Server implementations up until now have been SMP; mostly, the solution is housed on a shared SAN
  12. 12. Analytics Platform System (APS) for Big Data: Pre-Built Hardware + Software Appliance • Co-engineered with HP, Dell, Quanta • Scale-out, up to 100x performance increase • Optional Hadoop region • Appliance installed in 1-2 days • Support - Microsoft provides first call support • Hardware partner provides onsite break/fix support • Plug and Play • Built-in Best Practices • Save Time • On-Premise Solution
  13. 13. Office 365 Azure
  14. 14. Introducing Azure Data Lake Store (diagram labels: YARN, U-SQL, Analytics Service, HDInsight, HDFS, Store) • No fixed limits on file size (PB file sizes) • Designed for diversity of analytic workloads • Accessible to all HDFS compliant analytic applications (Hortonworks, Cloudera, MapR) • Managed, monitored, and supported by Microsoft • Enterprise grade features around security, compliance & management
  15. 15. Support HBase as NoSQL columnar database on Azure Blobs Support Storm as stream processing Hadoop in Azure (HDP under the covers) Data Node Data Node Data Node Data Node Task Tracker Task Tracker Task Tracker Task Tracker Name Node Job Tracker HMaster Coordination Region Server Region Server Region Server Region Server HBase as a columnar NoSQL transactional database running on Azure Blobs Storm as a streaming service for near real time processing Hadoop 2.4 support for 100x query gains on Hive queries Mahout support for machine learning + Hadoop Graphical User Interface for HIVE queries
  16. 16. Microsoft Azure Data Lake YARN U-SQL Analytics Service HDInsight Store HDFS Announcing Azure Data Lake Analytics Service Distributed analytics service Dynamically scales to meet your business needs Productive day one with industry leading development tools (for novices & experts) Analytics over all data (unstructured, semi- structured, structured) U-SQL: simple and familiar, easily extensible Hive coming soon Built on open standards (YARN)
  17. 17. Data sources • Descriptive Analytics: What happened? • Diagnostic Analytics: Why did it happen? • Predictive Analytics: What will happen? • Prescriptive Analytics: How can we make it happen?
  18. 18. Azure Stream Analytics: Process real-time data in Azure • Consumes millions of real-time events from Event Hub collected from devices, sensors, infrastructure, and applications • Performs time-sensitive analysis using SQL-like language against multiple real-time streams and reference data • Outputs to persistent stores, dashboards or back to devices • Devices: Point of Service Devices, Self Checkout Stations, Kiosks, Smart Phones, Slates/Tablets, PCs/Laptops, Servers, Digital Signs, Diagnostic Equipment, Remote Medical Monitors, Logic Controllers, Specialized Devices, Thin Clients, Handhelds, Security, POS Terminals, Automation Devices, Vending Machines, Kinect, ATM
  19. 19. Microsoft R Open: • Free and open source R distribution • Enhanced and distributed by Revolution Analytics. Microsoft R Server: • Secure, Scalable and Supported Distribution of R • With proprietary components created by Revolution Analytics
  20. 20. Fully managed database service built on a native JSON data model Application controlled schema with massive scale-out enables iterative development and evolving data models Automatic indexing enables robust querying over schema-free data Integrated transactional JavaScript processing + tunable consistency enable high performance application experiences Azure DocumentDB
  21. 21. SQL Server on Linux (Preview today, GA in mid-2017) • Red Hat - Microsoft Partnership (Nov 2015) • Microsoft joins Eclipse Foundation (Mar 2016) • HDInsight PaaS on Linux GA (Sep 2015) • C:\Users\markhill> root@localhost: # bash • Azure Marketplace: 60% of all images in Azure Marketplace are based on Linux/OSS • In partnership with the Linux Foundation, Microsoft releases the Microsoft Certified Solutions Associate (MCSA) Linux on Azure certification • 493,141,677 ?????? • Microsoft Open Source Hub • Ross Gardler: President Apache Software Foundation • Wim Coekaerts: Oracle’s Mr Linux • 1 out of 4 VMs on Azure runs Linux, and getting larger every day • 28.9% of All VMs are Linux • >50% of new VMs
  22. 22. Connect, combine, and refine any data Create data marts and publish reports Build and test predictive models Curate and catalog any data Transform + analyze Transform + analyze Visualize + decide Collect + manage Data Transform and analyze data for anyone to access anywhere
  23. 23. Make sense of disparate data and prepare it for analysis Connect, combine, and refine any data Integration, Data Quality and Master Data Services • Rich support for ETL tasks • Data cleansing and matching • Manage master data structures Connect any data and all volumes in real time • Social data • SAP and Dynamics data • Machine data
  24. 24. Query aggregated data and build reports Create data marts and reports Reporting services • Create and publish interactive reports • Consolidate reporting management • Enable reporting capabilities for anyone Analysis services • Single semantic model • 100x faster analysis with in-memory columnstore • Manage user-created BI content
  25. 25. Use the power of machine learning to predict future trends or behavior Build and test predictive models • HDInsight • SQL Server VM • SQL DB • Blobs and tables Publish API in minutes Devices Applications Dashboards Data Microsoft Azure Machine Learning API Storage space Web Microsoft Azure portal Workspace ML Studio Business problem Business value Modeling Deployment • Desktop files • Excel spreadsheet • Other data files on PC Cloud Local
  26. 26. Azure Machine Learning Get started with just a browser Requires no provisioning; simply log on to your Azure subscription or try it for free off azure.com/ml Experience the power of choice Choose from hundreds of algorithms and packages from R and Python or drop in your own custom code Take advantage of business-tested algorithms from Xbox and Bing Deploy solutions in minutes With the click of a button, deploy the finished model as a web service that can connect to any data, anywhere Connect to the world Brand and monetize solutions on our global Machine Learning Marketplace https://datamarket.azure.com/ Beyond business intelligence – machine intelligence Microsoft Azure Machine Learning Studio Modeling environment (shown) Microsoft Azure Machine Learning API service Model in production as a web service Microsoft Azure Machine Learning Marketplace APIs and solutions for broad use
  27. 27. Enable enterprise-wide self-service data source registration and discovery • A metadata repository that allows users to register, enrich, understand, discover, and consume data sources • Delivers differentiated value through ‒ Data source discovery, rather than data discovery ‒ Support for data from any source: structured and unstructured, on premises and in the cloud ‒ Publishing, discovery and consumption through any tool ‒ Annotation crowdsourcing: empowering any user to capture and share their knowledge, while allowing IT to maintain control and oversight
  28. 28. Azure Data Factory • Connect to relational or non-relational data that is on-premises or in the cloud • Orchestrate data movement & data processing • Publish to Power BI users as a searchable data view • Operationalize (schedule, manage, debug) workflows • Lifecycle management, monitoring • Orchestrate trusted information production in Azure • Microsoft Confidential – Under Strict NDA • C#, MapReduce, Hive, Pig, Stored Procedures, Azure Machine Learning
  29. 29. Discover, explore, and combine any data type or size, regardless of location Ask questions of data to visualize, analyze, and forecast Make faster decisions, share broadly, and access insights on any device Visualize + decide Transform + analyze Visualize + decide Collect + manage Data Visualize data and make decisions quickly using everyday tools
  30. 30. Microsoft Power BI • Analyze & Visualize in Excel • Discover & Combine in Excel • Collaborate, Get Insights, & Access Anywhere Through Office 365
  31. 31. Power BI Tools Defined • Front-end (Excel) • Data shaping and cleanup. Self-service ETL (Power Query) • Data analysis (Power Pivot) • Visualization and data discovery (Power View, Power Map, Power BI Designer) • Dashboarding (Power BI Dashboard) • Publishing and sharing (Power BI sites) • Natural language query (Power BI Q&A) • Mobile (Power BI for Mobile) • Access on-premise data (DMG, Analysis Services Connector) Power Query Power Pivot Power View Power Map Power BI Designer Power BI Dashboard Power BI Site Power BI Q&A Power BI for mobile
  32. 32. Power Query: Discover, explore, and combine any data Right from Excel, find any data: corporate, social, machine, Hadoop, open Easily merge, transform, and clean up data
  33. 33. Power BI dashboards and KPIs for monitoring the health of your business • New data visualizations and touch-optimized exploration in HTML5 • Power BI mobile apps across devices including iPad and iPhone • Support for new data sources including SalesForce.com, Dynamics CRM online and SQL Server Analysis Services • Dashboard • Tree Map
  34. 34. Q&A: Ask questions of data Build ad hoc reports with a drag-and-drop interface Look ahead to forecast where business will go Map up to 1 million rows of data in 3-D
  35. 35. Data Management Gateway (DMG) • Power View/Q&A: DMG refreshes workbook so reporting not real-time (daily frequency) and 250MB upload limit • Power Query: Reporting is real-time Analysis Services Connector (ASC) • Power BI Dashboard: Get real-time reports with ASC and SSAS Tabular DirectQuery against SQL Server or APS. Create reports with Power View (limited functionality) • You can publish Power View reports to Power BI Sites and have it use ASC (by uploading Excel file via Get Data in Power BI Dashboard) • Does not support Q&A • Can run on any domain machine • Multidimensional cubes coming soon Intranet Power BI Site PDW HDI APS DMG Metadata catalog O365 Power View/Q&A 3rd- Party Hadoop Power BI Dashboard SSAS Tabular Public Internet ASC Power Pivot workbook SQL Server
  36. 36. PolyBase Query relational and non-relational data with T-SQL
  37. 37. Use cases where PolyBase simplifies using Hadoop data Bringing islands of Hadoop data together High performance queries against Hadoop data (Predicate pushdown) Archiving data warehouse data to Hadoop (move) (Hadoop as cold storage) Exporting relational data to Hadoop (copy) (Hadoop as backup/DR, analysis, cloud use) Importing Hadoop data into data warehouse (copy) (Hadoop as staging area/data lake)
  38. 38. Consumption Experiences Data Visualization Data Analysis Data Modeling Data Discovery & ETL Data Warehouse/Big Data Microsoft Analytics Platform
  39. 39. Cortana Intelligence Suite Transform data into intelligent action Action People Automated Systems Apps Web Mobile Bots Intelligence Dashboards & Visualizations Cortana Bot Framework Cognitive Services Power BI Information Management Event Hubs Data Catalog Data Factory Machine Learning and Analytics HDInsight (Hadoop and Spark) Stream Analytics Intelligence Data Lake Analytics Machine Learning Big Data Stores SQL Data Warehouse Data Lake Store Data Sources Apps Sensors and devices Data
  40. 40. Introducing Microsoft Azure IoT Suite: Helping accelerate your business transformation • Benefits: Accelerate time-to-value by easily deploying IoT applications for the most common use cases, such as remote monitoring, asset management, and predictive maintenance • Plan and budget appropriately through a simple predictable business model • Grow and extend solutions to support millions of assets • Preconfigured solutions • Azure IoT services • Azure IoT Suite • Predictive Maintenance, Remote Monitoring, Asset Management, and more… • Addresses common scenarios • Enables you to: Connect assets, Mine data, Take action
  41. 41. Example overall data flow and Architecture • Stages: Ingest, Transform, Present & decide • Event & data producers: Web logs, IoT, Mobile Devices etc., Social Data • Event Hubs • Stream Analytics • HDInsight • Azure Data Factory • Azure SQL DB • Azure Blob Storage • Azure Machine Learning (Fraud detection etc.) • Power BI • Web dashboards • Mobile devices • DW / Long-term storage • Predictive analytics • Analytics Platform Sys.
  42. 42. BI and analytics Data management and processing Data sources Non-relational data Data enrichment and federated query OLTP ERP CRM LOB Devices Web Sensors Social Self-service Corporate Collaboration Mobile Machine learning Single query model Extract, transform, load Data quality Master data management Box software Appliances Cloud SQL Server Box software Appliances Cloud
  43. 43. Industrial automation company partnering with multinational oil company (Oil and Gas) • A leading industrial automation company that employs over 20,000 people, partnering with a leading multinational oil and gas company (one of the six oil and gas supermajors) that employs over 90,000 people • Part 1: What They Did | IoT internet-connected sensors to generate analytics for proactive maintenance • Challenge: Manage sites used for dispensing liquefied natural gas (clean fuel for commercial customers who do heavy-duty road transportation) • Built LNG refueling stations across US interstate highways • Stations are unmanned, so they built 24x7 remote management and monitoring to track diagnostics of each station for maintenance or tuning • Built internet-connected sensors embedded in 350 dispenser sites worldwide, generating tens of thousands of data points per second (temperature, pressure, vibration, etc.) • Data needs outgrew the company’s internal datacenter and data warehouse • Solution: Chose Azure HDInsight, Data Factory, SQL Database, Machine Learning • Dashboards used to detect anomalies for proactive maintenance: changes in performance of the components, energy consumption of components, component downtime and reliability • Future: Goal is to expand the program to hundreds of thousands of dispensers • IoT, Analytics
  44. 44. Industrial automation company partnering with multinational oil company • Part 2: How They Did It | IoT internet-connected sensors to generate analytics for proactive maintenance • Collect data from internet-connected sensors: tens of thousands of data points per second; interpolate time-series prior to analysis; stored raw sensor data in Blobs every 5 minutes • Use Hadoop to execute scripts and Data Factory to orchestrate: Hive and Pig scripts orchestrated by Data Factory; data resulting from scripts loaded in SQL Database; queries detect site anomalies to indicate maintenance/tuning • Produced dashboards with role-based reporting: Azure Machine Learning, SSRS, Power BI for O365; provide users with customizable interface; view current and historical data (day-to-day operations, asset performance over time, etc.); leveraged Azure Mobile Notification Hub for real-time notifications, alarms, or important events • Use Azure ML to predict: understand which pumps, run at what speeds, maximize water supply while minimizing energy use • IoT, Analytics
  45. 45. Software Company For Web Analytics (Technology) • A software company for web analytics, live chatting, targeting and business intelligence in e-business • Part 1: What They Did | Web Analytics – Traffic, trends, visitor actions + Recommendation Engine • Challenge: They build an e-business service that does site analysis, real-time monitoring of site metrics, an interactive support chat, and dynamic content builder • Needed to find the right set of products that can help them achieve this • Solution: Chose Azure HDInsight, SQL Server (with Analysis Services) • Use HDInsight to preprocess and store raw data • Use Analysis Services which generates views from HDInsight • Gives their customers self-service BI on top of these views • Web Analytics, Recommendation Engine
  46. 46. Software Company For Web Analytics • Part 2: How They Did It | Web Analytics – Traffic, trends, visitor actions + Recommendation Engine • Store data in Azure Blobs: track visitor data via JavaScript code; used for real-time tracking and statistics; HDInsight used to pre-process and store raw data • Customers of this company have self-service BI: drag and drop UI; leveraging Analysis Services, results can be represented as tables, charts, etc.; Analysis Services uses data from HDInsight as source; uses HIVE ODBC driver to connect to HIVE tables • Web Analytics, Recommendation Engine
  47. 47. Game Development Company Gaming A predominantly mobile-based game development company. While they are a mid-sized organization, they have partnered with media giants on various gaming projects Part 1: What They Did | In-game Analytics Challenge As a game development studio, they wanted to do in-game analytics to understand their players more and what they do in the games Solution Chose Azure HDInsight (MapReduce and Storm), Service Bus and also use SQL Server for reporting Switched from Amazon AWS EMR Collects telemetry and logging data to gain in-game analytics: • How many players using the game • How many players invited their friends • How far along did players get into the tutorial • How many attempts did they make on one level/stage In-game Analytics
  48. 48. Game Development Company • Part 2: How They Did It | In-game Analytics • Collect data from games in Azure Blobs: game sends telemetry/logging data as JSON files; contains every action of the user in the game; data is pushed to Azure Service Bus in real time; tens of gigabytes of data captured daily • HDInsight picks up real-time data and processes it: from Service Bus, HDInsight processes using Apache Storm and MapReduce; constantly running experiments to determine insight; A/B testing; in-game metrics and analytics; spin up 32-node cluster nightly for four hours • Output sent to SQL Server for BI: transfer data to SQL Server for BI • In-game Analytics • Service Bus • SQL Server • On-premises
  49. 49. Big Data is coming
  50. 50. Summary Understand the benefits of big data
  51. 51. Resources • The Modern Data Warehouse: http://bit.ly/1xuX4Py • Fast Track Data Warehouse Reference Architecture for SQL Server 2014: http://bit.ly/1xuX9m6 • Should you move your data to the cloud? http://bit.ly/1xuXbKU • Presentation slides for Modern Data Warehousing: http://bit.ly/1xuXcP5 • Presentation slides for Building an Effective Data Warehouse Architecture: http://bit.ly/1xuXeX4 • Hadoop and Data Warehouses: http://bit.ly/1xuXfu9 • What is the Microsoft Analytics Platform System (APS)? http://bit.ly/1xuXipO • Parallel Data Warehouse (PDW) benefits made simple: http://bit.ly/1xuXlSy • What is Advanced Analytics?
  52. 52. Q & A ? James Serra, Big Data Evangelist Email me at: JamesSerra3@gmail.com Follow me at: @JamesSerra Link to me at: www.linkedin.com/in/JamesSerra Visit my blog at: JamesSerra.com (where this slide deck will be posted)
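
As an illustration of the "How They Did It" steps on slide 44 (interpolate the sensor time series, then run queries that flag anomalies), here is a minimal Python sketch. It is not the customer's implementation, which the slides say ran as Hive and Pig scripts on HDInsight orchestrated by Data Factory. The function names, the 10-second grid, the z-score rule, and the sample readings are all hypothetical and chosen only for illustration.

```python
from bisect import bisect_left
from statistics import mean, pstdev


def interpolate(readings, step):
    """Linearly resample (timestamp, value) pairs onto a regular time grid.

    Hypothetical helper: sensors rarely report at exact intervals, so the
    series is aligned to a fixed step before any analysis.
    """
    readings = sorted(readings)
    times = [t for t, _ in readings]
    values = [v for _, v in readings]
    grid = []
    t = times[0]
    while t <= times[-1]:
        i = bisect_left(times, t)
        if times[i] == t:
            # A reading exists exactly at this grid point.
            grid.append((t, values[i]))
        else:
            # Interpolate between the neighbouring readings.
            t0, t1 = times[i - 1], times[i]
            v0, v1 = values[i - 1], values[i]
            grid.append((t, v0 + (v1 - v0) * (t - t0) / (t1 - t0)))
        t += step
    return grid


def flag_anomalies(series, threshold=2.0):
    """Return points whose z-score exceeds the (assumed) threshold."""
    values = [v for _, v in series]
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return []
    return [(t, v) for t, v in series if abs(v - mu) / sigma > threshold]


# Made-up pressure readings: (seconds, value), with a gap at t=20.
raw = [(0, 10.0), (10, 10.0), (30, 14.0), (40, 50.0), (50, 10.0)]
regular = interpolate(raw, step=10)
print(regular)
print(flag_anomalies(regular))
```

In a real deployment the threshold would more likely come from a trained model (the slides mention Azure Machine Learning) than from a fixed z-score; the fixed rule is used here only to keep the sketch self-contained.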

Editor's notes

  • So you got a handle on what Big Data is and how you can use it to find business value in your data.  Now you need an understanding of the Microsoft products that can be used to create a Big Data solution.  Microsoft has many pieces of the puzzle and in this presentation I will show how they fit together.  How does Microsoft enhance and add value to Big Data?  From collecting data, transforming it, storing it, to visualizing it, I will show you Microsoft’s solutions for every step of the way. 
  • Fluff, but point is I bring real work experience to the session
  • http://www.ispot.tv/ad/7f64/directv-hang-gliding
  • Questions to ask audience: How many have seen my Building an Effective Data Warehouse Architecture presentation and/or Modern Data Warehousing?


    SQL DB vs SQL DW vs APS vs SQL vNext
    commodity servers vs. appliance

  • http://www.gartner.com/technology/reprints.do?id=1-1RNK7M1&ct=140310&st=sb
  • Open architecture, community hardware, simplicity with PolyBase, performance

    We own all the technology from end-to-end, cradle to grave

    Bare Metal to Insight → A story
    A major challenge that we have in moving from product feature-based demos to solution-based sales efforts is our ability to tell a comprehensive story around everything that we do. For example, Microsoft is the ONLY company on the planet that can truly deliver the entire data lifecycle. From Windows Server (or Azure IaaS VMs), to SQL Server (or Azure SQL DB), to HDInsight (or Hortonworks Data Platform, or Cloudera on Azure), to Power BI (or SharePoint, or Excel, or…), to Azure Machine Learning, we have all of the building blocks of a fantastic story. Because we tend to focus on feature/functionality in a lot of our engagements, we sometimes miss the power of this comprehensive story.
  • Key Points:
    To be a data-driven business, you need to establish a culture of data
    To establish a data-driven culture, you need to follow a consistent data lifecycle
    Microsoft provides the complete, end-to-end data platform approach to get a return on your data
    Talk Track:
    We talked about why data is important to business. Now let’s spend some time focusing on how you can get better returns on data.
    The special ingredients to creating a data-driven business are twofold:
    Empower a culture of data: Allow everyone in the organization to become data-driven decision makers
    Establish a data lifecycle method that accelerates the flow of data across the organization
    The data lifecycle can be broken into three important parts:
    First, how do you collect and then manage all of the data coming into the business? Think about the existing velocity and volume of data across a company. Now add new data types such as unstructured and streaming data into the mix. We need to make it easier to store all of this data and prepare it for the next step in the journey.
    Second, when we have data collected and managed, we then need to make it useful to everyone in the organization. Our ability to transform and then analyze all data types more efficiently ensures that data doesn’t lose speed as it moves to the next step.
    Third, now that the data is ready, we can visualize that data and make decisions. Ask a question and, because we have done the good work to collect, manage, transform, and analyze that data, we get a visual, thought-provoking answer to our question.
    When you combine all of this with the Microsoft data platform with an end-to-end approach, that gives you the flexibility you need to derive results. Results that deliver the data dividends you are looking for.
    Microsoft can help you advance along the data spectrum, providing the right levels of innovation and value.
  • Key Points:
    Microsoft’s complete data platform is designed to support the data lifecycle
    We provide real choice and flexibility when it comes to making on-premises or cloud decisions
    We will continue to make investments to improve the platform
    Talk Track:
    Microsoft’s complete data platform was designed to best support the data lifecycle.
    Our unique value proposition, unlike the other data platform vendors, is that we offer flexibility and choice of whether you continue with on-premises solutions, move to the cloud, or both.
    We do not believe in a world that is all for the cloud or all on-premises. We believe that the solutions you are building, the modern applications you have, are actually spanning that continuum in new and unique ways.
    The beauty is that both our on-premises and cloud data platform options are well integrated and consistent.
    We will continue to advance our capabilities to improve the database, data warehouse, and infrastructure layers, especially in-memory performance, scalability, and redundancy.
    Enterprise information management is a key aspect of our data platform that will ensure data is well managed and discoverable by the entire organization. Capabilities like MDS, DQS, and Data Catalog will ensure that the data is well organized and consumable.
    Finally, we are very proud of our leadership position that we have in our Office product line, which provides the familiar, accessible tools for everyone to visualize, share, and make great decisions. Look at the success of our Office for iPad launch and you can see that people are excited about using Office to explore.
  • For your customers:
    This is a significant, fast-growing and strategically important domain
    The customer dilemma

    Competitive dynamics and Microsoft position

    Key workloads, scenarios
    Customers are spending their money on x, y, z
  • Key Points:
    In order to get the most out of data, you have to eliminate the old data barriers from a storage and processing perspective
    Microsoft provides the most flexible way to store and manage any data, of any size
    Talk Track:
    Remember when storage and transaction processing were the big bottlenecks? Back then that represented just structured data. You had to store the same data in different places to ensure availability, which further drove up costs. Then unstructured data came flowing in and created even more expensive constraints. When you add streaming data the challenges are again amplified.
    What if you could have a single place where you can capture and store any kind of data with what we like to call “bottomless storage” capacity that is fast and highly available by design?
    The ability to effectively collect and manage any data of any size is dramatically improved when you eliminate constraints through a flexible approach that provides both on-premises and cloud-based storage and management capabilities.
    And underlying this ability MUST be speed—without it, these volumes of data simply spin…and no one gets the information they need.
    Microsoft delivers this through SQL Server and Microsoft Azure.
    You can pick the right capabilities, whether they be SQL Server on-premises, hybrid, or Microsoft Azure.
    Now, you have the ability to store and manage any data, of any size, so that the organization can capitalize on the raw potential of this data and quickly move along to the next step in the data journey.
  • Azure Stream Analytics is a cost effective event processing engine that helps uncover real-time insights from devices, sensors, infrastructure, applications, and data. It will enable various opportunities including Internet of Things (IoT) scenarios such as real-time fleet management or gaining insights from devices like mobile phones and connected cars. Deployed in the Azure cloud, Stream Analytics has elastic scale where resources are efficiently allocated and paid for as requested. Developers are given a rapid development experience where they describe their desired transformations in SQL and the system abstracts the complexities of the parallelization, distributed computing, and error handling from them.

    Looking forward into H2 FY15, Stream Analytics will become generally available after previewing at TechEd EMEA 2014.
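    To make the kind of transformation described above concrete, here is a minimal Python sketch of a tumbling-window average over device events. It is not the Stream Analytics query language or API; the function name `tumbling_window_avg`, the event layout `(timestamp, device_id, value)`, and the sample fleet data are all assumptions made for illustration.

```python
from collections import defaultdict


def tumbling_window_avg(events, window_seconds):
    """Average the value per (window_start, device_id).

    Windows are fixed-size and non-overlapping: an event with timestamp ts
    belongs to the window starting at (ts // window_seconds) * window_seconds.
    """
    sums = defaultdict(lambda: [0.0, 0])  # key -> [running total, event count]
    for ts, device_id, value in events:
        window_start = (ts // window_seconds) * window_seconds
        acc = sums[(window_start, device_id)]
        acc[0] += value
        acc[1] += 1
    return {key: total / count for key, (total, count) in sums.items()}


# Hypothetical fleet telemetry: (seconds since start, device id, speed)
events = [
    (0, "truck-1", 50.0),
    (4, "truck-1", 70.0),
    (7, "truck-2", 30.0),
    (12, "truck-1", 90.0),
]
averages = tumbling_window_avg(events, window_seconds=10)
```

    A real streaming engine would evaluate this continuously and in parallel as events arrive; the sketch only shows the windowing logic a developer would otherwise have to hand-code.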
  • http://www.jamesserra.com/archive/2015/03/creating-a-large-data-warehouse-in-azure/
  • SMP is one server where each CPU in the server shares the same memory, disk, and network controllers (scale-up). MPP means data is distributed among many independent servers running in parallel and is a shared-nothing architecture, where each server operates self-sufficiently and controls its own memory and disk (scale-out).
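    The shared-nothing idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the "nodes" are plain lists, rows are hash-distributed on a key with `zlib.crc32`, and the function names (`distribute`, `local_sum`, `mpp_sum`) are hypothetical rather than anything from APS/PDW.

```python
import zlib
from collections import Counter


def distribute(rows, node_count):
    """Assign each (key, amount) row to exactly one node by hashing its key."""
    nodes = [[] for _ in range(node_count)]
    for key, amount in rows:
        node_id = zlib.crc32(key.encode("utf-8")) % node_count
        nodes[node_id].append((key, amount))
    return nodes


def local_sum(node_rows):
    """Work done independently on one node, using only that node's rows."""
    totals = Counter()
    for key, amount in node_rows:
        totals[key] += amount
    return totals


def mpp_sum(rows, node_count):
    """Scale-out aggregation: per-node partial sums, then a final merge."""
    partials = [local_sum(node_rows) for node_rows in distribute(rows, node_count)]
    combined = Counter()
    for partial in partials:
        combined.update(partial)  # adds the partial totals key by key
    return dict(combined)


sales = [("east", 10), ("west", 5), ("east", 20)]
totals = mpp_sum(sales, node_count=4)
```

    Because all rows with the same key land on the same node, each node can aggregate without talking to the others; only the small partial results are merged at the end.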
  • In CY15, Microsoft is introducing Azure SQL Data Warehouse, a fully managed data-warehousing-as-a-service solution in the cloud that can scale from gigabytes to petabytes and can query both relational and Hadoop data. It is for organizations that want to do data warehousing and analytics but don’t want to deal with the complex setup of procuring and building state-of-the-art, optimally tuned hardware servers. Azure SQL Data Warehouse is a managed service that makes deploying scale-out data warehouses simple.

    Scale-out on relational or non-relational data
    Leveraging the MPP technologies we have in APS/PDW and the Azure SQL DB database technologies in the cloud, Microsoft is bringing scale-out data warehouse technologies to Azure. Customers can scale out to petabytes of relational data and also federate queries to Hadoop using PolyBase.
    Powered by the Cloud
    With Azure SQL Data Warehouse, you can deploy a data warehouse without the complexities. There is no hardware to procure, maintain, or tune. Instead, this is done for you and you have access to a pre-tuned warehouse that you can spin up or down on demand.
    Market-leading Total Cost of Ownership
    TCO is a function of acquisition costs and ongoing maintenance costs. Azure SQL DW will have the lowest TCO because you can spin up or down at will and only incur costs when spun up. Unlike other cloud DW vendors who require you to have compute up 24x7, this gives you much higher flexibility and lower costs over time. Over time, Azure SQL DW will minimize the cost of maintaining the system with IT Ops and DBAs, as well as the cost of moving your current on-premises DW to the cloud, because of its relatively high T-SQL compatibility.


  • Key Points:
     
    We’ve talked about the various capabilities of SQL Server 2014—as your go-to database for collecting various types of data, how it is set up for your on-ramp to the Microsoft Azure cloud for backup and disaster recovery, and its leading position for security among major database vendors.
    Underlying all of these benefits, though, is its key differentiator: in-memory capabilities built right into the product.
    So how does in-memory actually benefit your business, and how does it make a difference as you collect and manage your data?
    SQL Server 2014’s in-memory enhancements mean breakthrough speed across all workloads—OLTP, data warehousing, and BI—so that your business can move in real time at every stage of data collection and management.
    And the best part about in-memory being built into the product? You don’t have to buy any expensive add-ons. You don’t have to buy expensive new hardware. You don’t have to completely rewrite your apps. You don’t have to grow new skills, as you can use existing SQL skills.
    Additionally, these capabilities are flexible: you can choose which tables reside in-memory and which ones remain on-disk, which helps you to minimize capex as data volumes grow.
    Whether you’re running SQL Server 2014 on-premises, or SQL Server in the cloud on a Microsoft Azure VM, you reap the benefits of in-memory.
    So the data journey doesn’t stop with ‘collect and manage’—instead, you get the benefit of this underlying speed thanks to in-memory, which then powers your ones and zeros ahead to its various business uses:
    Faster transactions for your OLTP workloads
    With these in-memory capabilities built into SQL Server 2014, you get transactional write speed that is up to 30 times faster.
    How is this achieved? This new technology removes locks and latches without compromising durability, and thus contributes to a massive reduction of contention in the database.
    With this reduced contention comes increased throughput and speed.
    Faster queries for data warehousing
    These faster query speeds are all thanks to the new in-memory columnstore technology.
    If you’re trying to explore massive data volumes, in-memory gives you the speed to do so quickly.
    With this technology, data is compressed, creating very large performance gains.
    Faster insights for business intelligence
    In addition to being built into SQL Server 2014, in-memory technology extends through Excel and Power BI, meaning faster insights and ultimately faster business decisions.
    And with over 1 billion users of Microsoft Office across the world, you can push data across its journey and directly into the hands of any user to make rapid decisions at every level of the organization.
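    As a rough illustration of why compressed, column-oriented data can speed up the data warehousing queries mentioned above, here is a small Python sketch of run-length encoding, one common columnar compression technique. It is an assumption-laden toy, not a description of how SQL Server's columnstore is implemented; the function names are hypothetical.

```python
from itertools import groupby


def rle_encode(column):
    """Collapse consecutive repeats in a column into (value, run_length) pairs."""
    return [(value, len(list(run))) for value, run in groupby(column)]


def rle_decode(encoded):
    """Expand (value, run_length) pairs back into the original column."""
    return [value for value, count in encoded for _ in range(count)]


def count_equal(encoded, target):
    """Answer 'how many rows equal target?' without decompressing the column."""
    return sum(count for value, count in encoded if value == target)


state_column = ["WA", "WA", "WA", "OR", "OR", "WA"]
encoded = rle_encode(state_column)
```

    The query in `count_equal` touches one pair per run rather than one value per row, which is the intuition behind scanning compressed columns quickly.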
  • Looking back at CY14: It’s been a big year for HDInsight, our 100% Apache Hadoop distribution in Azure. HDInsight launched as a Generally Available service in October 2013 (Q2, FY14). Since then there have been major updates, including moving HDInsight to Hadoop 2.2 clusters and now to Hadoop 2.4 clusters. Microsoft has made major contributions to the Apache open source community through the Stinger project, which has made Hive queries up to 100x faster. This was made possible by taking some of Microsoft’s know-how in SQL Server and contributing it to the open source community. Microsoft also continues to add Hadoop sub-projects to HDInsight, including HBase, a columnar NoSQL database, as a preview feature; Mahout, a machine learning library; and Storm, in preview, for real-time stream processing. We also saw HDInsight arrive in many more Azure data centers, in regions like Japan, China, and Australia. Finally, you heard about Hortonworks releasing the next version of Hortonworks Data Platform (HDP 2.2), which will have built-in connectors to move data from on-premises to Azure, with HDInsight being refreshed on this newest version.

    Looking forward: For CY15, HDInsight will continue its relatively fast release cadence. In Q1 CY15, you will hear about HDInsight deploying on Linux as a preview, with GA in Q2. For the first time, Microsoft customers can deploy a Hadoop-as-a-Service (PaaS) offering that is built on Linux and supported by Microsoft. Customers can more easily leverage the existing documentation and samples/templates already built for Linux in the Hadoop ecosystem. Also in Q1, Microsoft will GA Storm, which allows customers to do real-time stream processing on their data. HDInsight will introduce new VM sizes (A2-A9, D2-D4, D11-D14), allowing customers to deploy larger workloads with more cores and more memory. Microsoft will also release another update to the Visual Studio tooling, adding IntelliSense to the Hive capabilities and the ability to launch Storm clusters in VS, making it easier for developers to add big data to their custom applications. In Q2 CY15, Microsoft will GA the Linux option, add more VS tooling for deeper IntelliSense integration, and add new regions (Brazil, India). Microsoft is also looking to release initial support for Apache Spark, an in-memory framework for distributed cluster computing that can handle batch, interactive, and streaming workloads with a single query language and fast execution. This initial support for Spark will allow BI tools like Power BI and Tableau to do interactive querying on big data in HDInsight. Finally, Microsoft will also look to launch an HDFS-like store in Azure. Today, Azure Blobs is limited to 500 terabytes in size. With the HDFS-like store, you will have the ability to store arbitrarily large files and data with faster performance.
  • Key Points:
    Organizations need the ability to simultaneously transform and analyze multiple data types and sources
    Microsoft provides the complete data platform that helps you develop a data-driven culture
     
    Talk Track:
    Now that we have our data collected and well managed, it is time to move to the next step in the data journey and that’s to make it useful for the whole organization.
    It is this step of the data lifecycle where the science happens to effectively model, analyze, and prepare data for consumption across the organization.
    Reflect on the recent days of the process necessary to transform and prepare relational data in your monolithic data warehouse. By the time the data was ready for consumption it was already out of date.
    Now think about the need to provide near-real-time or real-time data to make faster, better decisions. Then stack on top of that the desire to analyze data to predict what is going on from the many instrumented things. Now you have potentially thousands of people across the company creating new models. How do you ensure they are both discoverable and well managed?
    Our goal is to enable you to make data accessible and consumable by everyone, anywhere. That is a lofty aspiration that we have been working to realize for a number of years.
    We want to make it easier for you to transform and operationalize data through our analytics advancements in SQL Server.
    We bring in predictive analytics through Microsoft Azure ML, allowing you to proactively monitor systems and processes.
    Finally, the cherry on top is creating a company-wide catalog of models and solutions that allow everyone to discover and use the amazing insights discovered by peers. This provides the trust necessary for developing a data-driven culture.
    All of this helps you reduce the cost of curiosity by developing an environment that fosters self-service and exploration, so that people can independently visualize data, and make well-informed and timely decisions—the next step in the data lifecycle.
  • Key Points:
    Organizations must be prepared to capture and collect a wide array of data types.
    Microsoft’s data platform investments provide comprehensive tools for IT to make sense of disparate data and prepare it for use by the rest of the company.
    Talk Track:
    Mountains of data aren’t worth much without the insights they can produce, but to move data along its journey towards insights, IT needs comprehensive, connected tools.
    In this second step of our data journey—transform + analyze—the transform piece is key. Data must be transformed—connected, combined, refined—in order to become useful for unlocking insights.
    Enter SQL Server 2014 to manage these processes, including core ETL needs. There are a handful of key tools within SQL to help IT with this step in the journey:
    To connect various data streams, SQL Server Integration Services (SSIS) can help drive efficiencies in ETL. With SSIS, IT can extract and transform various data sources, then load them into the desired destination.
    Once data is connected and combined, it is important to do a quality check. All of this data needs cleaning before it can be analyzed.
    With Data Quality Services (DQS), IT can ensure that data is ready for organizational use: that your data is de-duped, standardized, and enhanced in ways that meet business needs.
    To simplify the management of master data structures, Master Data Services (MDS) can compile maintainable master lists.
    With a master data management effort comes reliable, centralized data that is ready for analysis.
    In order to further validate data, anyone can use the Master Data Services add-in for Excel to make updates to the master data and submit the changes back to the IT administrator.
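    The extract, transform, cleanse, and load flow described above can be sketched in plain Python. This is not SSIS or DQS; it is a minimal stand-in under assumed data: two hypothetical sources of customer records, a `standardize` step, and de-duplication on a normalized email address.

```python
def standardize(record):
    """Normalize one customer record so equivalent values compare equal."""
    return {
        "email": record["email"].strip().lower(),
        "name": " ".join(record["name"].split()).title(),
    }


def deduplicate(records):
    """Keep the first record seen for each email address."""
    seen = {}
    for record in records:
        seen.setdefault(record["email"], record)
    return list(seen.values())


def run_etl(sources):
    """Extract rows from every source, transform them, and return clean rows."""
    extracted = [record for source in sources for record in source]
    transformed = [standardize(record) for record in extracted]
    return deduplicate(transformed)


# Hypothetical inputs standing in for two upstream systems
crm_rows = [{"email": " Ann@Example.com ", "name": "ann  lee"}]
web_rows = [
    {"email": "ann@example.com", "name": "Ann Lee"},
    {"email": "bob@example.com", "name": "bob ray"},
]
clean_rows = run_etl([crm_rows, web_rows])
```

    The resulting list plays the role of a small master list: one standardized row per customer, ready to load into the destination.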
  • Key Points:
    To make data most useful, organizations need to ensure data is analyzed, organized, and modeled in a way that supports the widest use.
    Using SQL Server 2014, IT can build comprehensive data marts and easily scale reports across an organization.
    Talk Track:
    Now that data has been cleaned, matched, and certified, SQL Server tools continue to move data along its journey to insights.
    In order to get this data out to the rest of the organization, it must be configured for analysis by way of OLAP.
    With SQL Server Analysis Services (SSAS), IT can design, create, and manage multidimensional structures of aggregated data sources.
    SSAS is the industry’s leading OLAP engine and provides a single semantic model for all your content needs, supporting dashboards, analytical applications, and ad-hoc reporting.
    Analysis Services also bridges the gap between end-user created BI content in Excel and IT-managed corporate solutions. IT professionals can now import PowerPivot workbooks directly into Analysis Services so that they can be professionally managed and transformed into corporate grade solutions by IT.
    And now that all the prep work is done, reports can be created, customized, and scaled out across the organization.
    With SQL Server Reporting Services (SSRS), IT or business analysts can create interactive, tabular, graphical, or free-form reports from your prepared data sources.
    These reports can then be published to various SharePoint sites, delivered via email, and refreshed so users always have the latest and greatest data and insights.
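    For readers new to OLAP, the "multidimensional structures of aggregated data" mentioned above boil down to pre-summarizing a measure along chosen dimensions. The Python sketch below shows that roll-up idea only; it is not the SSAS API, and the fact rows, dimension names, and `roll_up` function are assumptions for illustration.

```python
from collections import defaultdict


def roll_up(facts, dimensions, measure):
    """Sum a measure over every combination of the chosen dimensions."""
    totals = defaultdict(float)
    for fact in facts:
        key = tuple(fact[dimension] for dimension in dimensions)
        totals[key] += fact[measure]
    return dict(totals)


# Hypothetical fact rows
facts = [
    {"region": "East", "year": 2013, "sales": 100.0},
    {"region": "East", "year": 2014, "sales": 200.0},
    {"region": "West", "year": 2014, "sales": 50.0},
]
by_region = roll_up(facts, ["region"], "sales")
by_region_year = roll_up(facts, ["region", "year"], "sales")
grand_total = roll_up(facts, [], "sales")
```

    Dropping a dimension from the list is a roll-up and adding one is a drill-down; a cube engine effectively maintains many of these summaries so reports can read them without rescanning the detail rows.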
  • NOTE: This is an optional slide that introduces Microsoft’s Machine Learning capabilities. Use this slide if you know there will be Data Scientists interested in machine learning in your audience. Otherwise, hide this slide and continue on with the following slide on data curation and cataloging.
     
    Key Points:
    Machine learning is the future of how we will interact with data and the things that produce data.
    Microsoft’s approach to machine learning makes it easy for businesses to predict future trends and behaviors.
    With Microsoft Azure Machine Learning, predictive analytics are faster and more cost-effective than traditional solutions.
    Talk Track:
    Today, machine learning is one of the most buzzed-about trends in data. This trend isn’t going anywhere, but what is it?
    With machine learning, humans use data models to train computers to learn from patterns in data over time.
    So how can you expand your data practice to incorporate the predictive capabilities that machine learning can provide?
    Whether you are looking to advance and evolve your organization’s analytics capabilities, or if you’ve got access to a data scientist, Microsoft’s machine learning service can help to easily predict future trends or behaviors in a way that is both faster and lower in cost than traditional solutions.
    The great news is that if you already have a Microsoft Azure subscription or data in the cloud—especially in HDInsight—you are more than halfway to realizing the benefit of this solution.

    Because Microsoft has built our machine learning solution as part of Azure, right off the bat it has all the benefits of the cloud: scalable, no new hardware investments, pay as you go, and as you see value.
    First and foremost, it’s easy to manage. This is what customers expect from Microsoft and here we deliver ease-of-use in a space where it’s lacking today. It’s all based in Azure, so there is one portal to manage, monitor, and update the solution. No moving back and forth between technologies or struggling with integrations.
    Like other Microsoft solutions, it comes with enterprise-grade security features built in. No need to worry about who might have access to sensitive data, or people logging in that aren’t supposed to. Customers have the power to manage access.
    If customers have already made investments in machine learning, they can easily import their existing work so they’re not losing anything. This adds a new way to get value from data they already have in Azure.
    And the problem I mentioned earlier where so many machine learning solutions never get deployed and provide business value—that’s a thing of the past. With Azure Machine Learning, customers can operationalize solutions in minutes.

    Let’s take a quick look at the process to get started with machine learning:
    Your data scientists can:
    Execute every step in the data science workflow in one place: ML Studio
    Access and prepare data
    Create, test, and train models, as well as import the company’s proprietary models securely into the data scientist’s private workspace
    Work with R and over 300 of the most popular R packages along with Microsoft’s business class algorithms
    Collaborate with colleagues within the office or across the globe as easy as clicking “share my workspace”
    Deploy models within minutes rather than weeks or months
    It really comes down to Predictive Analytics, using your past data to provide data intelligence about the future. Consider a few real world scenarios:
    Churn analysis to predict which customers may leave and help craft strategies to keep them satisfied
    Recommendation engines that can leverage huge volumes of customer data to offer customers suggestions of what they might want next.
    Fraud detection to flag orders or behaviors which are indicative of a scam and can help you stay one step ahead of criminals.
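    To give a feel for "using your past data to provide data intelligence about the future," here is a deliberately simple Python sketch of the fraud-detection scenario: learn a baseline from historical order amounts, then flag new orders that fall far outside it. This is a basic z-score rule chosen for illustration, not an Azure Machine Learning model or API; the function names, threshold, and sample amounts are assumptions.

```python
import statistics


def fit_baseline(history):
    """Learn the typical order amount (mean) and its spread (std) from past data."""
    return statistics.mean(history), statistics.pstdev(history)


def flag_suspicious(new_amounts, baseline, threshold=3.0):
    """Return new amounts more than `threshold` standard deviations from the mean."""
    mean, std = baseline
    if std == 0:
        # No spread in the history: anything different from the mean is unusual.
        return [amount for amount in new_amounts if amount != mean]
    return [amount for amount in new_amounts if abs(amount - mean) / std > threshold]


# Hypothetical historical and incoming order amounts
past_orders = [20, 22, 19, 21, 20, 23, 18, 21]
baseline = fit_baseline(past_orders)
suspicious = flag_suspicious([21, 500, 19], baseline)
```

    A production model would use many more variables than a single amount, which is exactly where a machine learning service earns its keep, but the train-then-score shape is the same.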
  • Microsoft’s Big Data vision in the cloud is to enable organizations to solve large, complex problems end-to-end, from storing and managing TBs of data without investing in hardware and software, to seamless integration with the 1 billion users of Excel. As part of this vision, Microsoft offers Azure Machine Learning, designed to democratize the complex task of advanced analytics.

    Advanced analytics is using products like Azure Machine Learning to find new and actionable insights that traditional approaches to business intelligence are unlikely to discover. An easy way to think about this is thinking about a dashboard. Today when confined by only BI tools without a connection to machine learning, it is solely the job of the human looking at the spreadsheet to gain insights and react to the data. But a human can only consume so many variables. A computer, on the other hand, can consume a great deal more variables to provide much deeper insight on the data. Humans can then react to the data to make decisions that drive competitive advantage, as well as program the computer further to recognize important patterns in the future. This is why we say beyond business intelligence – machine intelligence.

    The accessibility of our solution starts with setup. Previously you needed to provision your workspace on-premises for machine learning, also thinking about server space and a host of other considerations. Today you can get started with just a browser. With only an Azure subscription, you can take advantage of the full functionality of Azure Machine Learning within minutes. Taking a test drive is even easier: click Get Started at azure.com/ml and, with just a Microsoft ID, you’re off to the races.

    Another limit with other machine learning solutions is siloed environments that only allow for one programming language or make changing from one algorithm to another time consuming and complex. With Azure ML, you can experience the power of choice. That choice extends to language, with both Python and R being first class citizens of Azure ML, and to algorithm. You can choose from hundreds of algorithms, including business-tested ones running our Microsoft businesses today. And swapping out algorithms to land on the right one for you is done with a click. Additionally you can drop in custom R and Python code – your “special sauce” – and mix and match that with the other options in the tool.
    Most revolutionary of all, you can deploy solutions in minutes as a web service, which is simply a URL that can connect to any data, anywhere – including on-premises or in another cloud environment. The ability to put a model into production almost immediately, as well as revise it easily, is unique to Microsoft and allows companies to stay on top of the changing business landscape more effectively than is offered by any other provider today.

    We even take that a step further, allowing model developers to connect to the world with our Machine Learning Marketplace, where they can publish finished solutions and APIs with their own brand and business model. Developers can also discover machine learning solutions there without any machine learning skills needed – the data science is inside. Check it out at https://datamarket.azure.com/.

  • Key Points:
    To unlock more value from data, IT must make curated and catalogued data available across the organization.
    Microsoft’s complete data platform offers an end-to-end solution to curate and catalog any data.
     
    Talk Track:
    To compete in today’s business climate, you have to make decisions based on facts, driven by insights, to figure out what the data can and is trying to tell you.
    It is not enough just to have a handful of people with the tools and the access, they really need it to scale to everybody.
    In the BI space, your goal is to inform all of your people, to empower them to make the best decisions and reach customers, with the right information to improve customer experiences.
    We talked a bit about publishing reports using SSRS, but with the latest advancements in the data platform, we’ve taken data curation and cataloging to the next level.
    In order to transition from the back end of data transformation and analysis to the front end where any user can access data to unlock insights, you need a dedicated space on which you can easily publish curated and catalogued data.
    So how do you manage data sources?
    You likely want to understand usage patterns and create reporting efficiencies so that you’re not dealing with recreating reports every time they’re requested, or so that you can eliminate redundant reports.
    And in order to improve the data experience for users, IT can track how data and BI content is being created and shared across the organization. Power Pivot Management Dashboards are available in SharePoint to help IT monitor data and workbook usage and gather performance metrics from servers to better understand who’s creating BI content, how it’s being shared, how taxing the queries are on the system, and so forth.
    Data usage telemetry is also available as part of the Data Catalog in the Power BI Admin Center, providing visibility into how data is used throughout the organization to understand data usage patterns so that you can decide which data sources are the most popular and which aren’t, and allocate resources accordingly.
    And what about giving users access to the data they need, while still maintaining the appropriate levels of data security?
    Microsoft uniquely provides IT administrators with the insight and oversight they need, with tools for monitoring and managing user created BI content, as well as for transforming that content into corporate-grade solutions that are professionally managed by the IT department. This simplifies compliance without hampering user agility and creativity.
    Discovery and Risk Assessment Server scans an organization’s network shares and SharePoint document libraries on demand or on a routine basis, and incorporates the discovery results into a master inventory where they are evaluated for complexity, financial impact, risk, and errors for Excel spreadsheets.
    With Spreadsheet Inquire you can diagnose your mission critical workbooks for errors, inconsistencies, hidden information, data dependencies, and other important aspects. Inquire includes valuable tools like workbook analysis, relationship diagrams, and Spreadsheet Compare.
    Spreadsheet Compare looks at multiple versions of a workbook side-by-side to quickly gain visibility of important changes down to the cell level, including changes to workbook structure, formulas, cell values, VBA code, cell format, and more.
    All these spreadsheet control features allow for comprehensive diagnosis of critical spreadsheets and documentation of findings. End users, IT, auditors will all be able to know when their most important data, such as financial statements, has been changed to the cell level and even understand dependencies with other data sources. 
    And the reports that one user requests are likely to be of interest to others. So to help IT share common reports, Microsoft’s investments in Power BI help to reduce reporting redundancies and also publish well-curated data.
    Through the Power BI data catalog, which you could describe as a search engine for data, IT has a new way to provision users with access to data. IT can register data from across the organization—making the data discoverable from within Power Query so employees can easily find and connect to the data they need without having to call IT with one off data requests.
    And people can share not only workbooks but also the queries they create using Power Query in Excel. This allows members of the team to build and manage data queries for others to use when creating their own reports.
    Once published to Power BI, users can define who they want to share their data queries with and track who’s accessing which queries. Data queries are registered with the Data Catalog so that they can be easily discovered through data search in Excel. If a user tries to access a query but they do not have access to the underlying data source, a workflow will allow them to request access from the DBA responsible for the data.
    To make it even easier for anyone to access the data they need, IT can quickly create collaborative BI sites for a team to share reports, even with the ability to view and interact with larger workbooks through the browser. Power BI sites provide a highly visual experience for sharing reports, including live report tiles that make it easy to locate the right report quickly.
  • Azure Data Factory is a cloud service for creating, managing, and monitoring the production of trusted information from on-premises and cloud data sources using transformative analytics at scale. Data Factory can be used in solutions to gain insights from operational and service health telemetry data, analyze customer actions to determine an optimal targeted marketing strategy, or predict customer churn from customer profile and service log data. Instead of writing hard-to-manage custom code to wire together a data warehouse with Hadoop, NoSQL, and SaaS, use Data Factory to quickly create and deploy highly available data processing pipelines, significantly cutting your time to solution and your operational costs. Get a single monitoring view of all of your data processing pipelines along with data lineage and service health. Bring together on-premises data like SQL Server and cloud data like Azure SQL Database, Blobs, and Tables with the transformative analytics of HDInsight (Hive, Pig, MapReduce, custom .NET code), and even Azure Machine Learning, to produce trusted information that is easily consumed by BI tools or applications.

    Looking forward into H2 FY15, Data Factory will become generally available after previewing at TechEd EMEA 2014.
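    The pipeline-plus-monitoring idea described above can be sketched as an ordered list of named steps with a record of what each step produced. The Python below is a toy under stated assumptions, not the Data Factory API: `run_pipeline`, the two step functions, and the service-log rows are all hypothetical.

```python
def drop_incomplete(rows):
    """Remove log rows that cannot be tied to a customer."""
    return [row for row in rows if row.get("customer_id") is not None]


def tag_high_usage(rows):
    """Add a derived flag that a downstream churn model or report could consume."""
    return [dict(row, high_usage=row["calls"] > 10) for row in rows]


def run_pipeline(rows, steps):
    """Run steps in order and record (step name, row count) as simple lineage."""
    lineage = [("source", len(rows))]
    for name, step in steps:
        rows = step(rows)
        lineage.append((name, len(rows)))
    return rows, lineage


# Hypothetical service-log rows
service_logs = [
    {"customer_id": 1, "calls": 12},
    {"customer_id": None, "calls": 3},
    {"customer_id": 2, "calls": 4},
]
steps = [("drop_incomplete", drop_incomplete), ("tag_high_usage", tag_high_usage)]
output, lineage = run_pipeline(service_logs, steps)
```

    The `lineage` list is a stand-in for the single monitoring view: it shows where rows were dropped without anyone reading custom wiring code.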
  • Key Points:
    There are a lot of data visualization tools out there, which can create training and management concerns.
    Only Microsoft provides the most familiar, and powerful visualization and decision support tools.
    Talk Track:
    This is the big payoff moment: when you have well managed and curated data people are capable of mixing, matching, visualizing, refining, and ultimately, making the best decisions possible. All of our hard work surfaces here.
    My head spins at the sheer volume of data visualization solutions out in the market. Everyone is claiming that they have the latest, best, most advanced data visualization and decision support tools. The problem is that each one represents another thing for people to learn, and for you to manage. If the data is not prepared and ready to consume, those shiny tools will not work as advertised.
    Microsoft’s secret weapon, well not that secret, is Excel. Over 1 billion people use Excel to make decisions. Over the past years, we have worked hard to build advanced visualization and in-memory capabilities for Excel so that it is the one familiar place everyone can go to visualize and decide.
    With Power BI, we’ve overlaid powerful visualization and business intelligence capabilities.
    With Q&A, we have the ability to ask simple questions against the prepared data models to return rich charts and graphs. You are not required to retrain people. They are immediately up to speed in Excel and excited about the new ways in which they can interact with data. Since Excel works with Office 365 and SharePoint, you can easily share and appropriately scale ideas—anywhere, on any device.
    Imagine all of the ad hoc moments where someone can quickly search, find what they are looking for, and present those insights to a customer or peer.
    Most users are not doing BI all day long and are using Microsoft technology to support 90 percent of their daily tasks. With Microsoft, BI is connected to everything you do.
    That is the power of Microsoft’s complete data platform. Everything works together to support the self-service capabilities that drive the required data culture necessary to extract as much value from your data investments as possible.  Suggestion: clean and analysis ready

  • So what exactly is Power BI?
    Power BI for Office 365 is a self-service business intelligence (BI) solution delivered through Excel and Office 365. It provides information workers with data analysis and visualization capabilities to identify deeper business insights, either on premises or within a trusted cloud environment. With Power BI for Office 365, customers can connect to data in the cloud or extend their existing on-premises data sources and systems to quickly build and deploy self-service BI solutions hosted in Microsoft’s enterprise cloud.

    Power BI for Office 365 enables customers to do more with their data:
    - Analyze and present insights from data in compelling visual formats either on premises or in the cloud from Excel.
    - Share reports and data sets online with data that is always kept up to date.
    - Ask questions of your data using natural language search and get immediate answers through interactive tables, charts and graphs.
    - Access and stay connected to data and reports from your mobile devices wherever you are.

    Power BI addresses several fundamental business needs:
    Enable self-service BI solutions for everyday business users with the ease of use and familiarity of a tool they already use – Excel. Excel is the most widely-used analytical tool by information workers.

    What you’ve seen in the past are large IT-managed data warehouses such as MSS, with publish layers and, oftentimes, IT-owned standard reports built off of them. This is a great model for much of the business, but it can really diminish agility and time to insights. If an end user has an additional question, wants to see more fields on their report, or wants to connect more data, that request goes into the requirements hopper, gets iterated on, and takes longer than desired to deliver.
    What self-service BI enables is for that IT warehouse, or curated views off the source, to be shared out and discovered much more easily, and then taken further with the ability to connect to other enterprise data sets and to visualize and share in a much easier fashion.
    So really that speaks to Power BI being an AND scenario for your BI systems. It is not meant to replace large existing warehouse investments, but extend their functionality.

    Enable organizations to extend their existing investments for on premise data warehouses and operational systems as well as cloud-based data sources and Hadoop clusters to create secure and easy-to-use self-service BI solutions that can also monitor employee access and usage.


    Enable collaboration, connectivity, scale and governance by offering BI in the cloud through Office 365. Users can easily share their data and analysis with colleagues and access reports wherever they go – on their desktop at work, over the Web at home, and on their mobile device. Manage important data for the team and track usage to see who’s accessing the data and what data sets are most often used.
     




  • Excel 2013 is Microsoft’s premier business analyst tool – it includes rich business intelligence features (Power Query, Power Pivot, Power View, Power Map) fully integrated with the powerful ad hoc analysis capabilities and familiar features of Excel – like Pivot Tables and Excel Charting. With Excel 2013, analysts can publish Excel Workbooks to Power BI, sharing data, analysis and reports with users of Power BI.

    For customers who don’t have access to Excel 2013, the new Power BI Designer can be used to import and model data, then author and publish Power BI Reports to the Power BI service. While lacking the rich analytical features of Excel, it does provide a simple solution expressly designed for Power BI content creation.
  • Key Points:
    How does Datazen fit within the Microsoft Data Platform? Let’s summarize.
    Datazen provides insights from any enterprise data – virtually anytime, anywhere and through any device.

    Talk Track:
    Datazen is optimized for SQL Server, enabling rich, interactive data visualization on all major mobile platforms.
    Datazen is complementary to our BI portfolio.
    It is for organizations that need a mobile BI solution implemented on premises and optimized for SQL Server.
    SQL Server Enterprise Edition customers with version 2008 or later and Software Assurance are entitled to download the Datazen Server at no additional cost.

    Let’s take a look at the Datazen approach and the different Datazen components.
  • http://www.jamesserra.com/archive/2013/12/power-bi-for-office-365-requirements/
  • Key Points:
    Organizations need to make it easy for everyone in the business to mix and match data to get the best return on data
    Microsoft makes it easy for anyone to get the data they need for analysis, right from familiar tools like Excel.
     
    Talk Track:
    There has never been such an abundance of available and useful information as there is today both across the web and across your organization.
    The only problem is people are challenged with effectively discovering and connecting to this information so that they can gain the insights they need.
    As part of Power BI, Power Query enables anyone to find and connect to data, right from within Excel—corporate data that IT has published and “certified,” social sentiment data, open data, the list goes on.
    Power Query supports a wide variety of data sources including relational, structured and semi-structured data, OData, data from the Web, Hadoop, and Facebook.
    The most amazing thing about this approach is that all of this data can be accessed through a simple click on the Excel ribbon.
    With Power Query, you can conveniently shape and clean your data within Excel, allowing you to quickly get into analyzing and visualizing your data.
    Once the right data has been found and imported to Excel, there are simple, powerful tools at the ready to help combine datasets, clean them up, and get them ready for analysis.
    We talked about the various cleaning tools available to IT, but we’ve taken these tools even further within Excel to help anyone prep data to the exact way in which they need it to work, providing them with ease of combining and transforming this data so that it can be analyzed and visualized for deeper insight.
  • OPTIONAL SLIDE: Screen shots of Excel 2013 featuring Power View and PowerPivot for visual representation of solutions if demo isn’t possible, or if needed to address specific points.
  • OPTIONAL SLIDE: Screen shots of Excel 2013 featuring Project Codename “GeoFlow” for visual representation of solutions if demo isn’t possible, or if needed to address specific points.
     
  • Areas of investment for Power BI

    Power BI dashboards and KPIs – Power BI will provide new capabilities to allow for the creation of live dashboards and easy to create KPIs that can be pinned to the dashboard - from which you can drill into underlying Power View reports for additional detail and data exploration.

    New data visualizations – we will introduce a host of new data visualizations for Power BI, e.g., Tree Map.

    Touch optimized data exploration in Power View HTML5 – Power View will gain new touch optimized data exploration features allowing users to explore data more easily on touch devices.

    Power BI mobile applications – beyond the mobile experiences provided through HTML5, we also continue to invest in providing native mobile apps and will soon support both iPad and iPhone.

    Support for new data sources such as SalesForce.com and Dynamics CRM – new data sources will be added, such as SalesForce.com, along with tighter integration with CRM Online.

    Support for SQL Server Analysis Services –
    Power Query will support SQL Server Analysis Services as a data source.
    Power BI will also support interactive query directly to SQL Server Analysis Services on-premises. This will cover both tabular models and multidimensional cubes. This will allow organizations to manage and maintain their existing Analysis Services models without having to move their data to the cloud. Users will be able to view and explore their data from Power View in Power BI, and the system will directly connect and query against SQL Server Analysis Services on-premises.

  • Key Points:
    Data visualization makes it easier to see patterns in data, uncover truths hidden within data, and ultimately tell better stories.
    Microsoft makes it easy for anyone to visualize their data right from Excel.
     
    Talk Track:
    Within Power BI is a suite of powerful capabilities. To help anyone visualize their data and make decisions based upon it, we rely on Power View and Power Map.
    Using Power View, anyone can use simple drag-and-drop tools to build quick, interactive, visual reports. And within those reports, they can drill down, hover over, and more deeply explore information using a variety of charts, graphs, and maps.
    But what about using historical data to look forward and anticipate where business will take you? Within Power View, you can also explore the forecasted results, adjust for seasonality and outliers, view result ranges at different confidence levels, and hindcast to view how the model would have predicted recent results.
    Using Power Map, we take visualization to the next level. Combining Excel and Bing Maps gives the powerful capability of overlaying data across a 3D map to really visualize the story and actually see the patterns in the data. And with Power Map, you can build mapped tours, export them to video, and use these videos to truly tell the stories hidden within your data.
    These powerful visualization capabilities help provide instant answers to data questions, and help to present data in bold new ways. With the importance of data visualization on the rise, Microsoft offers the unique capability of visualizing data right from within Excel, the tool that over a billion people already know and use.
    So where are we headed with visualization? Deeper interactivity that blends analysis and visualizations even more fluidly, newer types of visualizations that enable you to see deeper insights more easily, richer experiences on the devices customers use most, and great storytelling experiences are just a few of the areas in which we’re investing to make sure Excel remains the data productivity app of choice as analysis and visualization needs evolve.
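    The hindcasting idea mentioned in the talk track can be illustrated with a minimal sketch. The notes do not say which forecasting model Power View uses, so simple exponential smoothing stands in here as an assumption; the function names `ses_forecast` and `hindcast` and the `alpha` parameter are hypothetical and are not part of any Power BI API.

```python
def ses_forecast(history, alpha=0.5):
    """One-step-ahead forecast using simple exponential smoothing.

    This is a stand-in model chosen for illustration only.
    """
    level = history[0]
    for value in history[1:]:
        # Blend the newest observation with the running level.
        level = alpha * value + (1 - alpha) * level
    return level


def hindcast(series, holdout, alpha=0.5):
    """Predict each of the last `holdout` points from the data before it.

    Returns a list of (actual, predicted) pairs so the two can be compared.
    """
    if not 0 < holdout < len(series):
        raise ValueError("holdout must be between 1 and len(series) - 1")
    pairs = []
    for i in range(len(series) - holdout, len(series)):
        predicted = ses_forecast(series[:i], alpha)
        pairs.append((series[i], predicted))
    return pairs
```

    Comparing each actual value with its prediction shows how well the model would have done on recent history, which is the point of hindcasting.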
  • Key Points:
    Organizations need a way to easily share, communicate, and collaborate on insights derived from data.
    Microsoft’s investments in data tools uniquely facilitate decision making to help anyone move business forward.
    Talk Track:
    So we’ve analyzed data, we’ve visualized it, but what next? The insights we can uncover from within data aren’t much help without moving your business forward. How can these insights actually help you to optimize and ultimately make better, faster decisions?
    Imagine being able to ask a question of your data and getting an insightful answer back from your data. Power BI takes access to insights a step further with Q&A.
    Every insight starts with a question, and with Q&A, we have the ability to ask simple questions against prepared data models to return rich charts and graphs. This requires zero training—anyone can ask questions of data, whether they’re an Exec, an IT Pro, or an everyday user.
    Ultimately, Q&A helps anyone to interact with their data using natural language. It’s as easy as going to a colleague to ask a question. This dramatically increases how many business users can get more value from their existing BI solutions.
    But insights shouldn’t be limited to one person—most decisions are made through collaborative efforts.
    With Power BI, anyone can access insights anywhere, from any HTML5-capable device’s browser or through the Power BI mobile app, meaning that decisions can be made on the go, across departments, and from bottom to top, across the company.
  • http://www.jamesserra.com/archive/2015/01/using-power-bi-to-access-on-premise-data/

    http://sqlmag.com/sql-server/how-configure-power-bi-preview-premise-analysis-services-data-sources

    Note that you are using data in the Power Pivot model in the workbook you uploaded to Power BI, not the data directly on SQL Server. The Power View report is hitting Power Pivot, so if data changes in SQL Server, that won’t be reflected in Power View unless you set up the DMG to refresh the data from SQL Server, as I explain below.  To refresh the data in the workbook in Power BI from SQL Server, use the “Schedule Data Refresh” option on the workbook in Power BI and create a daily refresh (there is no option to update more than once a day).  To refresh the data more frequently, you must do it manually by going to the settings tab on the “Schedule Data Refresh” page and clicking the “refresh report now” button (the status of the connection must be OK to see this button).  See Schedule data refresh for workbooks in Power BI for Office 365.

    Upload limit:
    https://support.office.com/en-us/article/Reduce-the-size-of-an-Excel-report-on-a-Power-BI-for-Office-365-site-c1a9215c-1c10-4de4-9b7a-50ea6782f8de
    The report as a whole    can be up to 250 MB.
    The core worksheet contents    can be up to 10 MB.
    Excel/Power BI model size (capped at 250 MB when published in the cloud and 2GB when published in Sharepoint on Prem):
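    As a rough illustration, the two upload limits quoted above (250 MB for the report as a whole, 10 MB for the core worksheet contents) can be checked before publishing. This is a sketch only: the function name `check_upload` is hypothetical, and treating 1 MB as 1024 × 1024 bytes is an assumption, since the notes do not say how the service measures size.

```python
MB = 1024 * 1024  # assumption: binary megabytes

# Limits quoted in the notes for Power BI for Office 365.
MAX_REPORT_BYTES = 250 * MB     # the report as a whole
MAX_WORKSHEET_BYTES = 10 * MB   # the core worksheet contents


def check_upload(total_bytes, worksheet_bytes):
    """Return a list of limit violations; an empty list means the workbook fits."""
    problems = []
    if total_bytes > MAX_REPORT_BYTES:
        problems.append("report exceeds 250 MB")
    if worksheet_bytes > MAX_WORKSHEET_BYTES:
        problems.append("core worksheet contents exceed 10 MB")
    return problems
```

    For example, `check_upload(300 * MB, 5 * MB)` reports only the whole-report violation.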

    Info on Power View reports using ASC: http://www.jenunderwood.com/2015/01/20/tip-how-to-power-bi-ssas/
  • PolyBase of APS v2 AU1 can already support HDP 2.x with the hotfix KB2973037! (HDP 2.x includes HDP 2.0 and HDP 2.1)

    Azure HDInsight supports both local HDFS storage and Azure Blob Storage for storing data. The default is Azure Blob Storage. There is a thin layer over Azure Blob Storage that exposes it as an HDFS file system called Windows Azure Storage-Blob or WASB.
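    To make the WASB layer concrete, here is a small sketch that builds a WASB path of the form `wasb[s]://<container>@<account>.blob.core.windows.net/<path>`. The helper name `wasb_uri` and the container, account, and file names in the usage note are hypothetical placeholders.

```python
def wasb_uri(container, account, path="", secure=True):
    """Build a WASB (or WASBS, when secure=True) URI for a blob path."""
    scheme = "wasbs" if secure else "wasb"
    # Strip a leading slash so the path is not doubled after the host name.
    return f"{scheme}://{container}@{account}.blob.core.windows.net/{path.lstrip('/')}"
```

    For instance, `wasb_uri("logs", "myaccount", "/2015/01/data.csv")` yields `wasbs://logs@myaccount.blob.core.windows.net/2015/01/data.csv`.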

    With this hotfix, the following sp_configure values are available for the option "hadoop connectivity":
    0 - no HDP support
    1 - Hortonworks for Windows Server (HDP 1.3); HDInsight on Analytics Platform System; HDInsight’s Windows Azure blob storage (WASB[S])
    2 - Hortonworks for Linux (HDP 1.3)
    3 - Cloudera CDH 4.3 for Linux (also works with 4.5 and 4.6)
    4 - Hortonworks Data Platform for Windows Server (HDP 2.x)
    5 - Hortonworks Data Platform (HDP 2.x) for Linux
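    As a sketch of how these values might be applied, the helper below maps each option number to its description from the list above and generates the T-SQL that sets it. The function name `hadoop_connectivity_sql` is hypothetical; the generated statement uses `sp_configure` followed by `RECONFIGURE`, and any further steps the appliance may require after the change are not covered here.

```python
# Option values for 'hadoop connectivity' as listed in the notes.
HADOOP_CONNECTIVITY = {
    0: "no HDP support",
    1: "HDP 1.3 for Windows Server, HDInsight on APS, or Azure blob storage (WASB[S])",
    2: "HDP 1.3 for Linux",
    3: "Cloudera CDH 4.3 for Linux",
    4: "HDP 2.x for Windows Server",
    5: "HDP 2.x for Linux",
}


def hadoop_connectivity_sql(option):
    """Return the T-SQL that sets 'hadoop connectivity' to the given option."""
    if option not in HADOOP_CONNECTIVITY:
        raise ValueError(f"unknown hadoop connectivity option: {option}")
    return f"EXEC sp_configure 'hadoop connectivity', {option};\nRECONFIGURE;"
```

    Calling `hadoop_connectivity_sql(4)` produces the statement for an HDP 2.x cluster on Windows Server.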

    Key goal of slide: PolyBase is available only within the Microsoft Analytics Platform System.

    Slide talk track:
    PolyBase simplifies this by allowing Hadoop data to be queried with standard Transact-SQL (T-SQL) query language without the need to learn MapReduce and without the need to move the data into the data warehouse. PolyBase unifies relational and non-relational data at the query level.

    Integrated query: PolyBase accepts a standard T-SQL query that joins tables containing a relational source with tables in a Hadoop cluster referencing a non-relational source. It then seamlessly returns the results to the user.

    PolyBase can query Hadoop data in other Hadoop distributions such as Hortonworks or Cloudera.

    No difficult learning curve: Standard T-SQL can be used to query Hadoop data. Users are not required to learn MapReduce to execute the query.

    Cloud-Hybrid Scenario Options
    PolyBase can also query across Windows Azure HDInsight, providing a Hybrid Cloud solution to the data warehouse


    The ability to query all of your company’s data in a performant way, independent of where it resides and what format it is stored in, is crucial in today’s data-centric world with massive, increasing data volume. Today, with AU1, one can query various Hadoop distributions plus data stored in Azure. For example, with one single T-SQL statement a user can query data stored in multiple HDP 2.0 clusters, combine it with data in PDW, and combine it with data stored in Azure.  No one in the industry (as far as I’m aware) can do this in this simple fashion. Bringing all Microsoft assets together, on-prem and specifically through our Azure play, including various services that will be brought online in the future, we can clearly distinguish ourselves through our unique and complete end-to-end data management story.   No doubt there are several pieces missing in our ‘Poly’ vision, including supporting other data stores, enabling push-down computation for our cloud story, more user-definable options language-wise, better automation/policies, and many more ideas we’d like to go after in the weeks and months ahead.
  • HDInsights benefits: Cheap, quickly procure

    Key goal of slide: Highlight the four main use cases for PolyBase.

    Slide talk track:
    There are four key scenarios for using PolyBase with the data lake of data normally locked up in Hadoop.
    PolyBase leverages the APS MPP architecture along with optimizations like push-down computing to query data using Transact-SQL faster than using other Hadoop technologies like Hive. More importantly, you can use the Transact-SQL join syntax between Hadoop data and PDW data without having to import the data into PDW first.
    PolyBase is a great tool for archiving older or unused data in APS to less expensive storage on a Hadoop cluster. When you do need to access the data for historical purposes, you can easily join it back up with your PDW data using Transact-SQL.
    There are times when you need to share your PDW with Hadoop users and PolyBase makes it easy to copy data to a Hadoop cluster.
    Using a simple SELECT INTO statement, PolyBase makes it easy to import valuable Hadoop data into PDW without having to use external ETL processes.
  • In a moment Sanjay will show you how Pier 1 is truly putting their data to work. They’re experimenting with monitoring in-store activity with the power of Kinect sensors and combining that with data from customer activity on the web. They’re ingesting that through Event hubs and then putting the data that needs further processing into Azure Data Factory to use HDInsight for batch processing. Stream Analytics takes on data as well. Data Factory then moves that data into Blob storage, where it’s further processed and combined with the Analytics data already sent to Azure SQL Database. Then that Azure DB data is sent to Azure ML, where it can then be modeled, made sense out of, to then deliver predictive results to any number of devices and visualization tools. Does this sound like a lot and a lot of things to buy? Perhaps it does. But what you have bought here – all you need to do all this is Azure. That’s the beauty of the cloud.
  • Assuming that all the data sources have been moved to a data warehouse or Hadoop, it is now time to build a cube for reporting. There are three choices: Power Pivot, Tabular, and Multi Dimensional.

    There are many decision factors to decide which of the three cube technologies to use. Power Pivot is mainly used for quick prototyping of a cube and does not have enterprise features such as role level security, partitioning, etc. On the other hand, Tabular and Multi Dimensional are both enterprise level cubes. The following slides highlight the main decision points. Please note that as new features get built into Power Pivot/Tabular, some of the following slides will need to be updated.

    There are four primary tools used to build reports in the Microsoft BI platform: Excel, SQL Server Data Tools (SSDT), SharePoint, and Power BI Preview Designer. Unfortunately, not all these tools can consume all the cubes. Therefore, the choice of tool depends on the cube and vice versa.

    Reports can then be published to either Office 365 Power BI or SharePoint OnPremises/Azure IaaS. Currently, only Excel files with Pivot Tables/Charts and Power View sheets can be published to Office 365 Power BI. On the other hand, all the report types can be published to SharePoint. However, SharePoint lacks Q&A, mobile apps, etc.
  • In conclusion, Microsoft is focused on delivering Business Intelligence and Analytics as part of a complete set of integrated capabilities in our Data Platform as can be seen here. Our Business Intelligence solution is focused on bringing together all aspects of Business Intelligence across:

    Corporate Business Intelligence – providing IT with tools to connect, clean, provision, and manage data across the enterprise.
    Self-Service Business Intelligence – empowering users with self-service capabilities to discover, analyze, visualize, and share data through the familiar Office tools they use every day.
    Advanced Analytics – for both business analysts and data scientists to mine for deeper analytics insights, create predictive models, and apply advanced analytical techniques.

    By bringing these capabilities together we aim to meet the needs of everyone across the organization by uniquely integrating these capabilities on one platform.
  • Key goal of slide: To convey that the modern data warehouse is something that the traditional data warehouse must evolve to. To have IT agree that their warehouses need to take advantage of these new technologies (specifically focusing on the middle and bottom layer).

    Slide talk track:
    To encompass these four trends, we need to evolve our traditional data warehouse to ensure that it does not break. It needs to become the “modern data warehouse.” What is the “modern data warehouse?” This is the new warehouse that is able to excel with these new trends and can be your warehouse now and into the future.
    The modern data warehouse has the ability to:
    Handle all types of data. Whether it be your structured, relational data sources or your non-relational data sources, the Modern data warehouse will incorporate Hadoop. It can handle real-time data by using complex event processor technologies.
    Provide a way to enrich your data with Extract, Transform Load (ETL) capabilities as well as Master Data Management (MDM) and data quality
    Provide a way for any BI tool or query mechanism to interface with all these different types of data with a single query model that leverages a single query language that users already know (example: SQL).

    Questions drive BI, Analytics drive questions

  • Microsoft Azure DocumentDB is the highly-scalable NoSQL document database-as-a-service that
    enables query over schema-free data and multi-document transaction processing
    helps deliver configurable and reliable performance
    and enables rapid development

    DocumentDB is the right solution for applications that run in the cloud when predictable throughput, low latency, and flexible query are key.
    Fully managed PaaS database service backed by the power of Microsoft Azure. Unlike many other NoSQL offers, DocumentDB was built for the cloud to perform and scale in a multi-tenant environment. Cluster administration, replication, and other management functions are handled for the customer automatically. DocumentDB is backed by a 99.95% availability SLA (at GA) to provide consistent, reliable performance.
    Application controlled schema with massive scale-out enables iterative development and evolving data models. DocumentDB supports a schema-free data model where the application defines the data model. This supports modern application development scenarios where applications are developed iteratively with many versions supported concurrently and data models continuously evolve.
    Automatic indexing enables robust querying over schema-free data. DocumentDB is the first of its kind to offer SQL over schema-free JSON data and multi-document transactional processing.
    Integrated transactional JavaScript processing + tunable consistency enable high performance application experiences. DocumentDB supports stored procedures, triggers, and user-defined functions. It also supports tunable consistency with well-defined click stops to enable developers to tune database performance based on the application’s needs.

    The key scenarios for DocumentDB are the following:
    Emitting telemetry and logging data
    Storing/querying event and workflow data
    Persisting device and app configuration data
    User generated content
    Scalable, iterative app development
  • Many applications use search as the primary interaction pattern for their users. When it comes to search, user expectations are high. Google and Bing have trained users to expect great relevance, suggestions, and solid linguistics that effortlessly handle spelling mistakes, near-instantaneous responses and more.

    Azure Search is search-as-a-service that helps Developers, Architects, and ITDMs build sophisticated search experiences into web and mobile applications, reduce the friction and complexity of implementing full-text search, and differentiate their application by leveraging powerful features not available with any other search package. With Azure Search a developer can integrate search, one of the most popular navigation methods, into their application more easily than they could by running their own search package. They also get access to rich and powerful capabilities to enhance the search experience and tie results to business objectives.

    Offered in combinable units that include reliable storage and throughput, Azure Search allows developers to set-up and scale a search experience quickly and cost-effectively. As the volume of data or throughput needs of an application change, Azure Search can scale out to meet these needs and then scale back down to reduce costs.
     
    Fully managed PaaS search-as-a-service backed by the power of Microsoft Azure removes some of the complexity around providing search. Standing up a search service, tuning it, and worrying about index corruption takes special skills and is quite complex. Azure Search removes much of this complexity because it is a managed service.
    Supports sophisticated search functionality such as auto-complete, hit highlighting, ranking, faceting, and geo-spatial search. “Out-of-the-box” support for sophisticated search features to enable powerful search experiences.
    Easily tune search results to support business objectives. Promote search results that you want to show up. For example, in an e-commerce example, have high margin items show up higher in search results than lower margin items.

    Guaranteed throughput and dedicated storage which easily scales out as the application’s search needs grow. Many sites have high load variability. Azure Search makes it easy to scale up the service to handle increased requests or increased amounts of indexed data. At GA, Search will provide a 99.9% SLA when two or more replicas are used.
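    The idea of tuning results toward a business objective, such as ranking high-margin items above lower-margin ones, can be sketched as a simple re-ranking step. This is an illustration of the concept only and does not use the Azure Search API; the function name `rerank`, the `relevance` and `margin` fields, and the `margin_weight` parameter are all hypothetical.

```python
def rerank(results, margin_weight=0.5):
    """Sort results by text relevance boosted by profit margin.

    Each result is a dict with 'relevance' (text match score) and
    'margin' (fraction between 0 and 1). A margin_weight of 0 leaves
    the ordering to relevance alone.
    """
    def boosted_score(item):
        return item["relevance"] * (1 + margin_weight * item["margin"])

    return sorted(results, key=boosted_score, reverse=True)
```

    With `margin_weight=0.5`, an item with relevance 0.9 and margin 0.6 (boosted score 1.17) outranks an item with relevance 1.0 and margin 0.1 (boosted score 1.05), even though its text match is slightly weaker.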
  • Key goal of slide: Describe the three ways a customer can deploy Big Data from Microsoft.

    Slide talk track:
    Pioneered in the Jim Gray Systems Lab by David DeWitt, PolyBase is an integrated query processor in SQL Server 2012 Analytics Platform System. It represents a breakthrough innovation over traditional query processing, joining structured data with unstructured data from Hadoop. Without manual intervention, the PolyBase Query Processor can accept a standard SQL query and combine tables from a relational source with tables from a Hadoop source directly through external tables.  In addition, the PolyBase Query Processor parallelizes data import and export to and from Hadoop, giving APS speed, simplicity, and responsiveness in addressing these new types of queries.

    Ability to issue standard T-SQL that joins relational data with unstructured data in Hadoop
    PolyBase rapidly imports/exports data between Hadoop and APS in parallel
    PolyBase can query Hadoop data in other Hadoop distributions such as Hortonworks or Cloudera.
    PolyBase can also query across Microsoft Azure HDInsight, providing a Hybrid Cloud solution to the data warehouse
    PolyBase can query data in Hadoop directly without movement (with external tables)
    Created in the “Jim Gray Systems Lab” by David DeWitt

  • https://customers.microsoft.com/Pages/CustomerStory.aspx?recid=18356
  • Collect + Manage
    Transform + Analyze
    Visual + Decide
    Access Methods
    Product Groupings
    Modern Data Warehouse
    Sample architectures



  • http://www.jenunderwood.com/2014/12/08/business-analytics-101/
  • MS.Com Consumer Portals
    400m unique visitors/month
    4 Billion Events/Month
    2.2 Billion Page Views/Month


    Microsoft.com is a key entry point for customers looking for various products and solutions. As a result, it presents a unique opportunity to target customers by audience for targeted product and service offerings and campaigns. Previously, different campaigns were displayed to different customers randomly. But the MSCOM team wanted to take advantage of this web property to segment and target incoming users more effectively to improve customer engagement and drive better business results. This entailed presenting incoming users with information that was most relevant to them.

    MSCOM deployed an advanced analytics solution that operationalizes how the Microsoft home page and its related display advertising network presents information to returning customers. It pulls in customer behavior not only across Microsoft.com, but also from other Microsoft web properties, including TechNet, MSDN, the Microsoft Store, and more. As a result, returning visitors from any Microsoft website are presented with product information and campaigns they have shown interest in and are most likely to engage with. The solution includes audience discovery across all web properties, and sophisticated segmentation and modeling, which leads to effective targeting. Finally, campaign owners are presented with detailed reports so that they can optimize campaigns as required.

    As a result of the solution, the MSCOM team has seen dramatic results. The team has identified nearly 25 different high-value customer segments that significantly contribute to business success, and many of these segments show far higher conversion rates than with no targeting. In some cases, reach and conversion have been twice as high for a targeted group that requires a fraction of the budget. In addition, some campaigns have seen between 11 and 40 times higher return on investment.
  • MSN is a content-driven business that serves hundreds of millions of people around the world with fresh and relevant content. To capture and retain user interest, MSN needs to know exactly who its users are, what interests them, and how they behave on the site. When armed with comprehensive data, the team can create and serve up content that users care about, organize it more effectively, and create targeted advertising solutions. And that contributes to the bottom line.

    MSN uses advanced analytics to capture and process a huge amount of user information. The team gets complete views of its users from every possible angle. Real-time dashboards and insights enable immediate change when required, while deep BI helps inform future content. These complete views of users also enable the business unit to offer advertisers extremely targeted advertising solutions.

    For MSN, analytics has led to greater agility, which is critical for any content business. A dashboard that provides instant access to user behavior means the team can rotate content on demand. For instance, it can see which modules on the home page are most popular, and make them more visible to boost click-through rates even more. This step is critical, because about 85 percent of traffic comes straight from the home page.

    With detailed user data and competitive insight, MSN can serve up content that customers care about—and based on customer demographic data, can plan content in advance.

    With highly targeted advertising solutions, advertisers know that they’re getting high-value solutions. For instance, if a customer wants to target, say, women aged 35 and above, MSN knows exactly what time to place those ads and on what sites. And like any online content business, MSN is always seeking to improve the site and user experience. Analytics enables constant improvement without the risk of breaking the experience. For instance, let’s say the design team is working on a refresh of the look and feel. Armed with data about operating system, screen size, screen resolution, and so on, they can recreate designs without worrying that the experience will be broken.
  • Each year, cybercrime takes a personal and financial toll on millions of consumers and causes enormous damage to businesses, governments and economies across the world. To address this growing problem, Microsoft has created a center of excellence for advancing the global fight against cybercrime, which includes malicious software crimes, IP crimes, and technology-facilitated child-exploitation crimes. The Microsoft Cybercrime Center uniquely combines the Microsoft Digital Crimes Unit, legal and technical expertise as well as cutting-edge tools and technology, marking a new era in effectively fighting crime on the Internet. Through cooperative efforts with customers, industry, academic, and criminal law enforcement organizations and other industry partners, the Microsoft Cybercrime Center aims to protect consumers online and make the internet safe.

    This mission requires massive capability and an extremely flexible architecture that can onboard millions (and billions) of new transactions each time a new botnet is taken down. In addition, each and every server transaction needs to be appropriately captured by high-end load balancing systems, transformed into a standard format, enhanced with locational data and attributes of ownership, and then stored in a PDW for analysis and provision to partners helping fight global cybercrime. To achieve this, and to process more than 200 million transactions every day, the Microsoft Cybercrime Center includes a BI solution based on Microsoft Office and Office 365, SQL Server, Windows Azure, and SharePoint.

    Working with partners, which include global law enforcement, industry leaders, and a wide range of technology companies, the Microsoft Cybercrime Center has been able to take down and disrupt eight of the largest botnets in the world in just four years. Most recently these included ZeroAccess, Citadel, and Bamital. In addition, the Center helped bring one of the first cybercriminals to trial.



  • Cybercrime has cost governments, corporations and the public billions in recent years. But the techniques and level of proof required to solve enterprise cybercrime problems have been extremely challenging in the past. In particular, lost revenue from software piracy is impacting the enterprise bottom line.

    The Microsoft Digital Crimes Unit is effectively stopping unlicensed activity and piracy. Working with global law enforcement agencies (including Interpol, the FBI, and Europol) and with Microsoft product engineering and business teams, it combined cyber forensics, big data analysis, and machine learning techniques to identify the many different ways software is pirated. As part of the process, it mines vast amounts of Microsoft product key data and then applies predictive models to uncover behavior that signifies illegal product usage.

    The results have been dramatic.

    So far, three massive operations in Vietnam, Singapore, and China have been halted, with an immediate impact of $5 million in revenue recovery and more to come. In addition, the unit has brought several legal cases to court.
  • The Microsoft Store was founded in 2009 with an online presence and three physical stores. By the end of 2014, there will be 101 stores, and the number keeps on growing. With this rapid growth, senior management needed a better way to analyze key business data in its weekly business meeting. But the right reports took finance about 16 hours to compile and still presented only static views. As a result, management wasted time going back into the data trying to find the right insights.

    Using Power BI, the team transformed the process of generating reports. It used to be a cumbersome process that included aggregating and analyzing data, delivering data in the Excel template, and then cutting and pasting visualizations into PowerPoint. Now it's as simple as inserting the weekly data, because Power BI creates the visualizations automatically. In addition, the solution is completely interactive, so meetings are spent analyzing insights, not finding the right data. Best of all, the solution took only three people about a week to create, and most of that time was spent loading historical data.

    This simple change in reporting has led to far-reaching impacts. The finance team has reduced the time it takes to create reports from 16 hours to 2 hours. And management has been able to redirect how they spend their time. They can visualize data instantly without going back to Excel tabs. And they can answer questions themselves immediately, rather than asking for additional information. As a result, they spend more time analyzing information—and less time trying to find it.

  • Created in 2009, the Microsoft Store generates more than U.S.$1.5 billion in revenue each year. Of this number, about 2/3 comes from online sales, with about 170 million unique visitors and a projected 30 online device markets by the end of the year.
    To better manage online sales, the team needed faster, more actionable insight. The data was all there in about 30 different systems, but it was hard to access because it required going into different systems and pulling information. And mashing up data was an impossible task because of different formats and systems. Although a daily flash report provided some good reporting, it wasn’t available often enough to manage sales data during critical selling events or campaigns, like Black Friday. In addition, reporting was static, which limited deeper analysis.

    The Microsoft Store team created a web-based BI reporting dashboard that pulls together data from all 30 systems and presents it in a single, easy-to-view interface. People can view sales data in real time by any pivot they want, such as geography or device, and home in on the insight they need. Any information they need is right there at their fingertips: they can look at sales data by device or region, track data against budgets, drill down into every pivot for granular detail, and create custom reports. Best of all, web reports are automatically shown in Power View for rich visualizations, and people can access data anytime and anywhere.

    The Store Dashboard has been a real game changer for the Microsoft Store—and the benefits are phenomenal. It’s reduced data gathering and analytics from hours to the time it takes to log on. It’s sped up the flow of insight with on-demand information, rather than once-a-day metrics. And because it brings together all data sources, team members no longer waste time logging into different systems trying to access the right information—but still not getting the insight they need.
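
    To make the "view by any pivot and track against budget" idea concrete, here is a small sketch. It is an assumption-laden illustration, not the Store Dashboard's code: the row layout (`region`, `device`, `amount`), the budget figures, and both function names are hypothetical.

```python
from collections import defaultdict


def pivot_sales(rows: list, pivot: str) -> dict:
    """Total the 'amount' field of each sales row by the chosen pivot column."""
    totals = defaultdict(float)
    for row in rows:
        totals[row[pivot]] += row["amount"]
    return dict(totals)


def variance_to_budget(totals: dict, budget: dict) -> dict:
    """Return actual minus budget for every key present on either side."""
    return {
        key: totals.get(key, 0.0) - budget.get(key, 0.0)
        for key in set(totals) | set(budget)
    }


# Hypothetical sales rows, standing in for data merged from the source systems.
sales = [
    {"region": "US", "device": "Surface", "amount": 100.0},
    {"region": "US", "device": "Xbox", "amount": 50.0},
    {"region": "UK", "device": "Surface", "amount": 25.0},
]

by_region = pivot_sales(sales, "region")
by_device = pivot_sales(sales, "device")
region_variance = variance_to_budget(by_region, {"US": 120.0, "UK": 40.0})
```

    The same rows answer both the "by region" and "by device" questions, which is the point of bringing all sources into one place before reporting.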
  • One of the biggest challenges for organizations is harnessing the power of social media—particularly when it is viewed as a rich source of customer data. But with millions of Facebook posts, tweets, forums, and more in different languages, applying analytics is no small task.

    Gestalt was born in 2009 out of an effort to analyze all the Windows 7 beta feedback. Understanding risks and concerns around compatibility was important prior to launch. For customer service, the main function is to understand what is happening across customer service centers, figure out what's going on, and operationalize the process as much as possible to streamline how quickly and easily issues are resolved. But the unstructured data coming in was overwhelming. So the team had a brainstorm: to look at social media as the worst-case scenario for unstructured data. If it could solve the challenge of applying analytics to social media, the biggest behemoth there is in terms of unstructured data, then it could solve nearly all similar problems where we actually owned the semi-structured data. It also filled a gap, because customer service already had a lot of information from surveys, product reviews, and so on, but nothing that measured social media, especially at the level of detail we needed (including double-byte support). With ubiquitous terms like Windows and Office as brand names, we needed richer capabilities than basic Boolean definitions. The initial harnessing of social data for support and customer service led to the creation of the @MicrosoftHelps branded support channels on Twitter, Facebook, and Sina Weibo across 16 languages. Once this capability stepped out of incubation and into “normal operations” for CSS, the team continued on its journey to enhance the analytics capabilities and expand data sets.

    The end result is Gestalt, a full-featured platform for real-time, interactive, and visual analytics for any unstructured data. Although social (and support) was the original focus, Gestalt has since become a resource for the company to analyze various direct feedback channels. The best-known version of the platform at Microsoft is Gestalt{Social}, which parses massive amounts of customer-generated content in multiple languages, applies natural language processing, and then creates rich, custom visualizations with configurable filters (including full-text search). Today, Gestalt{Social} supports 60+ major products and 2,000 different areas with consistent text analytics. It connects teams across support, product, marketing, and development to resolve issues and get products to market faster, and it provides one set of data that can be analyzed in almost any way. In fact, Gestalt delivers data no one else can deliver, which is substantiated by 170 percent growth in the past year alone. The result is a single, consistent conversation across the company, which leads to better customer satisfaction across all teams.

    Here are some examples of how teams are using Gestalt:
    The Office team monitors customer reactions through KB articles after an update is released to ensure no features are broken, and to fix them quickly if they are.
    Iridias, an internal tool, takes aggregated counts by topic/subtopic and feeds in the past two hours of data to monitor outages in cloud services.
    Teams can quantify and analyze the data in Yammer (Microsoft internal “public” groups).
    Engineering applies machine learning to Gestalt data to understand features, scenarios, and issues more deeply and to model and recommend improvements.
    Teams use Gestalt for app feedback to learn what should be improved across Microsoft-owned apps and applications (both in-app feedback and store review data).
    Gestalt sits atop the formal complaints logged in //GetHelp.
    Support teams can also analyze transactional support data (not for all products) and the subsequent surveys.
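
    The Iridias example above (aggregated counts by topic/subtopic over the past two hours) can be sketched as follows. How Iridias actually decides that a count indicates an outage is not described here, so the spike rule, its thresholds, and all function names are assumptions made purely for illustration.

```python
from collections import Counter
from datetime import datetime, timedelta


def counts_in_window(events, now, window=timedelta(hours=2)):
    """Count (topic, subtopic) pairs for events inside the trailing window.

    events: iterable of (timestamp, topic, subtopic) tuples.
    """
    cutoff = now - window
    return Counter(
        (topic, subtopic)
        for timestamp, topic, subtopic in events
        if cutoff <= timestamp <= now
    )


def spikes(current, baseline, factor=3.0, min_count=5):
    """Assumed rule: flag keys with enough volume that exceed factor x baseline."""
    return sorted(
        key
        for key, count in current.items()
        if count >= min_count and count > factor * baseline.get(key, 0)
    )


# Hypothetical feedback events: six recent sign-in mentions, one stale, one unrelated.
now = datetime(2014, 1, 1, 12, 0)
events = [(now - timedelta(minutes=10 * i), "Office 365", "sign-in") for i in range(6)]
events.append((now - timedelta(hours=3), "Office 365", "sign-in"))
events.append((now - timedelta(minutes=5), "Azure", "storage"))

current = counts_in_window(events, now)
flagged = spikes(current, baseline={("Office 365", "sign-in"): 1})
```

    Comparing against a baseline rather than a fixed count keeps routinely busy topics from being flagged just because they are always discussed.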









