
Azure Stream Analytics



Azure Stream Analytics (ASA) is an Azure service that enables real-time insights over streaming data from devices, sensors, infrastructure, and applications. In this presentation, we provide an introduction to the service, common use cases, example customer scenarios, and business benefits, and demonstrate how to get started. We will quickly build a simple real-time analytics application that uses an IoT device to ingest data (Event Hubs), process and analyze data (Stream Analytics), and visualize data (Power BI).



  1. About Me  Business Intelligence Consultant, in IT for 28 years  Microsoft, Big Data Evangelist  Owner of Serra Consulting Services, specializing in end-to-end Business Intelligence and Data Warehouse solutions using the Microsoft BI stack  Worked as desktop/web/database developer, DBA, BI and DW architect and developer, MDM architect, PDW developer  Been perm, contractor, consultant, business owner  Presenter at PASS Business Analytics Conference and PASS Summit  MCSE for SQL Server 2012: Data Platform and BI  SME for SQL Server 2012 certs  Contributing writer for SQL Server Pro magazine  Blog at JamesSerra.com  SQL Server MVP  Author of book “Reporting with Microsoft SQL Server 2012”
  2. Procure HW Infrastructure and setup  Code for ingress, processing and egress  Plan for resiliency, such as HW failures  Design solution  Build Monitoring and Troubleshooting
  3. 36
  4. © 2015 Microsoft Corporation. All rights reserved. Microsoft, Windows, Windows Vista and other product names are or may be registered trademarks and/or trademarks in the U.S. and/or other countries. The information herein is for informational purposes only and represents the current view of Microsoft Corporation as of the date of this presentation. Because Microsoft must respond to changing market conditions, it should not be interpreted to be a commitment on the part of Microsoft, and Microsoft cannot guarantee the accuracy of any information provided after the date of this presentation. MICROSOFT MAKES NO WARRANTIES, EXPRESS, IMPLIED OR STATUTORY, AS TO THE INFORMATION IN THIS PRESENTATION. James Serra jserra@microsoft.com Questions?

Notas del editor

  • Fluff, but point is I bring real work experience to the session
  • A tool for real-time processing of streaming data
  • Devices are becoming smarter and more connected, and the expectation of what can be done with the data generated and collected from these devices continues to evolve in both the commercial and consumer spaces. Gartner predicts the number of internet-connected devices will increase 30x to 26 billion in 5 years!

    As more and more data is generated from a variety of connected devices, the need to get insights from this data, and predict future behavior and trends is becoming more essential for businesses. This is needed in a variety of different industries such as Manufacturing, Oil and Gas, Automobile, Finance, Online Retail, Smart Grids, Healthcare etc.

    But how do you ensure this technology trend is an opportunity for your business instead of a blockade?

    Let’s start with understanding what is meant by streaming data.
  • What is the Internet of Things?

    Key Points:
    • Today, the Internet of Things (or IoT) is a difficult trend to define precisely as there is no standard definition for it, and everyone has a different meaning
    • Despite how seemingly complex it is, it essentially consists of four basic areas

    Talk Track:
    • Before we go any further, let’s take a moment to talk about the trend… the Internet of Things, or IoT
    • The Internet of Things really comes down to four key things: Physical “things” such as line of business assets, including industry devices or sensors
    • Those “things” that have connectivity to either the internet or to each other or humans
    • Those “things” have the ability to collect and communicate information – this information may include data collected from the environment or inputted by users
    • And then the analytics that comes with the data enable people or machines to take action
  • Key Points:
    Data from connected devices has new properties – in motion – need to get value out of it in “real-time” – often without human intervention.
    What changes with data being in motion instead of at rest?
    With data that is in motion, such as cars moving on the freeway, using existing solutions to answer questions isn’t great.
    With data in motion time is an important element.

    Talk track:
    Let’s take an example to differentiate at rest and in motion data.
    How would you answer the question: “How many red cars are in the parking lot”?
    Answering it with a relational database would be like walking out to the parking lot and counting the vehicles that are red.
    How would you answer the question: “How many red cars have passed exit 165 on I5 in the last hour” using a relational database?
    Answering it would be like pulling over and parking all the cars in a lot, keeping them there for an hour, and counting the red vehicles.
    Doesn’t work that well.

    So now, we know that technology is going to propel streaming data forward in a massive way in the next 5 years. What can you do with it to transform your business and gain a competitive edge?
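The data-in-motion question above can be sketched in code: instead of scanning a table at rest, keep a rolling window of recent events and count within it. A minimal illustrative sketch (the event shape and window mechanics are my own, not ASA internals):

```python
from collections import deque

class SlidingWindowCounter:
    """Count events matching a predicate within the last `window_seconds`."""

    def __init__(self, window_seconds):
        self.window_seconds = window_seconds
        self.events = deque()  # (timestamp, value) pairs, oldest first

    def observe(self, timestamp, value):
        self.events.append((timestamp, value))

    def count(self, now, predicate):
        # Evict events that have fallen out of the window.
        while self.events and self.events[0][0] <= now - self.window_seconds:
            self.events.popleft()
        return sum(1 for _, v in self.events if predicate(v))

# "How many red cars passed in the last hour?"
counter = SlidingWindowCounter(window_seconds=3600)
counter.observe(100, "red")
counter.observe(200, "blue")
counter.observe(4000, "red")
print(counter.count(now=4100, predicate=lambda color: color == "red"))  # 1: the first red car is over an hour old
```

The point of the sketch: with data in motion, time is part of the query itself, which is exactly what the relational "count the parked cars" model lacks.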

  • Call out a few examples. If you want to go into details on 3 of them, dive into next 3 slides.

    Key Points:
    There are many use cases for what customers want to do with streaming data, and they require new solutions to answer questions from data in motion.
    Let’s go into detail on a couple of them.

    Talk track:
    You do have more data coming in that you want to unlock insights from in real time, but this is very generic.
    Specific use cases for stream processing stretch across industries:
    Connected cars:
    Join weather data with brake-sensor data to determine that the car should automatically brake, helping you avoid black ice that has caused accidents miles ahead on your plotted course.
    Respond to and resolve traffic incidents and maintenance needs faster by integrating sensors, license plate reading systems, cameras, and social network data
    Smart Buildings
    Tracking entry for security to make sure a badge is not used at the same time in different buildings.
    Monitoring temperature changes in real-time to detect anomalies pointing to required maintenance.
    Even predicting preventative maintenance that can be done before something breaks using streaming data as historical inputs to a predictive model.

  • Real-time alerting with connected cars:
    If 4 or more cars slam on their brakes, raise an alert. Join weather data with brake-sensor data to determine that the car should automatically brake, helping you avoid black ice that has caused accidents miles ahead on your plotted course.
  • Facilities management and building maintenance:
    Government funded power company out of California wants to detect water loss, to enable proactive monitoring and eventually do predictive monitoring. When near pump want to get a notification that potential failure -- anomaly detection.
    They use sensors in the water main and facility to detect water leakage, water loss through evaporation, and other issues by comparing the volume coming into and out of a pipe.

    Similar scenarios seen in the Power Grid space

  • This is an example of Aerocrine using Azure Stream Analytics to remotely monitor their asthma detection devices.
  • Key Points:
    There are 3 canonical scenarios when we talk about Stream Analytics that categorize the use cases we just looked at.
    You can look at these 3 scenarios in different ways: 1) by the human interaction required to respond or take an action, or 2) by the response time required to make a decision.
    But how does a customer get started with implementing real-time stream processing today?

    Talk track:
    Enrich: Retain for Future Processing
    Large volumes of data coming in and customers don’t necessarily know what they want to do with the data yet but want to retain it for later processing, such as with HDInsight, or as input to develop machine learning models.
    Will provide minimal processing such as removing PII, adding geo-tagging, IP lookup, etc.
    Microsoft Cybercrime is an example of this scenario. They augment IP addresses they receive through IP lookups, stripping PII, then storing it to enable a heatmap of where most IPs are originating.
    Analytics: Telemetry/Log Processing
    Next step up the maturity and value curve although there is still a human performing an action, such as monitoring or responding to an alert, with the insight gleaned from the analytics of the real-time data. This is still limited by human response time which isn’t milliseconds and could be in the minutes or hours.
    As more data is gathered and processed Machine Learning can also be used to develop and learn from patterns seen in the system such that it is possible to better predict when machines may need to be serviced or when things are about to go wrong.
    An example of this scenario is a cloud service that has clusters producing tons of log events (servers), and there is a need to monitor in real-time to resolve the issues faster.
    If the same error event comes in from 3 separate streams in a 5 minute window, raise a ticket to resolve the issue before customers report it.
    Actions: IoT
    We’re talking about the command and control tight loop when talking about IoT. Devices/sensors send data, it is processed, a decision is made, and a command is sent back to the device to control it.
    This is up the maturity and value curve again and can produce millisecond response times.
    Another example is a vending machine sending inventory and health data. In some cases, the action that needs to be taken may be as simple as rebooting the machine or pushing down a firmware upgrade which can be done without the need for human interaction.
    As more data is gathered and processed Machine Learning can also be used to develop and learn from patterns seen in the system such that it is possible to better predict when machines may need to be serviced or when things are about to go wrong.
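The log-processing example above (raise a ticket when the same error arrives from 3 separate streams within a 5-minute window) can be sketched as follows. The event shape and function are illustrative, not a real ASA construct:

```python
from collections import defaultdict

def should_raise_ticket(events, window_seconds=300, min_streams=3):
    """events: list of (timestamp, stream_id, error_code).
    Return the error codes seen on at least `min_streams` distinct
    streams within some `window_seconds` span."""
    tickets = set()
    by_error = defaultdict(list)
    for ts, stream, code in sorted(events):
        by_error[code].append((ts, stream))
    for code, hits in by_error.items():
        for ts, _ in hits:
            # Distinct streams reporting this error inside the window starting at ts.
            streams = {s for t, s in hits if ts <= t < ts + window_seconds}
            if len(streams) >= min_streams:
                tickets.add(code)
                break
    return tickets

events = [
    (0,   "web-1", "E500"),
    (60,  "web-2", "E500"),
    (120, "web-3", "E500"),   # third distinct stream within 5 minutes
    (0,   "web-1", "E404"),
]
print(should_raise_ticket(events))  # {'E500'}
```

In ASA this correlation would be a windowed GROUP BY in the query language rather than hand-written code, which is the point of the next slides.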
  • Today, a customer would hit a number of problems with existing offerings – even StreamInsight, which is part of SQL Server – when building a real-time stream processing solution. Stream Analytics solves these challenges…

    Azure Stream Analytics decreases customer pain points in all the activities for creating a stream processing solution. With this we can reduce the time and resources required to get a customer’s solution up and running.
  • Walk through the phases of a stream processing solution on Microsoft Azure.
  • Key goal of slide:
    As we think about Azure services for IoT, there are a collection of capabilities involved.
    First there are producers. These can be basic sensors, small form factor devices, traditional computer systems, or even complex assets made up of a number of data sources.
    Next we have the Event Ingestion capabilities within and around Azure . The primary destination is Service Bus Event Hubs, but this relies on client agent technology either at the edge device level or within a field or cloud gateway.
    As data is ingressed to Azure, there can be a number of destinations engaged. Traditional database technology, table or blob, or even more complex destinations like Document DB are possible
    As this data is processed in Azure, there are a number of capabilities that can be utilized. Machine Learning, HD Insight, Stream Analytics are examples of tools that can process the data in various ways.
    Finally the concept of data presentation uses Azure services. Data may populate a LOB portal, be pushed to apps, or presented in analytics and productivity tools.

    Through all of these areas, there is the possibility of utilizing existing investments either within your Azure environment, or elsewhere.
  • Output of ASA: Blob storage, Event Hub, Power BI, SQL Database, Table storage
    Input: Data stream (Blob storage, Event Hub), Reference data (Blob storage)
  • Key Points:
    Stream Analytics is a real-time event processing engine as a managed service in the cloud that enables business transformation and competitive edge with fewer resources in less time.
    Optimizes for gaining real-time insights in less time with fewer people resources.
    Let’s dive into what makes Stream Analytics unique among all other streaming engines.
    We’ll start with “fully managed real-time analytics”:
    In connection with Event Hubs for ingestion of millions of events in real time, Stream Analytics performs analytics on this data. Streams can come from high volume sources such as clickstreams, log files, metering, and devices and combined with historical records. These insights can power dashboards or trigger alerts that drive real-time action.

    Talk track:
    By optimizing for gaining real-time insights in less time with fewer people, Stream Analytics eliminates:
    Infrastructure procurement/setup
    Developing a solution for ingress, processing, and egress
    Developing a solution for integrating with components like ML, BI
    Operationalizing the solution for resiliency and infrastructure failures
    Scaling up and down the solution
    Monitoring and troubleshooting
  • Key Points:
    Stream Analytics provides processing events at scale – millions per second – with variable loads analyzing the data in real-time – event correlating with reference data.

    Talk track:
    Processes millions of events per second
    Scale accommodates variable loads and preserves event order on a per-device basis
    Performs continuous real-time analytics for transforming, augmenting, correlating using temporal operations. This allows pattern and anomaly detection
    Correlates streaming data with reference – more static – data
    Think of augmenting events containing IPs with geo-location data or real-time stock market trading events with stock information.
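The reference-data correlation just described (augmenting events containing IPs with geo-location data) can be sketched in a few lines. In a real job this join is expressed in the ASA query language, with the lookup table typically loaded from Blob storage; the names and data here are illustrative:

```python
# Reference data -- in a real job this would be loaded from Azure Blob storage.
ip_to_geo = {
    "203.0.113.7": "Seattle",
    "198.51.100.9": "London",
}

def enrich(event, reference):
    """Augment a streaming event with reference data, as an ASA join
    against a reference-data input would."""
    enriched = dict(event)
    enriched["city"] = reference.get(event["ip"], "unknown")
    return enriched

stream = [
    {"ip": "203.0.113.7", "clicks": 3},
    {"ip": "192.0.2.1", "clicks": 1},   # not in the reference set
]
result = [enrich(e, ip_to_geo) for e in stream]
print(result[0]["city"])  # Seattle
```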
  • Key Points:
    Stream Analytics is a fully-managed PaaS offering on Microsoft Azure.
    A customer should focus on what is the stream processing logic and not the deployment.

    Talk track:
    Eliminates the need to acquire and maintain hardware.
    Removes the need to deploy. You can get up and running with a few clicks and within minutes in the Azure Management Portal.
    Use and pay for resources only when you need them.
    You can easily scale up and down as needed by your business.
  • Key Points:
    Startup costs are kept low from multi-tenancy and only paying for the resources you use.

    Talk track:
    Startup costs are kept low by sharing resources through multi-tenancy job execution. You pay for the resources you use and you can incrementally add resources.
    You can provision and run Stream Analytics for as little as $24/month per job for 1 MB/s throughput (= 1 streaming unit at $0.016/hr), plus $0.0005 per GB (a 50% preview discount) for the volume of data processed by the streaming job.
  • Key Points:
    Next, we’ll focus on “mission critical reliability and scale” all without having to write code:
    Deployed in the Azure cloud, Stream Analytics runs applications at cloud scale. You can request more or fewer resources at any time, and Stream Analytics will scale to any future volume of data while still providing high throughput, low latency, and guaranteed resiliency (no data will be lost and no incorrect output produced).

    Talk track:

  • Key Points:
    Stream Analytics has built-in guaranteed event delivery and business continuity which is critical for providing reliability and resiliency.

    Talk track:
    You will not lose any events.
    The service provides exactly once delivery of events. You don’t have to write any code for this and you can use it to replay events on failures or from a particular time based on the retention policy you have setup with Event Hubs.
    99.9% (“three nines”) availability is built into the service.
    Recovery from failures does not need to start at the beginning of a window. It can start from when the failure occurred in the window.
    This enables businesses to be as real-time as possible.
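The replay behavior described above can be illustrated with a toy offset checkpoint: after a failure, the consumer resumes from its last checkpoint within the retained stream rather than from the beginning. This is a sketch of the general pattern, not the actual Event Hubs/ASA implementation:

```python
class CheckpointedConsumer:
    """Toy model of offset-based replay: after a crash, resume from the
    last checkpoint rather than the start of the retained stream.
    (Illustrative only -- Event Hubs tracks per-partition offsets.)"""

    def __init__(self, retained_events):
        self.retained = retained_events  # events kept per the retention policy
        self.checkpoint = 0              # offset of the next event to process
        self.processed = []

    def process_next(self):
        event = self.retained[self.checkpoint]
        self.processed.append(event)
        self.checkpoint += 1  # advance only after successful processing

    def recover(self):
        # On failure, re-read from the checkpoint: already-processed
        # events are not emitted a second time.
        return self.retained[self.checkpoint:]

consumer = CheckpointedConsumer(["e0", "e1", "e2", "e3"])
consumer.process_next()
consumer.process_next()
print(consumer.recover())  # ['e2', 'e3'] -- replay resumes mid-stream, not at e0
```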
  • Key Points:
    Scale is built into the service using a scale-out distributed architecture.

    Talk track:
    Scale is controlled using a slider in the Azure Management Portal and not writing code. You can scale up and scale down which is important. For example, when the stock market is closed for the day, you don’t need to process any of the non-existent events coming in so why keep the resources up?
  • Key Points:
    Reduce friction and complexity by abstracting the complexities of writing code for scale out over distributed systems and for custom analytics. Instead, developers need only describe the desired transformations using a declarative SQL-like language and the service will handle everything else.


    Talk track:
  • Stream Analytics gives developers the fastest productivity experience by abstracting the complexities of writing code for scale out over distributed systems and for custom analytics. Instead, developers need only describe the desired transformations using a declarative SQL language and the system will handle everything else.

    Key Points:
    Normally, event processing solutions are arduous to implement because of the amount of custom code that needs to be written. Developers have to write code that reflects distributed systems taking into account coding for parallelization, deployment over a distributed platform, scheduling and monitoring. Furthermore, code for the analytical functions also must be written.
    While other cloud services for the most part have solutions that handle programming over the distributed platform, likely their code still is procedural and thus lower level and more complex to write (as compared to SQL commands).
    On-premises software may not even be designed to scale to data of high volumes through distributed scale out architectures.

    Talk track:
    Developers focus on using a SQL-like language to construct stream processing logic and not worrying about accounting for parallelization, deployment to a distributed platform or creating temporal operators.
    Use the SQL-like language across streams to filter, project, aggregate, compare reference data, and perform temporal operations.
    Development, maintenance, and debugging can be done entirely through the Azure Management Portal.
    For public preview, the supported endpoints are:
    Input: Azure Event Hubs, Azure Blobs
    Output: Azure Event Hubs, Azure Blobs, Azure SQL Database, Azure Tables
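As a rough illustration of what the SQL-like language does for you, here is the logic of a tumbling-window count (in ASA this would be roughly `SELECT COUNT(*) FROM input GROUP BY TumblingWindow(second, 10)`) written out imperatively in Python:

```python
from collections import Counter

def tumbling_window_counts(timestamps, window_seconds):
    """Count events per tumbling window: each event falls into exactly
    one fixed-size, non-overlapping window. This is the imperative code
    an ASA developer does NOT have to write."""
    counts = Counter()
    for ts in timestamps:
        window_start = (ts // window_seconds) * window_seconds
        counts[window_start] += 1
    return dict(counts)

timestamps = [1, 4, 9, 12, 13, 25]
print(tumbling_window_counts(timestamps, window_seconds=10))
# {0: 3, 10: 2, 20: 1}
```

In a distributed engine this simple loop additionally needs partitioning, scheduling, and fault tolerance, which is exactly the complexity the declarative query hides.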
  • Key Points:
    Stream Analytics has built temporal operators right into its SQL-like language relieving a developer of having to write code for them.
    Configure out-of-order events and decide how to handle late arriving events in the Azure Management Portal instead of writing code.

    Talk track:
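The out-of-order and late-arrival handling mentioned in the key points can be sketched with a simple watermark policy. The tolerance logic below is illustrative only; in ASA you configure the policy (drop vs. adjust) in the portal rather than writing this code:

```python
def handle_late_events(events, watermark_delay):
    """Process events in arrival order; an event whose timestamp is more
    than `watermark_delay` behind the maximum timestamp seen so far is
    treated as late. (This sketch drops late events; adjusting their
    timestamp instead is the other common policy.)"""
    max_seen = float("-inf")
    accepted, dropped = [], []
    for ts, payload in events:
        max_seen = max(max_seen, ts)
        if ts < max_seen - watermark_delay:
            dropped.append((ts, payload))
        else:
            accepted.append((ts, payload))
    return accepted, dropped

# Events listed in arrival order; the third arrived well out of order.
arrival_order = [(10, "a"), (11, "b"), (3, "late!"), (12, "c")]
accepted, dropped = handle_late_events(arrival_order, watermark_delay=5)
print(dropped)  # [(3, 'late!')]
```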
  • Key Points:
    Stream Analytics has built-in monitoring and scheduling without needing to write code so a developer can focus on the stream processing logic.

    Talk track:


    Microsoft offers both on-premises and cloud-based real-time stream processing options. StreamInsight is offered as part of SQL Server and should be used for on-premises deployments.

    The Microsoft Azure platform offers a vast set of data services, and while it’s a luxury to have such a broad array of capabilities to select from, it can also present a challenge. Designing a solution requires that you evaluate which offerings are best suited to your requirements as part of the planning and design project phases. There are a number of instances where Azure provides similar platforms for a given task.

    For example, Storm for Azure HDInsight and Azure Stream Analytics are both platform-as-a-service (PaaS) offerings providing real-time event stream processing.

    Both of these services are highly capable engines suitable for a range of solution deployments; however, some of their differences will influence which service is best suited to a project.

    Storm for Azure HDInsight is an Apache open-source stream analytics platform running on Microsoft Azure to do real-time data processing. Storm is highly flexible with contributions from the Hadoop community and highly customizable through any development language like Java and .NET (deep Visual Studio IDE integration).

    Azure Stream Analytics is a fully managed Microsoft first party event processing engine that provides real-time analytics in a SQL-based query language to speed time of development. Stream Analytics makes it easy to operationalize event processing with a small number of resources and drives a low price point with its multi-tenancy architecture.
  • The full functionality of Azure Stream Analytics is available through REST APIs.
    REST APIs enable programmatic access either through a browser (using JavaScript, for example), through a native Windows/Linux client, or even from a mobile device.
    REST APIs are useful for automation through scripting.
    They can also be used to embed access to ASA in other management tools.
    All task operations conform to the HTTP/1.1 protocol specification and each operation returns an x-ms-request-id header that can be used to obtain information about the request. You must make sure that requests made to these resources are secure. For more information, see Authenticating Azure Resource Manager requests.
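A sketch of what such a REST call looks like when built by hand. The resource path follows the standard ARM convention for the `Microsoft.StreamAnalytics` provider, but the `api-version` value here is an assumption to verify against current documentation:

```python
def build_asa_job_request(subscription_id, resource_group, job_name,
                          bearer_token, api_version="2020-03-01"):
    """Build the URL and headers for a GET on a Stream Analytics job via
    Azure Resource Manager. The api-version shown is an assumption --
    check the current REST reference for the supported value."""
    url = (
        "https://management.azure.com"
        f"/subscriptions/{subscription_id}"
        f"/resourceGroups/{resource_group}"
        "/providers/Microsoft.StreamAnalytics"
        f"/streamingjobs/{job_name}"
        f"?api-version={api_version}"
    )
    headers = {
        "Authorization": f"Bearer {bearer_token}",  # Azure AD token per ARM auth
        "Content-Type": "application/json",
    }
    return url, headers

url, headers = build_asa_job_request("sub-id", "my-rg", "my-job", "TOKEN")
print(url)
```

The response would carry the `x-ms-request-id` header mentioned above, which you can log for support and diagnostics.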
  • http://gallery.azureml.net/Tutorial/6f95aeaca0fc43b3aec37ad3a0526a21

    Connect sensor to laptop with Bluetooth
    Create Azure EventHub (creates Service Bus namespace)
    Create EventHub rules
    Create app to read sensor data and send it to EventHub. Configure Event Hub, Service Bus, and access policy name and access key. Run app
    Create Azure Stream Analytics job
    Add ASA input – Data stream
    Add ASA output – Power BI
    Add ASA query
    Start ASA job
    Create Power BI reports
  • The Swedish medical device company provides devices like NIOX MINO and NIOX VERO, which measure fractional exhaled nitric oxide (FeNO), a biomarker for airway inflammation. These instruments are used by thousands of physicians and nurses in hospitals and asthma clinics around the globe to identify asthma and monitor patients’ progress in controlling the disease.

    Millions of asthma sufferers worldwide depend on Aerocrine monitoring devices to diagnose and treat their disease effectively. But those devices are sensitive to small changes in ambient environment. The company had another challenge: it couldn't transmit data for remote monitoring by telemetry. That’s why Aerocrine is using a cloud analytics solution to boost reliability.

    They used the Azure platform to gather telemetry data from computers connected to more than a dozen global NIOX MINO devices and then transmit the data for analysis. The company developed an application that transmits complex data like sensor and environmental data from each instrument’s sensor, the serial number of devices and sensors, and the number of airflow measurements remaining on each instrument. The data is transmitted to Microsoft Azure to be ingested by Azure Event Hubs.

    By using Azure Event Hubs as well as Azure Stream Analytics—a real-time event processing engine for uncovering insights from device data and presenting it in data streams—Aerocrine can capture and manage real-time data about FeNO devices for the first time. That data will eventually be presented to Aerocrine’s sales and customer service representatives via Microsoft Power BI for Office 365 dashboards.

    Benefits:
    Proactively Alerting Customers to Minimize Device Downtime
    Discovering Trigger Points and Transforming the Business
    Helping Physicians Deliver Improved Asthma Care

    https://customers.microsoft.com/Pages/CustomerStory.aspx?recid=12216
  • Key Points:
    We have a number of programs and trials to help you get started exploring the Microsoft data platform

    Talk Track:
    We want to make it easy for you to explore the Microsoft data platform. Go online and trial our products like Power BI and Azure.
    Or take advantage of our great Immersion Experience Program where you get a hands-on, guided tour of our end-to-end BI capabilities.
    Feel free to make contact with your Microsoft or Microsoft Partner Representative to schedule a call to learn more about what we can do with you.

    Thank you so much for your time.


  • Streaming unit – Unit of compute capacity (1 streaming unit = 1 MB/s throughput)

    Key Points:
    Stream Analytics has low startup costs. Multi-tenancy lowers costs you incur by sharing processing resources between jobs.
    It is billed per job based on the volume of data processed per day and the number of streaming units used per hour.
    In public preview, a job can scale up to 12 streaming units.

    Talk track:
    A streaming unit represents the resources provided to a query to execute against the volume of data being ingested up to 1 MB/s.
    The number of streaming units used by a job can be scaled up or down in the Azure Management Portal and is maxed out at 12 for public preview.
    The complexity of the query and the required rate of return (latency) for the query influence the required number of streaming units.
    For public preview, the number of streaming units to use is a bit of trial and error to achieve the desired throughput or rate of return (latency).
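The figures above can be turned into a rough cost estimator. The rates are the preview rates quoted on these slides ($0.016/hr per streaming unit, $0.0005 per GB processed), not current pricing, and the 730-hour month is a convention:

```python
def monthly_cost(streaming_units, avg_mb_per_second,
                 su_rate_per_hour=0.016, gb_rate=0.0005, hours=730):
    """Estimate a job's monthly bill from the slide's preview rates:
    streaming units billed per hour, plus volume of data processed per GB.
    Illustrative only -- consult current Azure pricing for real numbers."""
    su_cost = streaming_units * su_rate_per_hour * hours
    gb_processed = avg_mb_per_second * 3600 * hours / 1024
    data_cost = gb_processed * gb_rate
    return round(su_cost + data_cost, 2)

# One streaming unit sustaining 1 MB/s for a month:
print(monthly_cost(streaming_units=1, avg_mb_per_second=1.0))
```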
  • This is for a single job using GA pricing achieving 1 MB/sec of average processing using a single Event Hub input.
  • Now let us dig deeper into what a typical ASA application looks like. An ASA application has three major components:
    Input – Inputs are the sources of events. Note that the ‘original’ sources of streaming events are devices, machines, sensors, applications, etc. However, ASA is not intended to connect to them directly. Rather, ASA lets Azure Event Hubs be the primary interface to the wide variety of event sources. ASA is optimized to get streaming data from Azure Event Hubs and Azure Blob Storage. Azure Blob Storage is the likely place where log or reference data is stored. The list of input sources that ASA directly integrates with may increase in the future, but Azure Event Hubs and Azure Blob Storage will be the primary sources. Each Stream Analytics job can use multiple inputs drawn from Azure Event Hubs and Azure Blob Storage.
    Query – Queries are the main component of an ASA application. Queries implement the “analytics logic”: a set of transformations applied to the input stream to produce another set of output events. Queries are the only thing that an ASA application developer actually develops; everything else is done through guided wizards in the Azure Portal. Note that ASA has a SQL-like query language, but unlike traditional databases, ASA queries run continuously against the stream of incoming events. The queries stop being applied only when the job itself stops.
    Output – As queries execute they continuously produce results. The results can be stored in Blob Storage, Event Hubs, Azure Tables and Azure SQL database. Note that if the output is stored in Event Hub or Blob Storage, it can become the input to another ASA job. So it is possible to ‘chain’ together multiple jobs to implement a series of transformations. A single ASA job can output to multiple permanent stores.
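The chaining idea in the Output component above can be sketched with a toy model where each job is a function over an event stream, and one job's output feeds the next (the filter and alert logic are invented for illustration):

```python
def run_job(query, input_events):
    """Toy model of an ASA job: apply a query to every input event;
    a None result means the event is filtered out."""
    out = []
    for event in input_events:
        result = query(event)
        if result is not None:
            out.append(result)
    return out

# Job 1 filters; its output (landed in an Event Hub or Blob in a real
# deployment) becomes the input to job 2, which transforms.
job1 = lambda e: e if e["temp"] > 30 else None
job2 = lambda e: {"device": e["device"], "alert": f"overheat: {e['temp']}C"}

sensors = [{"device": "d1", "temp": 25}, {"device": "d2", "temp": 35}]
stage1 = run_job(job1, sensors)
stage2 = run_job(job2, stage1)
print(stage2)  # [{'device': 'd2', 'alert': 'overheat: 35C'}]
```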
  • Speaker Notes:
    Mention that this slide provides an at-a-glance overview of the query capabilities of the ASA query language. As noted earlier, while most of the operators are vanilla T-SQL capabilities, there are some features specific to the analysis of streaming data, such as the windowing extensions, scaling functions, DATEDIFF, etc.
    The subsequent slides will go into these functions in greater detail.
