Time Series Anomaly Detection with Azure and .NET

Marco Parenzan, Senior Solutions Architect @ beanTech, Microsoft MVP
Time Series Anomaly Detection
with Azure and .NET (part 1)
Marco Parenzan // @marco_parenzan
Marco Parenzan
• Senior Solution Architect @ beanTech
• 1nn0va Community Lead (Pordenone)
• Microsoft Azure MVP
• Profiles
• Linkedin: https://www.linkedin.com/in/marcoparenzan/
• Slideshare: https://www.slideshare.net/marco.parenzan
• GitHub: https://github.com/marcoparenzan
This is the journey of…
• …a .NET developer…
• …or an IoT developer…
• …a one-man band (sometimes)…
• …facing typical data science world topics…
• …that wants to use .NET everywhere!
A typical scenario
Scenario
• In an industrial fridge, you monitor temperatures not to check the
temperature «per se», but to check the health of the plant
From real industrial fridges 
[Diagram: Devices, Events, IoT Hub (Ingest), Storage Account]
The batch point of view...
With no specific request...
what is IoT all about?
Efficiency Anomalies
Batch Streaming
Time Series
• Definition
• A time series is a sequence of data points recorded in time order, often taken at successive,
equally spaced points in time.
• Examples
• Stock prices, Sales demand, website traffic, daily temperatures, quarterly sales
• Time series is different from regression analysis because of its time-dependent
nature.
• Auto-correlation: Regression analysis requires that there is little or no autocorrelation in the
data. It occurs when the observations are not independent of each other. For example, in
stock prices, the current price is not independent of the previous price. [The observations
have to be dependent on time]
• Seasonality, a characteristic which we will discuss below.
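The auto-correlation point above can be checked numerically. The following is a minimal sketch, assuming a short list of equally spaced readings; the `autocorrelation` helper and the sample temperatures are hypothetical and not part of the original deck.

```python
from statistics import mean

def autocorrelation(series, lag=1):
    """Sample autocorrelation of `series` at the given lag (between -1 and 1)."""
    if not 0 < lag < len(series):
        raise ValueError("lag must be between 1 and len(series) - 1")
    mu = mean(series)
    # Denominator: total squared deviation from the mean.
    denom = sum((x - mu) ** 2 for x in series)
    if denom == 0:
        return 0.0  # constant series: nothing to correlate
    # Numerator: co-movement of each point with the point `lag` steps earlier.
    num = sum((series[i] - mu) * (series[i - lag] - mu)
              for i in range(lag, len(series)))
    return num / denom

# Hypothetical fridge temperatures (°C), sampled at equal intervals.
temps = [4.0, 4.2, 4.5, 4.9, 5.2, 5.0, 4.6, 4.3, 4.1, 4.0]
print(round(autocorrelation(temps, lag=1), 3))
```

A value close to 1 at lag 1 means each reading is strongly tied to the previous one, which is the dependence that plain regression assumes away.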
Anomaly Detection in Time Series
• In time series data, an anomaly or outlier is a data point that does
not follow the common collective trend, or the seasonal or cyclic
pattern, of the entire data and is significantly distinct from the
rest of the data. By significant, most data scientists mean
statistical significance, which in other words means that the
statistical properties of the data point are not in alignment with
the rest of the series.
• Anomaly detection has two basic assumptions:
• Anomalies only occur very rarely in the data.
• Their features differ from the normal instances significantly.
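One simple way to make "statistical properties not in alignment with the rest" concrete is a robust outlier score. The sketch below uses the modified z-score (median and median absolute deviation) with the commonly used 3.5 cut-off; it is an illustration chosen here, not a method from the deck, and it ignores time order and seasonality. The function names and sample data are hypothetical.

```python
from statistics import median

def modified_z_scores(values):
    """Modified z-score of each value, based on the median and the MAD."""
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        return [0.0] * len(values)  # no spread: nothing stands out
    return [0.6745 * (v - med) / mad for v in values]

def find_outliers(values, threshold=3.5):
    """Indices whose modified z-score exceeds `threshold` in absolute value."""
    scores = modified_z_scores(values)
    return [i for i, z in enumerate(scores) if abs(z) > threshold]

# Hypothetical fridge temperatures (°C) with one rare, clearly different reading.
readings = [4.0, 4.1, 3.9, 4.0, 4.2, 9.5, 4.1, 4.0, 3.9, 4.1]
print(find_outliers(readings))
```

Both assumptions listed above matter here: if anomalies were frequent, they would shift the median and the MAD and stop standing out.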
Threshold anomalies?
• Threshold Anomalies for a time window
• Slow changing damages
• Fridge is no longer efficient
• Threshold alarms are not enough
• Anomalies cannot be just «over a threshold
for some time»...
• Condenser or Evaporator with difficulties
starting
• Distinguish from Opening a door (that is
also an anomaly)
• Or also counting the number of times that
there are peaks (too many times)
• You can consider each of these
events as anomalies that alter the
temperature you measure in different
parts of the fridge
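The two threshold-style checks mentioned above (staying over a threshold for some time, and counting how often peaks occur) can be sketched as follows. The function names, the 8 °C threshold and the sample readings are hypothetical, and the window is measured in samples rather than wall-clock time.

```python
def sustained_breaches(readings, threshold, min_samples):
    """Return (start, end) index pairs (end exclusive) of runs that stay
    above `threshold` for at least `min_samples` consecutive readings."""
    runs, start = [], None
    for i, value in enumerate(readings):
        if value > threshold:
            if start is None:
                start = i  # a new run begins
        else:
            if start is not None and i - start >= min_samples:
                runs.append((start, i))
            start = None
    # Close a run that is still open at the end of the data.
    if start is not None and len(readings) - start >= min_samples:
        runs.append((start, len(readings)))
    return runs

def count_peaks(readings, threshold):
    """Count upward crossings of `threshold`; each excursion counts once."""
    return sum(1 for prev, cur in zip(readings, readings[1:])
               if prev <= threshold < cur)

# Hypothetical temperatures (°C): two short spikes and one longer excursion.
temps = [4, 4, 9, 4, 4, 9, 9, 9, 9, 4, 9, 4]
print(sustained_breaches(temps, threshold=8, min_samples=3))
print(count_peaks(temps, threshold=8))
```

Neither check can tell a door opening from a struggling condenser, since both only look at the level of the signal, which is why the slides argue that threshold alarms are not enough.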
How can we implement processing?
[Diagram: Devices, Events, Azure IoT Hub-Related Services (Ingest), Process (?), Storage Account]
We explore some of them,
probably the most Microsoft and Azure oriented
But….
I’m not a data scientist!
Or a BI Analyst!
I’m a .NET Developer!
Make me think like a Data
Scientist!
How Data Scientists work
Spark Unifies:
• Batch Processing
• Interactive SQL
• Real-time processing
• Machine Learning
• Deep Learning
• Graph Processing
A unified, open source, parallel data processing framework for Big Data Analytics
[Diagram components: Spark Core Engine; Spark SQL (batch processing); Spark Structured Streaming (stream processing); Spark MLlib (machine learning); GraphX (graph computation); Yarn]
http://spark.apache.org
Apache Spark
Batch vs. Notebooks
• Batch
• Work on slow data stored in a
data lake
• Submit a complete app in one
single deploy
• Receive the entire output
• Notebook
• «sketching» the code
• Write/delete/rewrite continuously
• Run cell by cell (but also all at
once), interactively
• In a world of Mathematica
Jupyter
• Evolution and generalization of the seminal role of Mathematica
• In a web-standards way
• Web (HTTP + Markdown)
• Python adoption (ipynb)
• Server written in Python, with a JavaScript front end
• Languages plug in as kernels: separate processes that talk to Jupyter over a messaging protocol
• Python is one kernel for Jupyter
Python!
• Simple to start (that's why C# is pythonizing…)
• “Open Source”
• TensorFlow, Scikit-learn, Keras, Pandas, PyTorch
• Remember one thing:
• Often behind a Data Science framework there is a native library and Python
binds that library
• Spark runs on the JVM (it is written mostly in Scala) and there is a bridge for Python to Spark (PySpark)
• Jupyter talks to each language through a kernel, and Python is one such kernel
The Data Scientist toolbox
Helping non-data-scientist developers (all!)
• Unsupervised Machine Learning: no labelling
• Auto(mated) ML: finds the best tuning for you across parameters and algorithms
• Automated Training Set for Anomaly Detection Algorithms
• the algorithm automatically generates a simulated training set based on your input
data: https://docs.microsoft.com/en-us/azure/machine-learning/algorithm-cheat-sheet
Spectral Residual CNN (SrCnn)
• to monitor the time-series continuously and alert for potential incidents on time
• The algorithm first computes the Fourier Transform of the original data. Then it
computes the spectral residual of the log amplitude of the transformed signal
before applying the Inverse Fourier Transform to map the sequence back from
the frequency to the time domain. This sequence is called the saliency map. The
anomaly score is then computed as the relative difference between the saliency
map values and their moving averages. If the score is above a threshold, the value
at a specific timestep is flagged as an outlier.
• The SR algorithm has several parameters. To obtain a model with good
performance, tune windowSize and threshold first, since these are the
most important parameters for SR. Then search for an appropriate
judgementWindowSize, which must be no larger than windowSize. For the
remaining parameters, the default values can be used directly.
• Time-Series Anomaly Detection Service at Microsoft
[https://arxiv.org/pdf/1906.03821.pdf]
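The Spectral Residual steps described above can be sketched in a few lines. This is an illustrative, simplified version and not the ML.NET or Azure implementation: it processes the whole series at once instead of a sliding window, uses a naive O(n²) Fourier transform so that it needs only the standard library, and leaves out the CNN stage. The names `spectrum_window` and `score_window`, the threshold of 3.0 and the sample data are assumptions made for this example.

```python
import cmath
import math

def dft(values, inverse=False):
    """Naive O(n^2) discrete Fourier transform; fine for short series."""
    n = len(values)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        total = sum(v * cmath.exp(sign * 2j * math.pi * k * t / n)
                    for t, v in enumerate(values))
        out.append(total / n if inverse else total)
    return out

def trailing_average(values, window):
    """Mean of up to `window` values ending at each index."""
    out = []
    for i in range(len(values)):
        chunk = values[max(0, i - window + 1): i + 1]
        out.append(sum(chunk) / len(chunk))
    return out

def saliency_map(series, spectrum_window=3):
    """Fourier transform -> spectral residual of log amplitude -> inverse."""
    spectrum = dft(series)
    log_amp = [math.log(abs(c) + 1e-8) for c in spectrum]
    # Spectral residual: log amplitude minus its local average.
    smoothed = trailing_average(log_amp, spectrum_window)
    residual = [amp - avg for amp, avg in zip(log_amp, smoothed)]
    # Keep the original phase, swap in the residual amplitude, go back to time.
    rebuilt = [cmath.rect(math.exp(r), cmath.phase(c))
               for r, c in zip(residual, spectrum)]
    return [abs(c) for c in dft(rebuilt, inverse=True)]

def detect_anomalies(series, threshold=3.0, spectrum_window=3, score_window=21):
    """Flag indices whose saliency exceeds its moving average by `threshold`
    in relative terms."""
    saliency = saliency_map(series, spectrum_window)
    baseline = trailing_average(saliency, score_window)
    scores = [(s - b) / (b + 1e-8) for s, b in zip(saliency, baseline)]
    return [i for i, score in enumerate(scores) if score > threshold]

# Hypothetical data: a flat 4 °C signal with a single spike at index 40.
temps = [4.0] * 64
temps[40] = 9.0
print(detect_anomalies(temps))
```

The score is relative to the local saliency level rather than to the raw temperature, so the same threshold can be reused across sensors with different operating ranges.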
The .NET toolbox
Data Science and AI for the .NET developer
• ML.NET is first and foremost a framework that you can use
to create your own custom ML models. This custom
approach contrasts with “pre-built AI,” where you use pre-
designed general AI services from the cloud (like many of
the offerings from Azure Cognitive Services). This can work
great for many scenarios, but it might not always fit your
specific business needs due to the nature of the machine
learning problem or to the deployment context (cloud vs.
on-premises).
• ML.NET enables developers to use their existing .NET skills
to easily integrate machine learning into almost any .NET
application. This means that if C# (or F# or VB) is your
programming language of choice, you no longer have to
learn a new programming language, like Python or R, in
order to develop your own ML models and infuse custom
machine learning into your .NET apps.
ML.NET Components
Anomaly Detection
.NET Interactive and Jupyter
and Visual Studio Code
• .NET Interactive gives C# and F# kernels to Jupyter
• .NET Interactive gives all tools to create your hosting application
independently from Jupyter
• In Visual Studio Code, you have two different notebooks (looking similar but
developed in parallel by different teams)
• .NET Interactive Notebook (by the .NET Interactive Team) that can run also Python
• Jupyter Notebook (by the Azure Data Studio Team – probably) that can run also C# and
F#
• There is a little confusion on that
• .NET Interactive has a strong C#/F# Kernel...
• ...a less mature infrastructure (compared to Jupyter)
.NET for Apache Spark 1.1.1
• .NET bindings (C# and F#) to Spark
• Written on the Spark interop layer,
designed to provide high
performance bindings to multiple
languages
• Re-use knowledge, skills, code you
have as a .NET developer
• Compliant with .NET Standard
• You can use .NET for Apache Spark
anywhere you write .NET code
• Original project: Mobius
• https://github.com/microsoft/Mobius
Experimenting ML.NET
and .NET Interactive
The Azure toolbox
Functions everywhere
[Diagram layers: Platform, App delivery, OS, On-premises (App Service on Azure Stack, Windows), Non-Azure hosts, Code. Building blocks: Azure Functions host runtime, Azure Functions Core Tools, Azure Functions base Docker image, Azure Functions .NET Docker image, Azure Functions Node Docker image]
Azure Cognitive Services
• Cognitive Services brings AI within reach of every developer—without requiring
machine-learning expertise. All it takes is an API call to embed the ability to see,
hear, speak, search, understand, and accelerate decision-making into your apps.
Enable developers of all skill levels to easily add AI capabilities to their apps.
• Five areas:
• Decision
• Language
• Speech
• Vision
• Web search
Anomaly Detector
Identify potential problems early on.
Content Moderator
Detect potentially offensive or unwanted
content.
Metrics Advisor PREVIEW
Monitor metrics and diagnose issues.
Personalizer
Create rich, personalized experiences for every
user.
Azure Synapse Analytics for the Big Data
Limitless analytics service with unmatched time to insight
[Diagram: Azure Synapse Analytics. Platform: Azure Data Lake Storage, Common Data Model, Enterprise Security, Optimized for Analytics. Shared services: Metastore, Security, Management, Monitoring, Data Integration. Analytics runtimes: SQL. Form factors: Dedicated, Serverless. Languages: Python, .NET, Java, Scala. Experience: Synapse Analytics Studio. Workloads: Artificial Intelligence, Machine Learning, Internet of Things, Intelligent Apps, Business Intelligence]
What is ADX?
• A Telemetry data Search engine => ELK replacement
• A TSDB => OSS LAMBDA (MinIO + Kafka) replacement
• A Tool to Materialize data into ADLS & SQL
• A Tool for monitoring, summarizing information and sending notifications
Conclusions?
First conclusions
• The .NET ecosystem in the Data Science world is becoming complete
• C# has been pythonizing since C# 7.x
• .NET for Spark runs in Synapse and Databricks
• .NET Interactive notebooks in Visual Studio Code, Synapse, Cosmos...
• Azure has a lot of support for Data Science in .NET and adopts
everything described
See you for the second part, with the complete
journey and more demos
Part 2: Sept 23rd, 7.20 AM EDT
Time Series Anomaly Detection
with Azure and .NET (part 1)
Thank you!
Marco Parenzan
Senior Solution Architect @ beanTech
Microsoft Azure MVP
1nn0va Community Lead
• https://docs.microsoft.com/en-us/azure/cognitive-services/anomaly-detector/
• https://docs.microsoft.com/en-us/dotnet/machine-learning/tutorials/sales-anomaly-detection
• https://github.com/dotnet/interactive
• https://docs.microsoft.com/en-us/dotnet/machine-learning/how-to-guides/serve-model-serverless-azure-functions-ml-net
• https://azure.microsoft.com/en-us/services/cognitive-services/metrics-advisor/
Unlocking the Power of AI in Product Management - A Comprehensive Guide for P...Unlocking the Power of AI in Product Management - A Comprehensive Guide for P...
Unlocking the Power of AI in Product Management - A Comprehensive Guide for P...
NimaTorabi216 vistas
2023-November-Schneider Electric-Meetup-BCN Admin Group.pptx por animuscrm
2023-November-Schneider Electric-Meetup-BCN Admin Group.pptx2023-November-Schneider Electric-Meetup-BCN Admin Group.pptx
2023-November-Schneider Electric-Meetup-BCN Admin Group.pptx
animuscrm15 vistas
Bootstrapping vs Venture Capital.pptx por Zeljko Svedic
Bootstrapping vs Venture Capital.pptxBootstrapping vs Venture Capital.pptx
Bootstrapping vs Venture Capital.pptx
Zeljko Svedic15 vistas

Time Series Anomaly Detection with Azure and .NET

  • 1. Time Series Anomaly Detection with Azure and .NET (part 1) Marco Parenzan // @marco_parenzan
  • 2. Marco Parenzan • Senior Solution Architect @ beanTech • 1nn0va Community Lead (Pordenone) • Microsoft Azure MVP • Profiles • LinkedIn: https://www.linkedin.com/in/marcoparenzan/ • SlideShare: https://www.slideshare.net/marco.parenzan • GitHub: https://github.com/marcoparenzan
  • 3. This is the journey of… • …a .NET developer… • …or an IoT developer… • …a one-man band (sometimes)… • …facing typical data science world topics… • …who wants to use .NET everywhere!
  • 5. Scenario • In an industrial fridge, you monitor temperatures not to check the temperature «per se», but to check the health of the plant • From real industrial fridges
  • 7. With no specific request... what is IoT all about? • Efficiency • Anomalies • Batch • Streaming
  • 8. Time Series • Definition • A time series is a sequence of data points recorded in time order, often taken at successive, equally spaced points in time. • Examples • Stock prices, sales demand, website traffic, daily temperatures, quarterly sales • Time series analysis differs from regression analysis because of its time-dependent nature. • Auto-correlation: Regression analysis requires that there is little or no autocorrelation in the data. Autocorrelation occurs when the observations are not independent of each other. For example, in stock prices, the current price is not independent of the previous price. [The observations have to be dependent on time] • Seasonality, a characteristic which we will discuss below.
  • 9. Anomaly Detection in Time Series • In time series data, an anomaly or outlier is a data point that does not follow the common collective trend or the seasonal or cyclic pattern of the entire data and is significantly distinct from the rest of the data. By significant, most data scientists mean statistically significant; in other words, the statistical properties of the data point are not in alignment with the rest of the series. • Anomaly detection has two basic assumptions: • Anomalies occur only very rarely in the data. • Their features differ significantly from the normal instances.
  • 10. Threshold anomalies? • Threshold anomalies for a time window • Slowly changing damage • The fridge is no longer efficient • Threshold alarms are not enough • Anomalies cannot be just «over a threshold for some time»... • Condenser or evaporator having difficulty starting • Distinguish these from opening a door (which is also an anomaly) • Or also count the number of peaks (too many of them) • You can consider each of these events as an anomaly that alters the temperature you measure in different parts of the fridge
  • 11. How can we implement processing? • [Diagram labels: Ingest, Process, Storage Account, Azure IoT Hub, Related Services, Devices, Events, ?] • We explore some of the options, probably the most Microsoft- and Azure-oriented ones
  • 13. I’m not a data scientist! Or a BI Analyst!
  • 14. I’m a .NET Developer!
  • 15. Make me think like a Data Scientist!
  • 17. Apache Spark • A unified, open source, parallel data processing framework for Big Data analytics • Spark unifies: • Batch processing • Interactive SQL • Real-time processing • Machine learning • Deep learning • Graph processing • [Diagram: Spark Core Engine; Spark SQL (batch processing); Spark Structured Streaming and Spark Streaming (stream processing); Spark MLlib (machine learning); GraphX (graph computation); YARN] • http://spark.apache.org
  • 18. Batch vs. Notebooks • Batch • Work on slow data stored in a data lake • Submit a complete app in one single deploy • Receive the entire output • Notebook • «Sketching» the code • Write/delete/rewrite continuously • Run cell by cell (but also all at once), interactively • In a world of Mathematica
  • 19. Jupyter • Evolution and generalization of the seminal role of Mathematica • In a web-standards way • Web (HTTP + Markdown) • Python adoption (ipynb) • Written mainly in Python, with a language-agnostic kernel protocol • Python is one kernel for Jupyter; other languages plug in through their own kernels
  • 20. Python! • Simple to start (that's why C# is «pythonizing»…) • “Open Source” • TensorFlow, Scikit-learn, Keras, Pandas, PyTorch • Remember one thing: • Often behind a data science framework there is a native library, and Python binds to that library • Spark is written in Scala and runs on the JVM; there is a bridge for Python to Spark (PySpark) • Jupyter talks to each language through a kernel; IPython is the kernel for Python
  • 22. Helping non-data-scientist developers (all!) • Unsupervised Machine Learning: no labelling • Auto(mated) ML: finds the best tuning for you across parameters and algorithms • Automated training set for anomaly detection algorithms • The algorithm automatically generates a simulated training set based on your input data • https://docs.microsoft.com/en-us/azure/machine-learning/algorithm-cheat-sheet
  • 23. Spectral Residual CNN (SR-CNN, «SrCnn» in ML.NET) • Used to monitor the time series continuously and alert on potential incidents in time • The algorithm first computes the Fourier Transform of the original data. Then it computes the spectral residual of the log amplitude of the transformed signal before applying the Inverse Fourier Transform to map the sequence back from the frequency to the time domain. This sequence is called the saliency map. The anomaly score is then computed as the relative difference between the saliency map values and their moving averages. If the score is above a threshold, the value at a specific timestep is flagged as an outlier. • There are several parameters for the SR algorithm. To obtain a model with good performance, tune windowSize and threshold first; these are the most important parameters for SR. Then search for an appropriate judgementWindowSize, which must be no larger than windowSize. For the remaining parameters, you can use the default values directly. • Time-Series Anomaly Detection Service at Microsoft [https://arxiv.org/pdf/1906.03821.pdf]
  • 25. Data Science and AI for the .NET developer • ML.NET is first and foremost a framework that you can use to create your own custom ML models. This custom approach contrasts with “pre-built AI,” where you use pre-designed general AI services from the cloud (like many of the offerings from Azure Cognitive Services). This can work great for many scenarios, but it might not always fit your specific business needs due to the nature of the machine learning problem or to the deployment context (cloud vs. on-premises). • ML.NET enables developers to use their existing .NET skills to easily integrate machine learning into almost any .NET application. This means that if C# (or F# or VB) is your programming language of choice, you no longer have to learn a new programming language, like Python or R, in order to develop your own ML models and infuse custom machine learning into your .NET apps.
  • 27. .NET Interactive and Jupyter and Visual Studio Code • .NET Interactive gives C# and F# kernels to Jupyter • .NET Interactive gives you all the tools to create your own hosting application independently of Jupyter • In Visual Studio Code, you have two different notebooks (looking similar but developed in parallel by different teams) • .NET Interactive Notebook (by the .NET Interactive team), which can also run Python • Jupyter Notebook (probably by the Azure Data Studio team), which can also run C# and F# • There is a little confusion about that • .NET Interactive has a strong C#/F# kernel... • ...and a less mature infrastructure (compared to Jupyter)
  • 28. .NET for Apache Spark 1.1.1 • .NET bindings (C# and F#) to Spark • Written on the Spark interop layer, designed to provide high-performance bindings to multiple languages • Re-use the knowledge, skills and code you have as a .NET developer • Compliant with .NET Standard • You can use .NET for Apache Spark anywhere you write .NET code • Original project: Mobius • https://github.com/microsoft/Mobius
  • 31. Functions everywhere • [Diagram: Azure Functions hosting options by Platform, App delivery and OS: on-premises, App Service on Azure Stack, Windows, non-Azure hosts; building blocks: Code, Azure Functions host runtime, Azure Functions Core Tools, Azure Functions base Docker image, Azure Functions .NET Docker image, Azure Functions Node Docker image]
  • 32. Azure Cognitive Services • Cognitive Services brings AI within reach of every developer, without requiring machine-learning expertise. All it takes is an API call to embed the ability to see, hear, speak, search, understand, and accelerate decision-making into your apps. Enable developers of all skill levels to easily add AI capabilities to their apps. • Five areas: • Decision • Language • Speech • Vision • Web search • Decision services: • Anomaly Detector: identify potential problems early on. • Content Moderator: detect potentially offensive or unwanted content. • Metrics Advisor (preview): monitor metrics and diagnose issues. • Personalizer: create rich, personalized experiences for every user.
  • 33. Azure Synapse Analytics for Big Data • Limitless analytics service with unmatched time to insight • [Diagram: Experience: Synapse Analytics Studio; Languages: SQL, Python, .NET, Java, Scala; Form factors: dedicated, serverless; Analytics runtimes; Data integration; Metastore, security, management, monitoring; Platform: Azure Data Lake Storage, Common Data Model, enterprise security, optimized for analytics; Workloads: Artificial Intelligence / Machine Learning / Internet of Things / Intelligent Apps / Business Intelligence]
  • 34. What is ADX (Azure Data Explorer)? • A telemetry data search engine => ELK replacement • A TSDB => OSS LAMBDA (MinIO + Kafka) replacement • A tool to materialize data into ADLS & SQL • A tool for monitoring, summarizing information and sending notifications
  • 36. First conclusions • The .NET ecosystem in the Data Science world is becoming complete • C# has been «pythonizing» since C# 7.x • .NET for Spark runs in Synapse and Databricks • .NET Interactive notebooks in Visual Studio Code, Synapse, Cosmos... • Azure has a lot of support for Data Science in .NET and adopts everything described
  • 37. See you for the second part, with the complete journey and more demos • Part 2: Sept 23rd, 7.20 AM EDT
  • 38. Thank you! Marco Parenzan Senior Solution Architect @ beanTech Microsoft Azure MVP 1nn0va Community Lead • https://docs.microsoft.com/en-us/azure/cognitive-services/anomaly-detector/ • https://docs.microsoft.com/en-us/dotnet/machine-learning/tutorials/sales-anomaly-detection • https://github.com/dotnet/interactive • https://docs.microsoft.com/en-us/dotnet/machine-learning/how-to-guides/serve-model-serverless-azure-functions-ml-net • https://azure.microsoft.com/en-us/services/cognitive-services/metrics-advisor/
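
Slide 10 separates two simple checks that a threshold alarm can run on fridge temperatures: staying over a threshold for some time, and counting how many peaks occur. The Python sketch below illustrates both under stated assumptions. The function names, the sample readings and the 8.0 °C threshold are hypothetical and do not come from the talk. As the slide itself notes, such rules are not enough on their own, because they cannot tell a struggling condenser from an open door.

```python
from typing import List, Sequence, Tuple


def sustained_over_threshold(
    values: Sequence[float], threshold: float, min_run: int
) -> List[Tuple[int, int]]:
    """Return (start, end) index pairs, end exclusive, for every run of at
    least `min_run` consecutive samples strictly above `threshold`."""
    runs: List[Tuple[int, int]] = []
    start = None  # index where the current run began, or None if not in a run
    for i, value in enumerate(values):
        if value > threshold:
            if start is None:
                start = i
        else:
            if start is not None and i - start >= min_run:
                runs.append((start, i))
            start = None
    # Close a run that is still open when the data ends.
    if start is not None and len(values) - start >= min_run:
        runs.append((start, len(values)))
    return runs


def count_peaks(values: Sequence[float], threshold: float) -> int:
    """Count upward crossings of `threshold` (each excursion counts once)."""
    count = 0
    above = False
    for value in values:
        if value > threshold and not above:
            count += 1
        above = value > threshold
    return count


# Hypothetical fridge temperatures in degrees Celsius, one sample per minute.
temps = [4.0, 4.1, 9.0, 4.2, 4.0, 8.5, 8.7, 8.9, 9.1, 4.3, 9.2, 4.1]

print(sustained_over_threshold(temps, threshold=8.0, min_run=3))  # [(5, 9)]
print(count_peaks(temps, threshold=8.0))  # 3
```

The short spikes at indices 2 and 10 do not trigger the sustained rule, yet they still add to the peak count, which is the «too many peaks» case the slide mentions.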

Editor's notes

  1. https://towardsdatascience.com/time-series-analysis-for-beginners-8a200552e332
  2. Anomaly detection is the process of identifying unexpected items or events in data sets, which differ from the norm. And anomaly detection is often applied on unlabeled data which is known as unsupervised anomaly detection. https://towardsdatascience.com/effective-approaches-for-time-series-anomaly-detection-9485b40077f1
  3. The Spectral Residual outlier detector is based on the paper Time-Series Anomaly Detection Service at Microsoft and is suitable for unsupervised online anomaly detection in univariate time series data. The algorithm first computes the Fourier Transform of the original data. Then it computes the spectral residual of the log amplitude of the transformed signal before applying the Inverse Fourier Transform to map the sequence back from the frequency to the time domain. This sequence is called the saliency map. The anomaly score is then computed as the relative difference between the saliency map values and their moving averages. If the score is above a threshold, the value at a specific timestep is flagged as an outlier. For more details, please check out the paper.
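
The Spectral Residual steps described in note 3 can be sketched in plain Python as follows. This is a minimal illustration, not the ML.NET or Azure implementation. It assumes a naive O(n²) DFT so that no external library is needed, a trailing moving average for both the spectral filter and the score, and hypothetical parameter names (`filter_window`, `score_window`, `threshold`) that do not correspond to ML.NET's `windowSize` or `judgementWindowSize`.

```python
import cmath
import math
from typing import List, Sequence

EPS = 1e-8  # guards against log(0) and division by zero


def dft(x: Sequence[complex], inverse: bool = False) -> List[complex]:
    """Naive discrete Fourier transform (the inverse is scaled by 1/n)."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        s = sum(x[t] * cmath.exp(sign * 2j * math.pi * k * t / n) for t in range(n))
        out.append(s / n if inverse else s)
    return out


def moving_average(x: Sequence[float], window: int) -> List[float]:
    """Trailing average over the last `window` values (fewer at the start)."""
    out, total = [], 0.0
    for i, v in enumerate(x):
        total += v
        if i >= window:
            total -= x[i - window]
        out.append(total / min(i + 1, window))
    return out


def saliency_map(series: Sequence[float], filter_window: int = 3) -> List[float]:
    """Fourier transform -> spectral residual of log amplitude -> inverse."""
    spectrum = dft(series)
    log_amp = [math.log(abs(c) + EPS) for c in spectrum]
    smoothed = moving_average(log_amp, filter_window)
    residual = [la - sm for la, sm in zip(log_amp, smoothed)]
    # Rebuild the spectrum with the residual amplitude and the original phase.
    rebuilt = [math.exp(r) * (c / (abs(c) + EPS)) for r, c in zip(residual, spectrum)]
    return [abs(v) for v in dft(rebuilt, inverse=True)]


def sr_scores(series: Sequence[float], filter_window: int = 3,
              score_window: int = 5) -> List[float]:
    """Relative difference between the saliency map and its moving average."""
    sal = saliency_map(series, filter_window)
    avg = moving_average(sal, score_window)
    return [(s - a) / (a + EPS) for s, a in zip(sal, avg)]


def detect(series: Sequence[float], threshold: float = 1.0) -> List[int]:
    """Indices whose score exceeds `threshold` (an assumed cut-off)."""
    return [i for i, score in enumerate(sr_scores(series)) if score > threshold]


# Hypothetical data: a steady 4.0 degree fridge with one spike at index 20.
readings = [4.0] * 32
readings[20] = 9.0
print(detect(readings))
```

On a flat series with a single spike, the saliency map peaks at the spike because the residual removes the dominant constant component and keeps the phase information that locates the impulse. Real signals with trend and seasonality need the window sizes and the threshold tuned, as slide 23 points out.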