SlideShare una empresa de Scribd logo
1 de 25
IOT, STREAMING ANALYTICS AND MACHINE LEARNING
Delivering Real-Time Intelligence With Apache NiFi
Paul Kent, VP of Big Data, Platform R&D
Dan Zaratsian, Sr. Solutions Architect
Copyr ight © 2015, SAS Institute Inc. All rights reser ved.
Copyr ight © 2015, SAS Institute Inc. All rights reser ved.
• Drop-and-Drag Interface
• Secure/Encrypted
• Bi-Directional Communication
• Data Provenance
Copyr ight © 2015, SAS Institute Inc. All rights reser ved.
Copyr ight © 2015, SAS Institute Inc. All rights reser ved.
SAS ESP + HDF WHY IS THIS IMPORTANT?
RAPID PROTOTYPING OF MACHINE LEARNING MODELS
ANALYTICS WITHIN AN OPEN FRAMEWORK
Events
(Data In)
Events
(Data Out)
Filtering
Aggregation
Pattern Detection
Computation
Merging / Joins
Functions
Retention Window
Text Analytics
Unsupervised Learning
Predictive Modeling
…more…
Detect Events & Patterns of Interest
KEY CONCEPTSESP MODEL - PROCESS FLOW
SAS EVENT STREAM PROCESSING ENGINE
DATA IN
(Events)
DATA OUT
(Events)
Design of the rule model (called “Continuous Query”)
using components (called “Windows”)
DATA IN
(Events)
DATA IN
(Events)
DATA OUT
(Events)
SOURCE
1
WINDOW
SOURCE
2
WINDOW
SOURCE
3
WINDOW
FILTER
WINDOW
CALCULATIONS
WINDOW
JOIN
WINDOW
JOIN
WINDOW
NOTIFICATION
WINDOW
PREDICTIVE
MODEL
(SCORING)
WINDOW
Copyr ight © 2012, SAS Institute Inc. All rights reser ved.
EVENT STREAM
PROCESSING
ESP STUDIO INTERFACE
Copyr ight © 2015, SAS Institute Inc. All rights reser ved.
DESIGNING ESP
MODELS
DESIGN COMPONENTS
DS2 PROCEDURAL WINDOW
In-Stream Analytics:
1. Build analytical model using EM,VA, etc.
• Decision Tree
• Neural Network
• Regression
• Rule Induction
• And more
2. Use PROC DSTRANS to convert code to DS2
3. Deploy model to procedural window
Only when the existing model is additive in nature
and can process one event at a time.
Copyr ight © 2012, SAS Institute Inc. All rights reser ved.
DESIGNING ESP
MODELS
DESIGN COMPONENTS
PATTERN WINDOW overview
Build complex network of events using temporal conditions
Multiple events in can produce one event out
E1 E2
And
Followed
By
E4 E5
And
Not
E6
E3Or
5 min
1 hour
Followed
By
“Detect when event A is followed by event B and not Event C in a 3min time frame”
Copyr ight © 2012, SAS Institute Inc. All rights reser ved.
DESIGNING ESP
MODELS
DESIGN COMPONENTS
TEXT ANALYTICS WINDOWS
• Process unstructured text fields
• 3 dedicated Text Analytics windows
• Text Context (.liti files)
• Text Category (.mco files)
• Text Sentiment (.sam files)
• An appropriate Text Analytics license is required.
SAS EVENT STREAM
PROCESSING
HTML5 STREAMVIEWER
• HTML5 interface
• Uses HTTP (RESTful) XML server
• 2 Modes:
• Streaming mode: display all events
• Update : events processed with opcode
• Google charts
• Subscribe & Publish
INTEGRATION
SAS EVENT STREAM PROCESSING &
HORTONWORKS DATA FLOW (NIFI)
&
SAS Event Stream Processing Hortonworks Data Flow (Nifi)
Copyr ight © 2015, SAS Institute Inc. All rights reser ved.
SAS®
EVENT STREAM
PROCESSING CONCEPTUAL OVERVIEW
SAS-generated
Insights
Enrichment
Data
Event Actions
SAS In-Memory
SAS
®
Event Stream Processing Model
Continuous
Query
Publish
Subscribe
Streaming Events
Analytic
Models
Business
Rules
Nifi
Copyr ight © 2015, SAS Institute Inc. All rights reser ved.
SAS EVENT STREAM
PROCESSING
• CONNECTORS & ADAPTERS
PUB/SUB API Connect to any system with Java or C
Public, documented and easy to use
Adapters are standalone processes and can be networked
Publish to ESP Source windows – Subscribe to any ESP window
All Connectors & Adapters are built using the Pub/Sub API
•File/Socket
•XML / JSON
•Database (odbc)
•SAS® LASR™
•Hadoop
•SAS® Dataset
OUT OF THE BOX
*Publish only
**Subscribe only
•ESP Project
•RabbitMQ
•Solace
•Tervela
•Google Protobuff
•Twitter*
•SAS® HDAT
•JMS
•IBM WebSphere MQ
•Tibco RendezVous
•Syslog *
•Network Sniffer*
•HTTP RESTful
•OSIsoft PI
•Axeda
•Teradata
•SMTP **
•ESP to ESP
Copyr ight © 2015, SAS Institute Inc. All rights reser ved.
DIRECTION: ADAPTERS & CONNECTORS
•Flume: Integrate ESP with streaming log data
•Kafka: Integration with large scale message processing
•MQTT: Support within IoT and Connected things
•Cassandra (adaptor only): integration with large-scale, distributed data source
•HortonWorks Data Flow (NiFi) Processor: support NiFI streams
•MapR: MapR Streams support
•Boardreader: Blogs, News, Boards, Reviews
•Spryware: Market data through direct exchange feeds
•IOT Gateways and devices
PUB/SUB API
Connect to any system with Java or C
Documented and easy to use
Adapters are standalone processes and can be networked
Publish to ESP Source windows – Subscribe to any ESP window
All Connectors & Adapters are built using the Pub/Sub API
SAS® EVENT STREAM
PROCESSING 4.1
Copyr ight © 2015, SAS Institute Inc. All rights reser ved.
SAS® EVENT STREAM
PROCESSING 4.1
FLEXIBILITY AND INTEGRATION
• Python Pub/sub API: drive ESP using Python
• Leverage Analytic Decisions within ESP
• Decision/Rules/Analytical Model Integration via SAS Micro Analytic Service
• Lightweight, fast service for decision deployment
• Leverage Languages in ESP (In-process Event Stream Handlers)
• DATAstep (native)
• DS2 (current)
• Python
• Future: R language (post-16w48)
Company Confidential - For Internal Use Only
Copyright © 2015, SAS Institute Inc. All rights reserved.
Edge
Analytics
In-Motion
Analytics
At-Rest
Analytics
Connected Systems, Devices
Monitor equipment on
for failures and safety
issues, and take action.
Identify fraudulent
transactions and be
alerted in real-time.
Intelligently integrate customer
information with real-time
streaming data
Strategic Data IntegrationTransactions, Logs, Clickstreams
STREAMING
ANALYTICS
Where are the Opportunities?
• Competitive Pressure (Technology, Sensors, Analytics)
• Risk
• Safety
• Security
• Personalization
Extend the existing analytical footprint!
Capture value otherwise lost through information lag
INTEGRATION
SAS EVENT STREAM PROCESSING &
HORTONWORKS DATA FLOW (NIFI)
&
SAS Event Stream Processing Hortonworks Data Flow (Nifi)
Copyright © 2012, SAS Institute Inc. All rights reserved.
Paul Kent, VP of Big Data, Platform R&D
Dan Zaratsian, Sr. Solutions Architect
Copyright © 2012, SAS Institute Inc. All rights reserved.Copyright © 2012, SAS Institute Inc. All rights reserved.
Demo
Company Confidential - For Internal Use Only
Copyright © 2015, SAS Institute Inc. All rights reserved.
Company Confidential - For Internal Use Only
Copyright © 2015, SAS Institute Inc. All rights reserved.
Company Confidential - For Internal Use Only
Copyright © 2015, SAS Institute Inc. All rights reserved.
Company Confidential - For Internal Use Only
Copyright © 2015, SAS Institute Inc. All rights reserved.

Más contenido relacionado

La actualidad más candente

GDPR-focused partner community showcase for Apache Ranger and Apache Atlas
GDPR-focused partner community showcase for Apache Ranger and Apache AtlasGDPR-focused partner community showcase for Apache Ranger and Apache Atlas
GDPR-focused partner community showcase for Apache Ranger and Apache AtlasDataWorks Summit
 
Successes, Challenges, and Pitfalls Migrating a SAAS business to Hadoop
Successes, Challenges, and Pitfalls Migrating a SAAS business to HadoopSuccesses, Challenges, and Pitfalls Migrating a SAAS business to Hadoop
Successes, Challenges, and Pitfalls Migrating a SAAS business to HadoopDataWorks Summit/Hadoop Summit
 
Big Data Day LA 2016/ Hadoop/ Spark/ Kafka track - Why is my Hadoop cluster s...
Big Data Day LA 2016/ Hadoop/ Spark/ Kafka track - Why is my Hadoop cluster s...Big Data Day LA 2016/ Hadoop/ Spark/ Kafka track - Why is my Hadoop cluster s...
Big Data Day LA 2016/ Hadoop/ Spark/ Kafka track - Why is my Hadoop cluster s...Data Con LA
 
Data Ingest Self Service and Management using Nifi and Kafka
Data Ingest Self Service and Management using Nifi and KafkaData Ingest Self Service and Management using Nifi and Kafka
Data Ingest Self Service and Management using Nifi and KafkaDataWorks Summit
 
Log Analytics Optimization
Log Analytics OptimizationLog Analytics Optimization
Log Analytics OptimizationHortonworks
 
Boost Performance with Scala – Learn From Those Who’ve Done It!
Boost Performance with Scala – Learn From Those Who’ve Done It! Boost Performance with Scala – Learn From Those Who’ve Done It!
Boost Performance with Scala – Learn From Those Who’ve Done It! Cécile Poyet
 
Accelerating query processing with materialized views in Apache Hive
Accelerating query processing with materialized views in Apache HiveAccelerating query processing with materialized views in Apache Hive
Accelerating query processing with materialized views in Apache HiveDataWorks Summit
 
Tapping into the Big Data Reservoir (CON7934)
Tapping into the Big Data Reservoir (CON7934)Tapping into the Big Data Reservoir (CON7934)
Tapping into the Big Data Reservoir (CON7934)Jeffrey T. Pollock
 
Lego-like building blocks of Storm and Spark Streaming Pipelines
Lego-like building blocks of Storm and Spark Streaming PipelinesLego-like building blocks of Storm and Spark Streaming Pipelines
Lego-like building blocks of Storm and Spark Streaming PipelinesDataWorks Summit/Hadoop Summit
 
Enterprise Metadata Integration
Enterprise Metadata IntegrationEnterprise Metadata Integration
Enterprise Metadata IntegrationDr. Mirko Kämpf
 
Interactive Analytics at Scale in Apache Hive Using Druid
Interactive Analytics at Scale in Apache Hive Using DruidInteractive Analytics at Scale in Apache Hive Using Druid
Interactive Analytics at Scale in Apache Hive Using DruidDataWorks Summit
 

La actualidad más candente (20)

GDPR-focused partner community showcase for Apache Ranger and Apache Atlas
GDPR-focused partner community showcase for Apache Ranger and Apache AtlasGDPR-focused partner community showcase for Apache Ranger and Apache Atlas
GDPR-focused partner community showcase for Apache Ranger and Apache Atlas
 
Apache Atlas: Governance for your Data
Apache Atlas: Governance for your DataApache Atlas: Governance for your Data
Apache Atlas: Governance for your Data
 
Successes, Challenges, and Pitfalls Migrating a SAAS business to Hadoop
Successes, Challenges, and Pitfalls Migrating a SAAS business to HadoopSuccesses, Challenges, and Pitfalls Migrating a SAAS business to Hadoop
Successes, Challenges, and Pitfalls Migrating a SAAS business to Hadoop
 
Big Data Day LA 2016/ Hadoop/ Spark/ Kafka track - Why is my Hadoop cluster s...
Big Data Day LA 2016/ Hadoop/ Spark/ Kafka track - Why is my Hadoop cluster s...Big Data Day LA 2016/ Hadoop/ Spark/ Kafka track - Why is my Hadoop cluster s...
Big Data Day LA 2016/ Hadoop/ Spark/ Kafka track - Why is my Hadoop cluster s...
 
Active Learning for Fraud Prevention
Active Learning for Fraud PreventionActive Learning for Fraud Prevention
Active Learning for Fraud Prevention
 
Why is my Hadoop* job slow?
Why is my Hadoop* job slow?Why is my Hadoop* job slow?
Why is my Hadoop* job slow?
 
Data Ingest Self Service and Management using Nifi and Kafka
Data Ingest Self Service and Management using Nifi and KafkaData Ingest Self Service and Management using Nifi and Kafka
Data Ingest Self Service and Management using Nifi and Kafka
 
Apache Hadoop Crash Course
Apache Hadoop Crash CourseApache Hadoop Crash Course
Apache Hadoop Crash Course
 
Log Analytics Optimization
Log Analytics OptimizationLog Analytics Optimization
Log Analytics Optimization
 
Boost Performance with Scala – Learn From Those Who’ve Done It!
Boost Performance with Scala – Learn From Those Who’ve Done It! Boost Performance with Scala – Learn From Those Who’ve Done It!
Boost Performance with Scala – Learn From Those Who’ve Done It!
 
HDP Next: Governance
HDP Next: GovernanceHDP Next: Governance
HDP Next: Governance
 
Accelerating query processing with materialized views in Apache Hive
Accelerating query processing with materialized views in Apache HiveAccelerating query processing with materialized views in Apache Hive
Accelerating query processing with materialized views in Apache Hive
 
Securing Hadoop in an Enterprise Context
Securing Hadoop in an Enterprise ContextSecuring Hadoop in an Enterprise Context
Securing Hadoop in an Enterprise Context
 
Tapping into the Big Data Reservoir (CON7934)
Tapping into the Big Data Reservoir (CON7934)Tapping into the Big Data Reservoir (CON7934)
Tapping into the Big Data Reservoir (CON7934)
 
Analysis of Major Trends in Big Data Analytics
Analysis of Major Trends in Big Data AnalyticsAnalysis of Major Trends in Big Data Analytics
Analysis of Major Trends in Big Data Analytics
 
Apache Hadoop Crash Course - HS16SJ
Apache Hadoop Crash Course - HS16SJApache Hadoop Crash Course - HS16SJ
Apache Hadoop Crash Course - HS16SJ
 
Lego-like building blocks of Storm and Spark Streaming Pipelines
Lego-like building blocks of Storm and Spark Streaming PipelinesLego-like building blocks of Storm and Spark Streaming Pipelines
Lego-like building blocks of Storm and Spark Streaming Pipelines
 
Enterprise Metadata Integration
Enterprise Metadata IntegrationEnterprise Metadata Integration
Enterprise Metadata Integration
 
Interactive Analytics at Scale in Apache Hive Using Druid
Interactive Analytics at Scale in Apache Hive Using DruidInteractive Analytics at Scale in Apache Hive Using Druid
Interactive Analytics at Scale in Apache Hive Using Druid
 
Creating the Internet of Your Things
Creating the Internet of Your ThingsCreating the Internet of Your Things
Creating the Internet of Your Things
 

Similar a IOT, Streaming Analytics and Machine Learning

Machine learning in the physical world by Kip Larson from AWS IoT
Machine learning in the physical world by  Kip Larson from AWS IoTMachine learning in the physical world by  Kip Larson from AWS IoT
Machine learning in the physical world by Kip Larson from AWS IoTBill Liu
 
IMC Summit 2016 Breakout - Matt Coventon - Test Driving Streaming and CEP on ...
IMC Summit 2016 Breakout - Matt Coventon - Test Driving Streaming and CEP on ...IMC Summit 2016 Breakout - Matt Coventon - Test Driving Streaming and CEP on ...
IMC Summit 2016 Breakout - Matt Coventon - Test Driving Streaming and CEP on ...In-Memory Computing Summit
 
Streaming Visualization
Streaming VisualizationStreaming Visualization
Streaming VisualizationGuido Schmutz
 
Alten calsoft labs analytics service offerings
Alten calsoft labs   analytics service offeringsAlten calsoft labs   analytics service offerings
Alten calsoft labs analytics service offeringsSandeep Vyas
 
Analyzing Data Streams in Real Time with Amazon Kinesis: PNNL's Serverless Da...
Analyzing Data Streams in Real Time with Amazon Kinesis: PNNL's Serverless Da...Analyzing Data Streams in Real Time with Amazon Kinesis: PNNL's Serverless Da...
Analyzing Data Streams in Real Time with Amazon Kinesis: PNNL's Serverless Da...Amazon Web Services
 
Approaches to Network Automation
Approaches to Network AutomationApproaches to Network Automation
Approaches to Network AutomationAPNIC
 
Introduction to Streaming Analytics
Introduction to Streaming AnalyticsIntroduction to Streaming Analytics
Introduction to Streaming AnalyticsGuido Schmutz
 
Machine Data 101 Workshop
Machine Data 101 Workshop Machine Data 101 Workshop
Machine Data 101 Workshop Splunk
 
Enterprise Integration Patterns Revisited (again) for the Era of Big Data, In...
Enterprise Integration Patterns Revisited (again) for the Era of Big Data, In...Enterprise Integration Patterns Revisited (again) for the Era of Big Data, In...
Enterprise Integration Patterns Revisited (again) for the Era of Big Data, In...Kai Wähner
 
Apache Spark – The New Enterprise Backbone for ETL, Batch Processing and Real...
Apache Spark – The New Enterprise Backbone for ETL, Batch Processing and Real...Apache Spark – The New Enterprise Backbone for ETL, Batch Processing and Real...
Apache Spark – The New Enterprise Backbone for ETL, Batch Processing and Real...Impetus Technologies
 
Real-Time Analytics with Confluent and MemSQL
Real-Time Analytics with Confluent and MemSQLReal-Time Analytics with Confluent and MemSQL
Real-Time Analytics with Confluent and MemSQLSingleStore
 
SpringOne Tour Denver - Spring Boot & Spring Cloud on Pivotal Application Ser...
SpringOne Tour Denver - Spring Boot & Spring Cloud on Pivotal Application Ser...SpringOne Tour Denver - Spring Boot & Spring Cloud on Pivotal Application Ser...
SpringOne Tour Denver - Spring Boot & Spring Cloud on Pivotal Application Ser...VMware Tanzu
 
Spring Boot & Spring Cloud Apps on Pivotal Application Service - Daniel Lavoie
Spring Boot & Spring Cloud Apps on Pivotal Application Service - Daniel LavoieSpring Boot & Spring Cloud Apps on Pivotal Application Service - Daniel Lavoie
Spring Boot & Spring Cloud Apps on Pivotal Application Service - Daniel LavoieVMware Tanzu
 
Big Data Analytics Platforms by KTH and RISE SICS
Big Data Analytics Platforms by KTH and RISE SICSBig Data Analytics Platforms by KTH and RISE SICS
Big Data Analytics Platforms by KTH and RISE SICSBig Data Value Association
 
Preventative Maintenance of Robots in Automotive Industry
Preventative Maintenance of Robots in Automotive IndustryPreventative Maintenance of Robots in Automotive Industry
Preventative Maintenance of Robots in Automotive IndustryDataWorks Summit/Hadoop Summit
 
MindSphere: The cloud-based, open IoT operating system. Damiano Manocchia
MindSphere: The cloud-based, open IoT operating system. Damiano ManocchiaMindSphere: The cloud-based, open IoT operating system. Damiano Manocchia
MindSphere: The cloud-based, open IoT operating system. Damiano ManocchiaData Driven Innovation
 
Tibco Augmented Intelligence - Analytics, IoT, Big Data, Streaming 20161025
Tibco Augmented Intelligence - Analytics, IoT, Big Data, Streaming 20161025Tibco Augmented Intelligence - Analytics, IoT, Big Data, Streaming 20161025
Tibco Augmented Intelligence - Analytics, IoT, Big Data, Streaming 20161025Nicola Sandoli
 
Stream Processing – Concepts and Frameworks
Stream Processing – Concepts and FrameworksStream Processing – Concepts and Frameworks
Stream Processing – Concepts and FrameworksGuido Schmutz
 

Similar a IOT, Streaming Analytics and Machine Learning (20)

Machine learning in the physical world by Kip Larson from AWS IoT
Machine learning in the physical world by  Kip Larson from AWS IoTMachine learning in the physical world by  Kip Larson from AWS IoT
Machine learning in the physical world by Kip Larson from AWS IoT
 
IoT architecture
IoT architectureIoT architecture
IoT architecture
 
IMC Summit 2016 Breakout - Matt Coventon - Test Driving Streaming and CEP on ...
IMC Summit 2016 Breakout - Matt Coventon - Test Driving Streaming and CEP on ...IMC Summit 2016 Breakout - Matt Coventon - Test Driving Streaming and CEP on ...
IMC Summit 2016 Breakout - Matt Coventon - Test Driving Streaming and CEP on ...
 
Analysing Data in Real-time
Analysing Data in Real-timeAnalysing Data in Real-time
Analysing Data in Real-time
 
Streaming Visualization
Streaming VisualizationStreaming Visualization
Streaming Visualization
 
Alten calsoft labs analytics service offerings
Alten calsoft labs   analytics service offeringsAlten calsoft labs   analytics service offerings
Alten calsoft labs analytics service offerings
 
Analyzing Data Streams in Real Time with Amazon Kinesis: PNNL's Serverless Da...
Analyzing Data Streams in Real Time with Amazon Kinesis: PNNL's Serverless Da...Analyzing Data Streams in Real Time with Amazon Kinesis: PNNL's Serverless Da...
Analyzing Data Streams in Real Time with Amazon Kinesis: PNNL's Serverless Da...
 
Approaches to Network Automation
Approaches to Network AutomationApproaches to Network Automation
Approaches to Network Automation
 
Introduction to Streaming Analytics
Introduction to Streaming AnalyticsIntroduction to Streaming Analytics
Introduction to Streaming Analytics
 
Machine Data 101 Workshop
Machine Data 101 Workshop Machine Data 101 Workshop
Machine Data 101 Workshop
 
Enterprise Integration Patterns Revisited (again) for the Era of Big Data, In...
Enterprise Integration Patterns Revisited (again) for the Era of Big Data, In...Enterprise Integration Patterns Revisited (again) for the Era of Big Data, In...
Enterprise Integration Patterns Revisited (again) for the Era of Big Data, In...
 
Apache Spark – The New Enterprise Backbone for ETL, Batch Processing and Real...
Apache Spark – The New Enterprise Backbone for ETL, Batch Processing and Real...Apache Spark – The New Enterprise Backbone for ETL, Batch Processing and Real...
Apache Spark – The New Enterprise Backbone for ETL, Batch Processing and Real...
 
Real-Time Analytics with Confluent and MemSQL
Real-Time Analytics with Confluent and MemSQLReal-Time Analytics with Confluent and MemSQL
Real-Time Analytics with Confluent and MemSQL
 
SpringOne Tour Denver - Spring Boot & Spring Cloud on Pivotal Application Ser...
SpringOne Tour Denver - Spring Boot & Spring Cloud on Pivotal Application Ser...SpringOne Tour Denver - Spring Boot & Spring Cloud on Pivotal Application Ser...
SpringOne Tour Denver - Spring Boot & Spring Cloud on Pivotal Application Ser...
 
Spring Boot & Spring Cloud Apps on Pivotal Application Service - Daniel Lavoie
Spring Boot & Spring Cloud Apps on Pivotal Application Service - Daniel LavoieSpring Boot & Spring Cloud Apps on Pivotal Application Service - Daniel Lavoie
Spring Boot & Spring Cloud Apps on Pivotal Application Service - Daniel Lavoie
 
Big Data Analytics Platforms by KTH and RISE SICS
Big Data Analytics Platforms by KTH and RISE SICSBig Data Analytics Platforms by KTH and RISE SICS
Big Data Analytics Platforms by KTH and RISE SICS
 
Preventative Maintenance of Robots in Automotive Industry
Preventative Maintenance of Robots in Automotive IndustryPreventative Maintenance of Robots in Automotive Industry
Preventative Maintenance of Robots in Automotive Industry
 
MindSphere: The cloud-based, open IoT operating system. Damiano Manocchia
MindSphere: The cloud-based, open IoT operating system. Damiano ManocchiaMindSphere: The cloud-based, open IoT operating system. Damiano Manocchia
MindSphere: The cloud-based, open IoT operating system. Damiano Manocchia
 
Tibco Augmented Intelligence - Analytics, IoT, Big Data, Streaming 20161025
Tibco Augmented Intelligence - Analytics, IoT, Big Data, Streaming 20161025Tibco Augmented Intelligence - Analytics, IoT, Big Data, Streaming 20161025
Tibco Augmented Intelligence - Analytics, IoT, Big Data, Streaming 20161025
 
Stream Processing – Concepts and Frameworks
Stream Processing – Concepts and FrameworksStream Processing – Concepts and Frameworks
Stream Processing – Concepts and Frameworks
 

Más de DataWorks Summit/Hadoop Summit

Unleashing the Power of Apache Atlas with Apache Ranger
Unleashing the Power of Apache Atlas with Apache RangerUnleashing the Power of Apache Atlas with Apache Ranger
Unleashing the Power of Apache Atlas with Apache RangerDataWorks Summit/Hadoop Summit
 
Enabling Digital Diagnostics with a Data Science Platform
Enabling Digital Diagnostics with a Data Science PlatformEnabling Digital Diagnostics with a Data Science Platform
Enabling Digital Diagnostics with a Data Science PlatformDataWorks Summit/Hadoop Summit
 
Double Your Hadoop Performance with Hortonworks SmartSense
Double Your Hadoop Performance with Hortonworks SmartSenseDouble Your Hadoop Performance with Hortonworks SmartSense
Double Your Hadoop Performance with Hortonworks SmartSenseDataWorks Summit/Hadoop Summit
 
Building a Large-Scale, Adaptive Recommendation Engine with Apache Flink and ...
Building a Large-Scale, Adaptive Recommendation Engine with Apache Flink and ...Building a Large-Scale, Adaptive Recommendation Engine with Apache Flink and ...
Building a Large-Scale, Adaptive Recommendation Engine with Apache Flink and ...DataWorks Summit/Hadoop Summit
 
Real-Time Anomaly Detection using LSTM Auto-Encoders with Deep Learning4J on ...
Real-Time Anomaly Detection using LSTM Auto-Encoders with Deep Learning4J on ...Real-Time Anomaly Detection using LSTM Auto-Encoders with Deep Learning4J on ...
Real-Time Anomaly Detection using LSTM Auto-Encoders with Deep Learning4J on ...DataWorks Summit/Hadoop Summit
 
Mool - Automated Log Analysis using Data Science and ML
Mool - Automated Log Analysis using Data Science and MLMool - Automated Log Analysis using Data Science and ML
Mool - Automated Log Analysis using Data Science and MLDataWorks Summit/Hadoop Summit
 
The Challenge of Driving Business Value from the Analytics of Things (AOT)
The Challenge of Driving Business Value from the Analytics of Things (AOT)The Challenge of Driving Business Value from the Analytics of Things (AOT)
The Challenge of Driving Business Value from the Analytics of Things (AOT)DataWorks Summit/Hadoop Summit
 
From Regulatory Process Verification to Predictive Maintenance and Beyond wit...
From Regulatory Process Verification to Predictive Maintenance and Beyond wit...From Regulatory Process Verification to Predictive Maintenance and Beyond wit...
From Regulatory Process Verification to Predictive Maintenance and Beyond wit...DataWorks Summit/Hadoop Summit
 

Más de DataWorks Summit/Hadoop Summit (20)

Running Apache Spark & Apache Zeppelin in Production
Running Apache Spark & Apache Zeppelin in ProductionRunning Apache Spark & Apache Zeppelin in Production
Running Apache Spark & Apache Zeppelin in Production
 
State of Security: Apache Spark & Apache Zeppelin
State of Security: Apache Spark & Apache ZeppelinState of Security: Apache Spark & Apache Zeppelin
State of Security: Apache Spark & Apache Zeppelin
 
Unleashing the Power of Apache Atlas with Apache Ranger
Unleashing the Power of Apache Atlas with Apache RangerUnleashing the Power of Apache Atlas with Apache Ranger
Unleashing the Power of Apache Atlas with Apache Ranger
 
Enabling Digital Diagnostics with a Data Science Platform
Enabling Digital Diagnostics with a Data Science PlatformEnabling Digital Diagnostics with a Data Science Platform
Enabling Digital Diagnostics with a Data Science Platform
 
Revolutionize Text Mining with Spark and Zeppelin
Revolutionize Text Mining with Spark and ZeppelinRevolutionize Text Mining with Spark and Zeppelin
Revolutionize Text Mining with Spark and Zeppelin
 
Double Your Hadoop Performance with Hortonworks SmartSense
Double Your Hadoop Performance with Hortonworks SmartSenseDouble Your Hadoop Performance with Hortonworks SmartSense
Double Your Hadoop Performance with Hortonworks SmartSense
 
Hadoop Crash Course
Hadoop Crash CourseHadoop Crash Course
Hadoop Crash Course
 
Data Science Crash Course
Data Science Crash CourseData Science Crash Course
Data Science Crash Course
 
Apache Spark Crash Course
Apache Spark Crash CourseApache Spark Crash Course
Apache Spark Crash Course
 
Dataflow with Apache NiFi
Dataflow with Apache NiFiDataflow with Apache NiFi
Dataflow with Apache NiFi
 
Schema Registry - Set you Data Free
Schema Registry - Set you Data FreeSchema Registry - Set you Data Free
Schema Registry - Set you Data Free
 
Building a Large-Scale, Adaptive Recommendation Engine with Apache Flink and ...
Building a Large-Scale, Adaptive Recommendation Engine with Apache Flink and ...Building a Large-Scale, Adaptive Recommendation Engine with Apache Flink and ...
Building a Large-Scale, Adaptive Recommendation Engine with Apache Flink and ...
 
Real-Time Anomaly Detection using LSTM Auto-Encoders with Deep Learning4J on ...
Real-Time Anomaly Detection using LSTM Auto-Encoders with Deep Learning4J on ...Real-Time Anomaly Detection using LSTM Auto-Encoders with Deep Learning4J on ...
Real-Time Anomaly Detection using LSTM Auto-Encoders with Deep Learning4J on ...
 
Mool - Automated Log Analysis using Data Science and ML
Mool - Automated Log Analysis using Data Science and MLMool - Automated Log Analysis using Data Science and ML
Mool - Automated Log Analysis using Data Science and ML
 
How Hadoop Makes the Natixis Pack More Efficient
How Hadoop Makes the Natixis Pack More Efficient How Hadoop Makes the Natixis Pack More Efficient
How Hadoop Makes the Natixis Pack More Efficient
 
HBase in Practice
HBase in Practice HBase in Practice
HBase in Practice
 
The Challenge of Driving Business Value from the Analytics of Things (AOT)
The Challenge of Driving Business Value from the Analytics of Things (AOT)The Challenge of Driving Business Value from the Analytics of Things (AOT)
The Challenge of Driving Business Value from the Analytics of Things (AOT)
 
Breaking the 1 Million OPS/SEC Barrier in HOPS Hadoop
Breaking the 1 Million OPS/SEC Barrier in HOPS HadoopBreaking the 1 Million OPS/SEC Barrier in HOPS Hadoop
Breaking the 1 Million OPS/SEC Barrier in HOPS Hadoop
 
From Regulatory Process Verification to Predictive Maintenance and Beyond wit...
From Regulatory Process Verification to Predictive Maintenance and Beyond wit...From Regulatory Process Verification to Predictive Maintenance and Beyond wit...
From Regulatory Process Verification to Predictive Maintenance and Beyond wit...
 
Backup and Disaster Recovery in Hadoop
Backup and Disaster Recovery in Hadoop Backup and Disaster Recovery in Hadoop
Backup and Disaster Recovery in Hadoop
 

Último

Artificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyArtificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyKhushali Kathiriya
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoffsammart93
 
Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businesspanagenda
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...DianaGray10
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...apidays
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc
 
Deploy with confidence: VMware Cloud Foundation 5.1 on next gen Dell PowerEdg...
Deploy with confidence: VMware Cloud Foundation 5.1 on next gen Dell PowerEdg...Deploy with confidence: VMware Cloud Foundation 5.1 on next gen Dell PowerEdg...
Deploy with confidence: VMware Cloud Foundation 5.1 on next gen Dell PowerEdg...Principled Technologies
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024The Digital Insurer
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)wesley chun
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUK Journal
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FMESafe Software
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024The Digital Insurer
 
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...apidays
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century educationjfdjdjcjdnsjd
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?Igalia
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfsudhanshuwaghmare1
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Scriptwesley chun
 

Último (20)

Artificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyArtificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : Uncertainty
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
 
Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire business
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
Deploy with confidence: VMware Cloud Foundation 5.1 on next gen Dell PowerEdg...
Deploy with confidence: VMware Cloud Foundation 5.1 on next gen Dell PowerEdg...Deploy with confidence: VMware Cloud Foundation 5.1 on next gen Dell PowerEdg...
Deploy with confidence: VMware Cloud Foundation 5.1 on next gen Dell PowerEdg...
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024
 
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century education
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 

IOT, Streaming Analytics and Machine Learning

  • 1. IOT, STREAMING ANALYTICS AND MACHINE LEARNING Delivering Real-Time Intelligence With Apache NiFi Paul Kent, VP of Big Data, Platform R&D Dan Zaratsian, Sr. Solutions Architect
  • 2. Copyr ight © 2015, SAS Institute Inc. All rights reser ved.
  • 3. Copyr ight © 2015, SAS Institute Inc. All rights reser ved. • Drop-and-Drag Interface • Secure/Encrypted • Bi-Directional Communication • Data Provenance
  • 4. Copyr ight © 2015, SAS Institute Inc. All rights reser ved.
  • 5. Copyr ight © 2015, SAS Institute Inc. All rights reser ved. SAS ESP + HDF WHY IS THIS IMPORTANT? RAPID PROTOTYPING OF MACHINE LEARNING MODELS ANALYTICS WITHIN AN OPEN FRAMEWORK
  • 6. Events (Data In) Events (Data Out) Filtering Aggregation Pattern Detection Computation Merging / Joins Functions Retention Window Text Analytics Unsupervised Learning Predictive Modeling …more… Detect Events & Patterns of Interest
  • 7. KEY CONCEPTSESP MODEL - PROCESS FLOW SAS EVENT STREAM PROCESSING ENGINE DATA IN (Events) DATA OUT (Events) Design of the rule model (called “Continuous Query”) using components (called “Windows”) DATA IN (Events) DATA IN (Events) DATA OUT (Events) SOURCE 1 WINDOW SOURCE 2 WINDOW SOURCE 3 WINDOW FILTER WINDOW CALCULATIONS WINDOW JOIN WINDOW JOIN WINDOW NOTIFICATION WINDOW PREDICTIVE MODEL (SCORING) WINDOW
  • 8. Copyr ight © 2012, SAS Institute Inc. All rights reser ved. EVENT STREAM PROCESSING ESP STUDIO INTERFACE
  • 9. Copyr ight © 2015, SAS Institute Inc. All rights reser ved. DESIGNING ESP MODELS DESIGN COMPONENTS DS2 PROCEDURAL WINDOW In-Stream Analytics: 1. Build analytical model using EM,VA, etc. • Decision Tree • Neural Network • Regression • Rule Induction • And more 2. Use PROC DSTRANS to convert code to DS2 3. Deploy model to procedural window Only when the existing model is additive in nature and can process one event at a time.
  • 10. Copyr ight © 2012, SAS Institute Inc. All rights reser ved. DESIGNING ESP MODELS DESIGN COMPONENTS PATTERN WINDOW overview Build complex network of events using temporal conditions Multiple events in can produce one event out E1 E2 And Followed By E4 E5 And Not E6 E3Or 5 min 1 hour Followed By “Detect when event A is followed by event B and not Event C in a 3min time frame”
  • 11. Copyr ight © 2012, SAS Institute Inc. All rights reser ved. DESIGNING ESP MODELS DESIGN COMPONENTS TEXT ANALYTICS WINDOWS • Process unstructured text fields • 3 dedicated Text Analytics windows • Text Context (.liti files) • Text Category (.mco files) • Text Sentiment (.sam files) • An appropriate Text Analytics license is required.
  • 12. SAS EVENT STREAM PROCESSING HTML5 STREAMVIEWER • HTML5 interface • Uses HTTP (RESTful) XML server • 2 Modes: • Streaming mode: display all events • Update : events processed with opcode • Google charts • Subscribe & Publish
  • 13. INTEGRATION SAS EVENT STREAM PROCESSING & HORTONWORKS DATA FLOW (NIFI) & SAS Event Stream Processing Hortonworks Data Flow (Nifi)
  • 14. Copyr ight © 2015, SAS Institute Inc. All rights reser ved. SAS® EVENT STREAM PROCESSING CONCEPTUAL OVERVIEW SAS-generated Insights Enrichment Data Event Actions SAS In-Memory SAS ® Event Stream Processing Model Continuous Query Publish Subscribe Streaming Events Analytic Models Business Rules Nifi
  • 15. Copyr ight © 2015, SAS Institute Inc. All rights reser ved. SAS EVENT STREAM PROCESSING • CONNECTORS & ADAPTERS PUB/SUB API Connect to any system with Java or C Public, documented and easy to use Adapters are standalone processes and can be networked Publish to ESP Source windows – Subscribe to any ESP window All Connectors & Adapters are built using the Pub/Sub API •File/Socket •XML / JSON •Database (odbc) •SAS® LASR™ •Hadoop •SAS® Dataset OUT OF THE BOX *Publish only **Subscribe only •ESP Project •RabbitMQ •Solace •Tervela •Google Protobuff •Twitter* •SAS® HDAT •JMS •IBM WebSphere MQ •Tibco RendezVous •Syslog * •Network Sniffer* •HTTP RESTful •OSIsoft PI •Axeda •Teradata •SMTP ** •ESP to ESP
  • 16. Copyr ight © 2015, SAS Institute Inc. All rights reser ved. DIRECTION: ADAPTERS & CONNECTORS •Flume: Integrate ESP with streaming log data •Kafka: Integration with large scale message processing •MQTT: Support within IoT and Connected things •Cassandra (adaptor only): integration with large-scale, distributed data source •HortonWorks Data Flow (NiFi) Processor: support NiFI streams •MapR: MapR Streams support •Boardreader: Blogs, News, Boards, Reviews •Spryware: Market data through direct exchange feeds •IOT Gateways and devices PUB/SUB API Connect to any system with Java or C Documented and easy to use Adapters are standalone processes and can be networked Publish to ESP Source windows – Subscribe to any ESP window All Connectors & Adapters are built using the Pub/Sub API SAS® EVENT STREAM PROCESSING 4.1
  • 17. Copyr ight © 2015, SAS Institute Inc. All rights reser ved. SAS® EVENT STREAM PROCESSING 4.1 FLEXIBILITY AND INTEGRATION • Python Pub/sub API: drive ESP using Python • Leverage Analytic Decisions within ESP • Decision/Rules/Analytical Model Integration via SAS Micro Analytic Service • Lightweight, fast service for decision deployment • Leverage Languages in ESP (In-process Event Stream Handlers) • DATAstep (native) • DS2 (current) • Python • Future: R language (post-16w48)
  • 18. Company Confidential - For Internal Use Only Copyright © 2015, SAS Institute Inc. All rights reserved. Edge Analytics In-Motion Analytics At-Rest Analytics Connected Systems, Devices Monitor equipment on for failures and safety issues, and take action. Identify fraudulent transactions and be alerted in real-time. Intelligently integrate customer information with real-time streaming data Strategic Data IntegrationTransactions, Logs, Clickstreams
  • 19. STREAMING ANALYTICS Where are the Opportunities? • Competitive Pressure (Technology, Sensors, Analytics) • Risk • Safety • Security • Personalization Extend the existing analytical footprint! Capture value otherwise lost through information lag
  • 20. INTEGRATION SAS EVENT STREAM PROCESSING & HORTONWORKS DATA FLOW (NIFI) & SAS Event Stream Processing Hortonworks Data Flow (Nifi)
  • 21. Copyright © 2012, SAS Institute Inc. All rights reserved. Paul Kent, VP of Big Data, Platform R&D Dan Zaratsian, Sr. Solutions Architect Copyright © 2012, SAS Institute Inc. All rights reserved.Copyright © 2012, SAS Institute Inc. All rights reserved. Demo
  • 22. Company Confidential - For Internal Use Only Copyright © 2015, SAS Institute Inc. All rights reserved.
  • 23. Company Confidential - For Internal Use Only Copyright © 2015, SAS Institute Inc. All rights reserved.
  • 24. Company Confidential - For Internal Use Only Copyright © 2015, SAS Institute Inc. All rights reserved.
  • 25. Company Confidential - For Internal Use Only Copyright © 2015, SAS Institute Inc. All rights reserved.

Notas del editor

  1. The last window I wanted to highlight today is the Procedural window. As its name state, this window is dedicated to create custom procedures using code. In comparison to all other ESP windows that are dedicated to perform a specific operation, the Procedural window can do nearly anything that you want. You just need to code it. You can code directly in C++ or using the SAS DataStep 2 language. Coding in C++ is quite self explanatory, so I will focus on the DS2 procedural window. As you would easily guess, the goal is to allow to ease the implementation of analytics models into ESP, to be performed in real time on streaming data. We can for example build analytical models using enterprise miner, VA or any other SAS tool, convert the SAS code to DS2 code, and deploy the model to ESP in to the procedural window, usually with just a few tweaks. This is already used at least on 2 customer projects, in US and in Belgium with very good results. We need to aware of a few important points though, mostly due to the streaming nature of ESP: First, not every model has a meaning on streaming data. Some type of analytics models make only sense on batch data. ESP processes data on the move, so these procedural window models will only receive one event at a time. So we cannot use DS2 models that require for example multi-scan of a set of data, or need to lookup information on another external set of data. All required information has to be on incoming event data, or in the DS2 code itself. If a more complex model is needed, it then has to be implemented using other ESP components, like for example a combination of ESP windows. But based on this characteristic I would say that this window is a very good fit for all scoring models. Last important thing to be aware of is that the DS2 code is not natively build to cope with the speed and throughput of SAS ESP. So don’t expect usual ESP performance when using DS2 procedural windows. Having said that, This is a great feature a a strong differentiator, so don’t hesitate to use it when it fits the need. Including as much analytics features as possible into ESP and being able to seamlessly integrate existing SAS models is probably the main strategy for ESP in the next releases, so stay tune for a lot of improvements in this area in the near future.
  2. The Pattern window is probably the one that reflects the most what ESP is used for: This window allows to detect temporal conditions on events like for example “Tell me when an event A was followed by an event B and not event C within 3 minutes”, then when this sequence of conditions is detected, build and generate an event, usually to alert or trigger another application, or just to be processed downstream. This is very powerful to detect specific complex behaviors on real time. As an example, in the operator tree illustrated on the left of the screen, we want to detect when we have event 1 and event 2 or, event 1 and event 3, followed by event 4 in the next 5 minutes but we don’t want any occurrence of event 5 in these 5 minutes, then, followed by event 6 in the next hour. If this sequence of events is matched by the input events, at some moment during the stream, the pattern window will generate an event. This generated event is usually built using values coming from all the event that have been caught by these pattern condition. This is of course just an example and this window allows to detect very complex occurring or non occurring patterns of events using this operator tree paradigm.
  3. On another interesting domain ESP is able to process unstructured fields using Text Analytics dedicated windows. It was already possible in the previous ESP version to find classified terms on events text fields using the Text Context window. The version 3.1 brings 2 additional windows, the Text Category and the Text Sentiment windows in order to further enrich ESP capabilities in this domain. These windows generate new events that can be further analyzed by other window types. For example, a pattern window could follow a text context window to look for tweet patterns of interest. Just as a side note remember that you need to have an appropriate license of SAS Text Analytics to be able to use these windows.
  4. Lets understand this with a simple example.
  5. Let’s cover now how ESP connects to other applications. SAS Event Stream Processing provides a public pub/sub API for developing and creating any custom connector or adapter, hence having the flexibility to connect to any application. It can be done in C language or Java for maximum flexibility. Actually all existing Connectors & Adapters are build using the Pub/Sub API Adapters and connectors are nearly identical, the difference is that adapters are standalone components, external to the ESP process. As a result, Adapters can be networked. SAS ESP provides many Connectors and Adapters out of the box, for files, MQ messaging buses, JMS, databases, XML, SAS LASR or HDAT, Hadoop, Tibco, OSIsoft PI, etc… and many more will come. This release 3.1 of ESP also brings 3 new adapters and connectors: A new REST web service adapter to connect ESP with RTDM or other web services ESP subscribing applications. A new Sniffer connector dedicated to capture packets from network interfaces, and do network analysis. The Twitter Adapter that was on tool pool is now part of the ESP standard installation. Multiple improvements have also been added to the existing adapters.
  6. ANAND: That explains why we also hear from our customers various ways in which they refer to the need of applying analytics at different stages of processing their data and as a result why Streaming Analytics goes by many names. Specially in todays world of Big Data and Internet of Things which is just picking up pace, the same terms might mean different across organizations. You will be coming across many of these names when discussing with customers, so make sure you understand what they mean as that can change the scope of how, where and what type of analytics needs to be applied. {DAN} Right said Anand.. There are considerations for each one of these types and all are important towards deriving value from data. In its simplest form, you can look at : Edge Analytics  Analytics applied at specific device/sensor, i.e, at the asset and not upstream. Examples include video analytics, sensor networks, optimizing smart grids. In-Motion Analytics  Analytics applied while the data is in motion.. Between sensors.. Between the sensor and another machine or human interface. Here examples include analysis of online transactions, system logs and web clickstreams with continuous application of analytics for monitoring, identification and action in real-time. At-Rest  Event Stream Processing is not limited to real-time data stream, but rather it’s important to leverage and integrate data at-rest with real-time streams in order to make better informed decisions at all levels of your business.
  7. 1) First are ecommerce interactions for example: Clickstream analysis will help in optimizing user experience on commercial web sites, to adapt advertising or page layout to a specific user behavior, history or profile. This requires low latency decision, with immediate pattern recognition as we are dealing with live events. It will result with a better customer experience and increase in sales or customer satisfaction. Or maybe also reduce the churn. 2) In Fraud detection there are many applications of Event Stream Processing: Event Stream Processing can analyze and correlate in real time transactions and user behaviors to detect suspicious events and potential frauds. It can then halt the pending transaction and issue alerts or further investigations. This requires extremely low latency decision and complex pattern recognition based and user behavior history, their usual network behavior, well known scenarios, and in the same time a great flexibility to adapt to the ever growing “creativity” of fraud organizations. 3) Connected devices and the Internet of Things is probably an area where we will have the most uses cases in the very near future: With the advent of what we call the IoT, many equipment will generate data through sensors to measure their activity like voltage values or temperature. Event Stream Processing can then be used to monitor these streams of information and detect failure signs or specific behaviors in real time to take the appropriate decision faster. 4) Many other use cases could be found in telecommunications environments, analyzing the huge amount of communications information to improve in real time, advertising, customer interaction, IT systems, fraud detection, etc… 5) Manufacturing For example in manufacturing systems, Event processing can be used in plants to detect anomalies or determine if significant changes require re-planning of production. Plant floor systems get events from numerous sensors and push them to a centralized control system that will explore event patterns and emit aggregated, rich events to take decisions.   6) Energy and utility Energy and utility is another important domain where for example optimized grid power networks can choose the best power source based on existing conditions and projected needs. Monitored water systems can prevent infrastructure failures, alert staff about leakages and help understand the impact of water usage on the surrounding environment. Avoid downtimes caused by defect assets on oil drilling platforms.   Let’s walk through some real customer cases