How to Apply Big Data Analytics and Machine Learning to Real Time Processing - Kai Waehner - Codemotion Milan 2016
4. © Copyright 2000-2016 TIBCO Software Inc.
Agenda
1) Machine Learning and Big Data Analytics
2) Building an Analytic Model
3) Real Time Processing
4) Live Demo
7. Real World Examples of Machine Learning
• Spam Detection
• Search Results + Product Recommendation
• Picture Detection (Friends, Locations, Products)
Machine Learning is already present in daily life…
Now, every enterprise is beginning to leverage it!
The Next Disruption: Google Beats Go Champion
8. Analytics Maturity Model
A good Big Data Analytics platform can provide value to the organization across the full spectrum of use cases.
Axes: Analytics Maturity; Value to the Organization (Immediate, Long-Term, Competitive Advantage)
Stages: Measure, Diagnose, Predict, Optimize, Alert, Automate
Capabilities: Self-service Dashboards, Visual Analytics, Advanced Analytics, Event Processing
15. Variety of Data in Enterprises
• Local data sources (drag-and-drop): Access, Excel, STDF
• Information Services (join, transform, reusable, parameterized, dynamic query for in-memory use): databases via JDBC/ODBC, including MySQL, SQL Server, Oracle, PostgreSQL, Teradata, Netezza, Hadoop, SFDC, etc.
• Direct Query (dynamically query and retrieve data for visualization and analysis): direct connection to databases, including Oracle, Teradata, Teradata Aster, MS SSAS, MySQL, OBIEE, Netezza, Hadoop, etc.
• Custom GUI-driven data access via SDK: Siebel eBusiness, Oracle E-Business, SAP BW, SAP R/3, Salesforce
• Data Fabric: XML, RDBMS, flat files, spreadsheets, web services
• Connectors: ODBC, OLE DB, SqlClient
18. Data Munging - Transformations
Raw order lines:
     cust_id  dept  sku    dollar  gift   date
1    104      C     12003   2.40   FALSE  2016-10-17
2    105      A     12005  62.85   FALSE  2016-10-17
3    102      C     12007  69.23   TRUE   2016-10-17
4    104      B     12004   9.33   FALSE  2016-10-18
5    105      C     12010  14.16   TRUE   2016-10-18
6    101      B     12003  90.43   FALSE  2016-10-19
7    103      C     12005  90.97   FALSE  2016-10-19
n    …        …     …      …       …      …
Summary per customer:
     cust_id  A      B      C      total   # orders  first_date  last_date
1    100      21.76  23.67   0.00   45.43  2         2016-10-19  2016-10-20
2    101       0.01  74.65   0.00   74.66  3         2016-10-19  2016-10-20
3    102       0.00  60.92  50.29  111.21  6         2016-10-17  2016-10-20
4    103       0.00   0.00  52.30   52.30  2         2016-10-19  2016-10-20
5    104      31.34   9.33   2.40   43.06  4         2016-10-17  2016-10-20
6    105      62.85   0.00  56.00  118.85  3         2016-10-17  2016-10-20
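The transformation on this slide turns order lines into one summary row per customer. It can be sketched in plain Python as follows. This is a minimal illustration, not the tooling used in the talk: the helper name `summarize` and the output field names are assumptions. The input is only the seven rows visible on the slide, so the result will not match the slide's summary table, which was computed from rows elided as "n …".

```python
# Order lines copied from the seven visible rows of the raw table:
# (cust_id, dept, sku, dollar, gift, date)
ORDERS = [
    (104, "C", 12003, 2.40, False, "2016-10-17"),
    (105, "A", 12005, 62.85, False, "2016-10-17"),
    (102, "C", 12007, 69.23, True, "2016-10-17"),
    (104, "B", 12004, 9.33, False, "2016-10-18"),
    (105, "C", 12010, 14.16, True, "2016-10-18"),
    (101, "B", 12003, 90.43, False, "2016-10-19"),
    (103, "C", 12005, 90.97, False, "2016-10-19"),
]

DEPTS = ("A", "B", "C")


def summarize(orders):
    """Pivot order lines into one summary row per customer (hypothetical helper)."""
    by_customer = {}
    for cust_id, dept, _sku, dollar, _gift, date in orders:
        row = by_customer.setdefault(
            cust_id,
            {"A": 0.0, "B": 0.0, "C": 0.0, "orders": 0,
             "first_date": date, "last_date": date},
        )
        row[dept] += dollar          # spend per department
        row["orders"] += 1           # number of order lines
        # ISO dates (YYYY-MM-DD) compare correctly as plain strings.
        row["first_date"] = min(row["first_date"], date)
        row["last_date"] = max(row["last_date"], date)

    summary = []
    for cust_id in sorted(by_customer):
        row = by_customer[cust_id]
        spend = {dept: round(row[dept], 2) for dept in DEPTS}
        summary.append({
            "cust_id": cust_id,
            **spend,
            "total": round(sum(row[dept] for dept in DEPTS), 2),
            "orders": row["orders"],
            "first_date": row["first_date"],
            "last_date": row["last_date"],
        })
    return summary
```

Calling `summarize(ORDERS)` returns one dictionary per customer, sorted by `cust_id`; a per-customer table like this is the usual input shape for model building.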
20. Exploratory Data Analysis
“The greatest value of a picture is when it forces us to notice what we never expected to see”
John W. Tukey, 1977
21. Visual Analytics - Interactive Brush-Linked
… and “Inline Data Wrangling” → Ad-hoc data preparation instead of just ETL
24. Which picture represents a model?
A model is a simplification of the truth that helps you with decision making.
27. Model Building
Employees who write longer emails earn higher salaries!
31. Model Validation
How is the IQ of a kid related to the IQ of his / her mum?
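One common way to validate a model like this is to fit it on part of the data and check how well it explains data it has not seen. The sketch below fits a simple least-squares line and computes R² on a hold-out split. It is an illustration only: the data is synthetic, the relationship between the two variables is invented for the demo, and the helper names are assumptions rather than anything shown in the talk.

```python
import random


def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def r_squared(xs, ys, intercept, slope):
    """Share of the variance in ys explained by the fitted line."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot


def make_synthetic_pairs(n, seed=42):
    """Synthetic (mum_iq, kid_iq) pairs; NOT real data, coefficients are invented."""
    rng = random.Random(seed)
    pairs = []
    for _ in range(n):
        mum = rng.gauss(100, 15)
        kid = 40 + 0.6 * mum + rng.gauss(0, 10)
        pairs.append((mum, kid))
    return pairs


pairs = make_synthetic_pairs(400)
train, test = pairs[:300], pairs[300:]          # hold out 25% for validation
intercept, slope = fit_line(*zip(*train))       # build the model on training data only
train_r2 = r_squared(*zip(*train), intercept, slope)
test_r2 = r_squared(*zip(*test), intercept, slope)   # validate on unseen data
```

If `test_r2` is much lower than `train_r2`, the model has likely memorized the training data rather than learned a relationship that generalizes.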
34. Smart Data Discovery (for the Business User)
“…as a next-generation data discovery capability that automatically finds and explains insights from advanced analytics to business users or citizen data scientists”
Leverage Machine Learning without the help of a Data Scientist
38. Traditional Data Processing: “Request – Response”
Store
Analyze
Act
39. The New Era: Streaming Analytics
Act & Monitor
Analyze
Store
40. Streaming Analytics - Processing Pipeline
Stream Ingest: APIs, Adapters / Channels, Integration, Messaging
Stream Preprocessing: Normalization, Transformation, Aggregation, Enrichment, Filtering
Stream Analytics & Processing:
• Contextual Rules
• Windowing
• Patterns
• Analytics
• Deep ML
• …
Stream Outcomes: Process Management, Analytics (Real Time), Applications & APIs, Analytics / DW Reporting, Index / Search
Applying an Analytic Model is just a piece of the puzzle!
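The stages of this pipeline (ingest, preprocessing, analytics, outcomes) can be sketched as chained Python generators, where each stage consumes events one at a time from the previous one. This is a toy illustration under stated assumptions, not how any product in the talk is implemented: the event fields (`sensor`, `temp_f`), the site lookup table, and all function names are hypothetical, and the sample events are made up.

```python
from collections import defaultdict

# Made-up sample events; a real deployment would read from an adapter or message bus.
RAW_EVENTS = [
    {"sensor": "s1", "temp_f": 212.0},
    {"sensor": "s2", "temp_f": None},     # incomplete reading, filtered out below
    {"sensor": "s1", "temp_f": 194.0},
    {"sensor": "s2", "temp_f": 68.0},
    {"sensor": "s2", "temp_f": 50.0},
]
SITE_BY_SENSOR = {"s1": "plant-a", "s2": "plant-b"}


def preprocess(events, site_by_sensor):
    """Stream preprocessing: filtering, transformation and enrichment."""
    for event in events:
        if event.get("temp_f") is None:                       # filtering
            continue
        yield {
            "sensor": event["sensor"],
            "temp_c": (event["temp_f"] - 32) * 5 / 9,           # transformation
            "site": site_by_sensor.get(event["sensor"], "unknown"),  # enrichment
        }


def tumbling_average(events, size):
    """Windowing and aggregation: average temperature per site for every
    `size` consecutive events. A trailing partial window is dropped."""
    window = []
    for event in events:
        window.append(event)
        if len(window) == size:
            temps_by_site = defaultdict(list)
            for item in window:
                temps_by_site[item["site"]].append(item["temp_c"])
            yield {site: sum(t) / len(t) for site, t in temps_by_site.items()}
            window = []


def alerts(averages_stream, limit_c):
    """Stream outcome: a simple rule that turns aggregates into alert messages."""
    for averages in averages_stream:
        for site, avg in averages.items():
            if avg > limit_c:
                yield f"ALERT {site}: average {avg:.1f} C"


def run_pipeline(raw_events, site_by_sensor, window_size, limit_c):
    """Wire the stages together; nothing runs until the final stage is consumed."""
    cleaned = preprocess(iter(raw_events), site_by_sensor)
    return list(alerts(tumbling_average(cleaned, window_size), limit_c))
```

With the sample data, `run_pipeline(RAW_EVENTS, SITE_BY_SENSOR, 2, 80.0)` yields a single alert for `plant-a`. The rule in `alerts` is where a trained analytic model would plug in, which is why applying the model is only one piece of the pipeline.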
41. Frameworks and Products (not a complete list!)
Matrix: Open Source vs. Closed Source, Product vs. Framework
Example: Microsoft Azure Stream Analytics
42. Comparison of Stream Processing Frameworks and Products
Slide Deck and Video Recording:
http://www.kai-waehner.de/blog/2016/11/15/streaming-analytics-comparison-open-source-frameworks-products-cloud-services/
43. Apache Storm – Hello World
http://wpcertification.blogspot.ch/2014/02/helloworld-apache-storm-word-counter.html
44. Visual Coding for Streaming Analytics
• Streaming Operators
• Connectivity
• Visual Development
• Testing & Simulation
• Mature Tooling / Support
• Middleware Integration
45. Live Visual Analytics UI
• Dynamic aggregation
• Live visualization
• Ad-hoc continuous query
• Alerts
• Action
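A continuous query differs from a one-off query in that its result is updated with every incoming event. The sketch below keeps a per-key count over a sliding time window and raises an alert when a threshold is reached. It is a minimal illustration, not the API of any product named in the talk: the class name, its parameters, and the alert format are assumptions, and timestamps are assumed to arrive in non-decreasing order.

```python
from collections import deque


class ContinuousQuery:
    """Per-key event count over the last `window_s` seconds, updated on every event."""

    def __init__(self, window_s, alert_at):
        self.window_s = window_s
        self.alert_at = alert_at
        self.events = deque()     # (timestamp, key) pairs in arrival order
        self.counts = {}          # current query result: key -> count in window

    def push(self, timestamp, key):
        """Add one event, expire old ones, and return an alert string or None."""
        self.events.append((timestamp, key))
        self.counts[key] = self.counts.get(key, 0) + 1

        # Dynamic aggregation: drop events that have slid out of the window.
        while self.events and self.events[0][0] <= timestamp - self.window_s:
            _, old_key = self.events.popleft()
            self.counts[old_key] -= 1
            if self.counts[old_key] == 0:
                del self.counts[old_key]

        # Alert as soon as the live result crosses the threshold.
        if self.counts.get(key, 0) >= self.alert_at:
            return f"ALERT: {key} seen {self.counts[key]} times in {self.window_s}s"
        return None
```

A live dashboard would render `counts` after each `push`, and the returned alert would trigger the manual or automated action.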
46. How to apply analytic models to real time processing without redevelopment?
Stream Processing
H2O.ai, Open Source R, TERR, Spark ML, MATLAB, SAS, PMML
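One way to avoid redevelopment is to exchange the trained model as a document that the streaming side only loads and evaluates, which is the idea behind PMML. The sketch below scores events against a simplified PMML-style regression fragment. It is an illustration under stated assumptions, not a conformant PMML evaluator: real exports also carry an XML namespace, a DataDictionary, and a MiningSchema, which are omitted here, and the feature names and coefficients are hypothetical.

```python
import xml.etree.ElementTree as ET

# Simplified PMML-style fragment (assumption: namespace, DataDictionary and
# MiningSchema are left out; feature names and numbers are made up).
MODEL_XML = """
<PMML version="4.2">
  <RegressionModel functionName="regression">
    <RegressionTable intercept="0.5">
      <NumericPredictor name="temperature" coefficient="0.25"/>
      <NumericPredictor name="vibration" coefficient="1.5"/>
    </RegressionTable>
  </RegressionModel>
</PMML>
"""


def load_model(pmml_text):
    """Read intercept and coefficients from the regression table."""
    root = ET.fromstring(pmml_text.strip())
    table = root.find("./RegressionModel/RegressionTable")
    intercept = float(table.get("intercept"))
    coefficients = {
        predictor.get("name"): float(predictor.get("coefficient"))
        for predictor in table.findall("NumericPredictor")
    }
    return intercept, coefficients


def score(model, event):
    """Linear score: intercept + sum(coefficient * feature value)."""
    intercept, coefficients = model
    return intercept + sum(c * event[name] for name, c in coefficients.items())


def score_stream(model, events):
    """Apply the same loaded model to every event as it arrives."""
    for event in events:
        yield event, score(model, event)
```

Replacing the model then means shipping a new document, not rewriting the streaming application; for example, `score(load_model(MODEL_XML), {"temperature": 80.0, "vibration": 2.0})` evaluates the fragment above for one event.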
49. Scenario: Predictive Scrapping of Parts in an Assembly Line
Goal: Automatically scrap parts as early as possible to reduce costs in a manufacturing process.
Question: When should a part be scrapped at Station 1 instead of being reworked or sent on to Station 2?
Cost Before: 9€
Station 1: 7€ (Scrap?)
Station 2: 13€ (Scrap?)
Total Cost: 29€ (or more)
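The cost figures on this slide (9€ before Station 1, 7€ and 13€ read here as the costs of Station 1 and Station 2, 29€ in total) allow a rough worked example of the scrap decision. The loss model below is an assumption made for illustration and does not come from the talk: continuing wastes the Station 2 cost if the part is scrapped anyway, while scrapping a part that would have been good costs what was already spent on it. Rework is not modeled.

```python
COST_BEFORE = 9.0       # EUR spent before Station 1 (from the slide)
COST_STATION_1 = 7.0    # EUR, assumed to be the Station 1 cost
COST_STATION_2 = 13.0   # EUR, assumed to be the Station 2 cost
TOTAL_COST = COST_BEFORE + COST_STATION_1 + COST_STATION_2   # 29 EUR, as on the slide


def expected_loss_if_continue(p_scrap):
    """Station 2 cost is wasted whenever the part ends up scrapped anyway."""
    return p_scrap * COST_STATION_2


def expected_loss_if_scrap_now(p_scrap):
    """Assumption: scrapping a good part loses everything spent on it so far."""
    return (1.0 - p_scrap) * (COST_BEFORE + COST_STATION_1)


def should_scrap_after_station_1(p_scrap):
    """Scrap when continuing is expected to lose more than scrapping now."""
    return expected_loss_if_continue(p_scrap) > expected_loss_if_scrap_now(p_scrap)


# Break-even probability: p * 13 = (1 - p) * 16, so p = 16 / 29.
BREAK_EVEN = (COST_BEFORE + COST_STATION_1) / TOTAL_COST
```

Under these assumptions a part should be scrapped after Station 1 once the model's predicted scrap probability exceeds 16/29, roughly 55%. A different loss model would move that threshold, which is why the predicted probability, not a fixed rule, drives the decision.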
50. Fast Data Architecture for Predictive Maintenance
TIBCO Fast Data Platform
• Inputs: CSV (Batch), JSON (Real Time), XML (Real Time), Internal Data
• Streaming Analytics (StreamBase): Aggregate, Rules, Analytics, Correlate, Action
• Operational Analytics (Operations): Live UI, Live Datamart (continuous query processing, alerts, manual action, escalation)
• Historical Analysis (Data Scientists): Flume, HDFS, Hadoop (Cloudera), Oracle RDBMS, Spotfire, R / TERR, H2O
• Formats: Avro, Parquet, …, PMML
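The architecture takes input as CSV batches and as JSON and XML in real time, so an early step is to normalize all three into one event shape before rules or models are applied. The sketch below shows that step with the Python standard library. The field names (`part_id`, `station`, `temp`) and the XML layout are hypothetical; the slide does not show the actual message formats.

```python
import csv
import io
import json
import xml.etree.ElementTree as ET


def from_csv(text):
    """Batch input: one normalized event per CSV row (header row required)."""
    for row in csv.DictReader(io.StringIO(text)):
        yield {
            "part_id": row["part_id"],
            "station": int(row["station"]),
            "temp": float(row["temp"]),
        }


def from_json(line):
    """Real-time input: one JSON object per message."""
    record = json.loads(line)
    return {
        "part_id": str(record["part_id"]),
        "station": int(record["station"]),
        "temp": float(record["temp"]),
    }


def from_xml(text):
    """Real-time input: one XML element per message (hypothetical layout)."""
    node = ET.fromstring(text)
    return {
        "part_id": node.get("part_id"),
        "station": int(node.get("station")),
        "temp": float(node.findtext("temp")),
    }
```

After this step the downstream stages (aggregate, rules, analytics, correlate) only need to understand one dictionary shape, regardless of how the reading arrived.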
51. TIBCO Spotfire with H2O Integration
Data Discovery / Data Mining (“Are parts that repeat a station more likely to be scrap parts?”)
55. TIBCO Accelerator for Apache Spark
The TIBCO Accelerator for Spark is a TIBCO-engineered, light-weight open-source fast-start for systems to stream data into Spark, discover patterns in Spark with Spotfire, and operationalize the insights on Big Data.
FUNCTIONAL COMPONENTS
1. Fast Data Preparation for IoT
Dozens of enterprise and IoT data preparation adapters: MQTT, Databases; inbound creation of HDFS, Parquet, HBase, Avro…
2. Spotfire Model Discovery Template
Use Spotfire to explore Spark data lake, create predictive model, train in H2O, and deploy to Streaming Analytics.
3. Operationalize Predictive Models
Zookeeper deployment to StreamBase nodes living in Spark cluster via H2O, PMML, TERR models
4. Streaming Analytics for Automation
Automate action based on predictive models: make offers to customers, stop fraudulent transactions, alert.
5. Monitor & Retrain Model
Monitor behavior of model, retrain when necessary.
6. Drag & Drop for Business Solution Developers
Code-free development environment for work with H2O, HDFS, Avro, TERR
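The "Monitor & Retrain Model" component can be sketched as a rolling accuracy check: compare each prediction with the outcome once it is known, and flag the model for retraining when recent accuracy falls below a threshold. This is a minimal illustration and not the accelerator's implementation; the class name, window size, and threshold are assumptions.

```python
from collections import deque


class ModelMonitor:
    """Tracks accuracy over the most recent predictions and flags when to retrain."""

    def __init__(self, window, min_accuracy):
        # deque(maxlen=...) silently drops the oldest entry once the window is full.
        self.outcomes = deque(maxlen=window)   # True where prediction matched reality
        self.min_accuracy = min_accuracy

    def record(self, predicted, actual):
        """Store whether one prediction turned out to be correct."""
        self.outcomes.append(predicted == actual)

    def accuracy(self):
        """Accuracy over the current window, or None before any outcome is recorded."""
        if not self.outcomes:
            return None
        return sum(self.outcomes) / len(self.outcomes)

    def needs_retraining(self):
        """True once a full window of outcomes falls below the accuracy threshold."""
        # Wait for a full window so a few early misses do not trigger retraining.
        if len(self.outcomes) < self.outcomes.maxlen:
            return False
        return self.accuracy() < self.min_accuracy
```

In the scrap scenario, `record` would be called once a part's final outcome (scrapped or not) is known, and a `True` from `needs_retraining` would send the model back to the historical analysis side for a refit.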
56. Key Take-Aways
• Insights are hidden in Historical Data on Big Data Platforms
• Machine Learning and Big Data Analytics find these Insights by building Analytics Models
• Event Processing uses these Models (without Redevelopment) to take Action in Real Time
57. Questions? Please contact me!
Kai Wähner
Technology Evangelist at TIBCO
kontakt@kai-waehner.de
@KaiWaehner
www.kai-waehner.de
LinkedIn