Findability Day 2016 - Big data analytics and machine learning
6. © Copyright 2000-2016 TIBCO Software Inc.
Agenda
1) Machine Learning and Big Data Analytics
2) Building an Analytic Model
3) Real Time Processing
4) Real World Scenario
9. Real World Examples of Machine Learning
Spam Detection
Search Results + Product Recommendation
Picture Detection (Friends, Locations, Products)
Machine Learning is already present in daily life…
Now, every enterprise is beginning to leverage it!
The Next Disruption:
Google Beats Go Champion
10. Analytics Maturity Model
A good Big Data Analytics platform can provide value to the organization across the full spectrum of use cases.
[Figure: value to the organization (from immediate to long-term competitive advantage) plotted against analytics maturity. Stages: Measure, Diagnose, Predict, Optimize, Alert, Automate. Capabilities: Self-service Dashboards, Visual Analytics, Advanced Analytics, Event Processing.]
13. The first task in a new analytics project is to define a business case!
14. Agenda
1) Machine Learning and Big Data Analytics
2) Building an Analytic Model
3) Real Time Processing
4) Real World Scenario
20. Data Munging - Transformations

   cust_id  dept  sku    dollar  gift   date
1  104      C     12003    2.40  FALSE  2016-10-17
2  105      A     12005   62.85  FALSE  2016-10-17
3  102      C     12007   69.23  TRUE   2016-10-17
4  104      B     12004    9.33  FALSE  2016-10-18
5  105      C     12010   14.16  TRUE   2016-10-18
6  101      B     12003   90.43  FALSE  2016-10-19
7  103      C     12005   90.97  FALSE  2016-10-19
n  …        …     …       …      …      …

   cust_id  A      B      C      total   # orders  first_date  last_date
1  100      21.76  23.67   0.00   45.43  2         2016-10-19  2016-10-20
2  101       0.01  74.65   0.00   74.66  3         2016-10-19  2016-10-20
3  102       0.00  60.92  50.29  111.21  6         2016-10-17  2016-10-20
4  103       0.00   0.00  52.30   52.30  2         2016-10-19  2016-10-20
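The two tables on this slide suggest a transformation from one row per purchase to one row per customer, with spend pivoted by department. A minimal Python sketch of that idea follows. The slide does not say which tool performs the step, so the function name `summarize_by_customer` and the dictionary layout are assumptions made for illustration; the input rows are the seven visible rows of the transaction table.

```python
# Rows copied from the transaction table on the slide:
# (cust_id, dept, sku, dollar, gift, date)
TRANSACTIONS = [
    (104, "C", 12003, 2.40, False, "2016-10-17"),
    (105, "A", 12005, 62.85, False, "2016-10-17"),
    (102, "C", 12007, 69.23, True, "2016-10-17"),
    (104, "B", 12004, 9.33, False, "2016-10-18"),
    (105, "C", 12010, 14.16, True, "2016-10-18"),
    (101, "B", 12003, 90.43, False, "2016-10-19"),
    (103, "C", 12005, 90.97, False, "2016-10-19"),
]


def summarize_by_customer(rows):
    """Pivot spend by department and aggregate orders and dates per customer.

    Hypothetical helper: the name and output layout are illustrative only.
    """
    summary = {}
    for cust_id, dept, _sku, dollar, _gift, date in rows:
        entry = summary.setdefault(cust_id, {
            "A": 0.0, "B": 0.0, "C": 0.0,
            "total": 0.0, "orders": 0,
            "first_date": date, "last_date": date,
        })
        entry[dept] += dollar          # spend in this department
        entry["total"] += dollar       # spend across all departments
        entry["orders"] += 1           # one row = one order
        # ISO-formatted dates sort correctly as plain strings.
        entry["first_date"] = min(entry["first_date"], date)
        entry["last_date"] = max(entry["last_date"], date)
    return summary


summary = summarize_by_customer(TRANSACTIONS)
for cust_id in sorted(summary):
    print(cust_id, summary[cust_id])
```

Because the slide shows only the first seven of n transactions, the output of this sketch will not reproduce the figures in the aggregated table.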
22. Exploratory Data Analysis
“The greatest value of a picture is when it forces us to notice what we never expected to see”
John W. Tukey, 1977
26. Which picture represents a model?
A model is a simplification of the truth that helps you with decision making.
29. Model Building
Employees who write longer emails earn higher salaries!
33. Model Validation
How is a child's IQ related to the IQ of his or her mother?
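One common way to validate a model for a question like this is to fit it on part of the data and measure its error on a held-out part. The sketch below assumes a simple linear model and a 75/25 holdout split; the slide names neither. The data is synthetic, generated from invented parameters purely to make the example runnable, and says nothing about real IQ scores.

```python
import random


def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def rmse(xs, ys, slope, intercept):
    """Root mean squared prediction error of the fitted line."""
    squared = [(y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys)]
    return (sum(squared) / len(squared)) ** 0.5


# SYNTHETIC data, not real measurements: all parameters here are invented.
rng = random.Random(42)
mother_iq = [rng.gauss(100, 15) for _ in range(200)]
child_iq = [40 + 0.6 * m + rng.gauss(0, 10) for m in mother_iq]

# Fit on the first 150 rows, validate on the remaining 50.
split = 150
slope, intercept = fit_line(mother_iq[:split], child_iq[:split])
train_rmse = rmse(mother_iq[:split], child_iq[:split], slope, intercept)
test_rmse = rmse(mother_iq[split:], child_iq[split:], slope, intercept)

print(f"slope={slope:.2f} intercept={intercept:.1f}")
print(f"train RMSE={train_rmse:.1f} holdout RMSE={test_rmse:.1f}")
```

A holdout error that is much larger than the training error would be a sign that the model does not generalize.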
35. Smart Data Discovery (for the Business User)
“…as a next-generation data discovery capability that automatically finds and explains insights from advanced analytics to business users or citizen data scientists”
Leverage Machine Learning without the help of a Data Scientist
37. TIBCO Spotfire with R / TERR Integration
Let the business user leverage Analytic Models (created by the Data Scientist) to find insights!
Example: Customer Churn with Random Forest Algorithm
• Behind the ‘refresh model’ button lives a random forest algorithm
• It requires no a priori assumptions about the data, so it works in most cases without tuning
• The business user doesn’t need to know what random forest is to be empowered by it
[Screenshot callout: Select variables for the model]
38. Agenda
1) Machine Learning and Big Data Analytics
2) Building an Analytic Model
3) Real Time Processing
4) Real World Scenario
40. Operational Intelligence and Human Interaction
Actions by Operations: human decisions in real time, informed by up-to-date information
Machine-to-Machine Automation: automated action based on models of history combined with live context and business rules
41. Visual Coding for Streaming Analytics with TIBCO StreamBase
• Streaming Operators
• Connectivity
• Visual Development
• Testing & Simulation
• Mature Tooling / Support
• Middleware Integration
42. Live Visual Analytics UI with TIBCO Live Datamart
Dynamic aggregation
Live visualization
Ad-hoc continuous query
Alerts
Action
43. How to apply analytic models to real-time processing without redevelopment?
[Figure: TIBCO StreamBase with H2O.ai, open source R, TERR, Spark ML, MATLAB, SAS, PMML]
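The idea behind this slide is that the streaming side loads a model artifact produced elsewhere and scores events with it, rather than reimplementing the model. The Python sketch below illustrates the pattern only. It uses a small JSON document as a stand-in for a PMML, H2O, or TERR artifact, and it does not use StreamBase or any real model format; the feature names, coefficients, and threshold are all invented for illustration.

```python
import json
import math

# Hypothetical exported model, standing in for a PMML/H2O/TERR artifact.
# The feature names and coefficients are invented for this example.
MODEL_SPEC = json.dumps({
    "type": "logistic_regression",
    "intercept": -2.0,
    "coefficients": {"temperature": 0.03, "rework_count": 0.9},
})


def load_model(spec_text):
    """Turn an exported model description into a scoring function."""
    spec = json.loads(spec_text)
    if spec["type"] != "logistic_regression":
        raise ValueError(f"unsupported model type: {spec['type']}")

    def score(event):
        z = spec["intercept"] + sum(
            weight * event[name]
            for name, weight in spec["coefficients"].items()
        )
        return 1.0 / (1.0 + math.exp(-z))  # probability between 0 and 1

    return score


def process_stream(events, score, threshold=0.5):
    """Score each incoming event and attach an action to it."""
    for event in events:
        probability = score(event)
        action = "alert" if probability >= threshold else "pass"
        yield event, probability, action


score = load_model(MODEL_SPEC)
events = [
    {"temperature": 20.0, "rework_count": 0},
    {"temperature": 60.0, "rework_count": 2},
]
results = list(process_stream(events, score))
for event, probability, action in results:
    print(event, round(probability, 3), action)
```

Retraining then only replaces the exported artifact; the stream-processing code stays unchanged.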
45. Agenda
1) Machine Learning and Big Data Analytics
2) Building an Analytic Model
3) Real Time Processing
4) Real World Scenario
46. Scenario: Predictive Scrapping of Parts in an Assembly Line
Goal: Automatically scrap parts as early as possible to reduce costs in the manufacturing process.
Question: When should a part be scrapped at Station 1 instead of being reworked or sent on to Station 2?
[Figure: assembly line with a "Scrap?" decision at Station 1 and at Station 2. Cost before: 9€; Station 1: 7€; Station 2: 13€; total cost: 29€ (or more).]
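The scrap question can be framed as a small expected-value calculation. The sketch below is one possible framing, not the method used in the talk. It reads the slide's figures as 9€ spent before Station 1, 7€ at Station 1, and 13€ at Station 2, which is inferred from the slide layout. The value of a finished good part (`FINISHED_PART_VALUE`) and the predicted scrap probability are hypothetical inputs, and rework is left out.

```python
# Costs in euros, as read from the slide (the assignment to stations is inferred).
COST_BEFORE = 9.0
COST_STATION_1 = 7.0
COST_STATION_2 = 13.0


def expected_gain_from_continuing(p_scrap, finished_part_value):
    """Expected payoff of sending the part on to Station 2.

    The 16 euros already spent are sunk either way, so only the Station 2
    cost and the chance of ending up with a sellable part matter here.
    """
    return (1.0 - p_scrap) * finished_part_value - COST_STATION_2


def should_scrap_at_station_1(p_scrap, finished_part_value):
    """Scrap now if continuing is expected to lose money."""
    return expected_gain_from_continuing(p_scrap, finished_part_value) < 0.0


FINISHED_PART_VALUE = 40.0  # hypothetical value, not from the slide

for p_scrap in (0.1, 0.5, 0.9):
    decision = should_scrap_at_station_1(p_scrap, FINISHED_PART_VALUE)
    print(f"p_scrap={p_scrap}: scrap at Station 1? {decision}")
```

In this framing the predictive model supplies `p_scrap` for each part, and the business rule turns that probability into an action.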
47. TIBCO Spotfire with H2O Integration
Data Discovery / Data Mining (“Are parts that repeat a station more likely scrap parts?”)
50. TIBCO Accelerator for Apache Spark
The TIBCO Accelerator for Spark is a TIBCO-engineered, lightweight, open-source fast start for systems that stream data into Spark, discover patterns in Spark with Spotfire, and operationalize the insights on Big Data.
FUNCTIONAL COMPONENTS
1. Fast Data Preparation for IoT
Dozens of enterprise and IoT data preparation adapters: MQTT, databases; inbound creation of HDFS, Parquet, HBase, Avro…
2. Spotfire Model Discovery Template
Use Spotfire to explore the Spark data lake, create a predictive model, train it in H2O, and deploy it to Streaming Analytics.
3. Operationalize Predictive Models
ZooKeeper deployment to StreamBase nodes living in the Spark cluster via H2O, PMML, TERR models
4. Streaming Analytics for Automation
Automate action based on predictive models: make offers to customers, stop fraudulent transactions, alert.
5. Monitor & Retrain Model
Monitor behavior of the model, retrain when necessary.
6. Drag & Drop for Business Solution Developers
Code-free development environment for working with H2O, HDFS, Avro, TERR
51. Key Take-Aways
• Insights are hidden in Historical Data on Big Data Platforms
• Machine Learning and Big Data Analytics find these Insights by building Analytic Models
• Event Processing uses these Models (without Redevelopment) to take Action in Real Time
52. Questions? Please contact me!
Kai Wähner
Technology Evangelist
kontakt@kai-waehner.de
@KaiWaehner
www.kai-waehner.de
LinkedIn