As the scope of big data rapidly expands, so does the scope of the analytics needed to extract insight from that data. It is simply impossible for humans, or indeed rules-based engines, to turn all of that information into action. More and more, clients need analytics to make the best decisions possible, or better yet, to embed those analytics into processes so that decision-making is automated and the answers are delivered at the point of impact, based on the questions being asked there. To address these rapidly evolving needs, we need to ensure the right analytics capabilities are deployed to suit each situation, each point of interaction, and each decision point within a process. Join this session and learn how IBM can provide a solution for the varying types of analytics: from descriptive to predictive to prescriptive to cognitive.
To put cognitive systems into the proper context, let's look at some of the ways they differ from a more traditional programmatic approach to problem solving. What Google is to search, Watson is to discovery. We have all entered keywords into a search bar only to have millions of entries returned for our review. Unfortunately, the majority of the information retrieved is not what we were looking for, so we start over. Watson aims to bring back relevant results, with confidence, putting content into context. Unlike systems that produce deterministic outcomes, Watson is probabilistic in nature.
Take a simple question like 2+2. A precise answer is 4, and that is exactly how a deterministic system would respond. Watson, however, is not so sure. It may have high confidence that 2+2=4 is the right answer, but if the context of the question were automotive, 2+2 could refer to a car configuration: two front seats, two back seats. If we had been talking to a family psychologist, 2+2 could refer to a family unit with two parents and two children. You can quickly see how things have varying meanings that need to be analyzed and properly considered in the context of the broader question being asked.
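To make the contrast concrete, here is a minimal Python sketch, not Watson's actual implementation: the contexts, hypotheses, and confidence values are invented purely for illustration. It shows the difference between committing to a single deterministic answer and keeping several ranked, context-dependent hypotheses.

def deterministic_answer(question):
    # A rules-based system commits to a single answer.
    return "4" if question == "2+2" else "unknown"

def probabilistic_answer(question, context):
    # A cognitive-style system keeps several hypotheses, each with a
    # confidence that shifts as the context changes (values illustrative only).
    hypotheses = {
        "arithmetic": [("4", 0.97), ("a 2+2 seating configuration", 0.02)],
        "automotive": [("a 2+2 seating configuration", 0.85), ("4", 0.10)],
        "family":     [("two parents and two children", 0.80), ("4", 0.15)],
    }
    return sorted(hypotheses.get(context, [("4", 0.5)]), key=lambda h: -h[1])

print(deterministic_answer("2+2"))                # 4
print(probabilistic_answer("2+2", "automotive"))  # seating configuration ranked first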
Unlike traditional systems, which thrive on structure, where information is stored in a binary fashion all neatly organized into rows and columns, Watson can tackle unstructured data spread across disparate sources to unlock patterns and possibilities. And we have already touched on the importance of working in natural language.
This chart illustrates the evolution from descriptive to predictive, prescriptive, and cognitive analytics, and lists the key characteristics of each phase or analytics domain.
It is important to highlight the different possible analytics journeys and entry points. Clients will not always need to have a mature prescriptive analytics platform in place to launch a cognitive analytics initiative.
We define Cognitive Systems as systems that can navigate the complexities of human language and understanding, ingest and process vast amounts of structured and unstructured data, generate and evaluate countless possibilities, and scale in proportion to the task. These systems apply human-like characteristics to conveying and manipulating ideas which, when combined with the inherent strengths of digital computing, can solve problems with higher accuracy, more resilience, and on a massive scale.
Watson is an example of a Cognitive System. It is able to tease apart human language to identify inferences between text passages with human-like accuracy, at speeds and on a scale far beyond what any person could achieve alone. Watson doesn't really understand the individual words in the language, but it does understand the features of language as used by people, and from that it is able to determine whether one text passage (call it the 'question') infers another text passage (call it the 'answer') with remarkable accuracy under changing circumstances. In Jeopardy! we had to determine whether the question, "Jodie Foster took this home for her role in 'The Silence of the Lambs'", inferred the answer "Jodie Foster won an Oscar for her role in 'The Silence of the Lambs'". In this case, taking something home inferred winning an Oscar. But it doesn't always: sometimes 'taking something home' infers a cold, or groceries, or any number of things. Context matters. Temporal and spatial constraints matter. All of that contributes to enabling a Cognitive System to behave with human-like characteristics. A rules-based approach would require a near-infinite number of rules to capture every case we might encounter in language.
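As a toy stand-in for what passage inference means, the Python sketch below scores whether a "question" passage is plausibly supported by an "answer" passage using simple lexical overlap. This is not one of Watson's algorithms; Watson combines hundreds of far richer linguistic features, and the example only shows why overlap alone is not enough without context.

import re

def overlap_score(question, answer):
    # Crude lexical-overlap score between two passages (toy illustration only).
    tokenize = lambda s: set(re.findall(r"[a-z']+", s.lower()))
    q, a = tokenize(question), tokenize(answer)
    return len(q & a) / len(q) if q else 0.0

q = "Jodie Foster took this home for her role in The Silence of the Lambs"
a = "Jodie Foster won an Oscar for her role in The Silence of the Lambs"
print(round(overlap_score(q, a), 2))  # high lexical overlap, yet context still decides the inference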
This chart illustrates the technical capabilities that characterize the various analytics areas. The reality of analytics-related use case scenarios is that clients may have requirements across the entire analytics continuum; i.e., the focus may be on descriptive analytics, with the need to implement all corresponding technical capabilities, but there may also be analytical requirements from the remaining three analytics domains, including cognitive analytics. For instance, some clients may have a rather mature descriptive analytics platform and require natural language processing capabilities for a sentiment analytics project to derive brand sentiment and affinity insight, without necessarily implementing a sophisticated predictive analytics platform.
When walking through this chart, explain the individual technical capabilities for each analytics capability.
This chart lists some of the clients that IBM has worked with. In order to become familiar with the details of these projects, please visit the “IBM client reference Database”.
Here is the link: http://w3-01.ibm.com/sales/references/crdb/ibmref.nsf/winsubmit?openform
The key message of this chart is the role of the blue areas in enabling the various analytics domains. These blue areas are the BI Data Infrastructure and Big Data & Analytics capabilities, which embrace mobile needs, social media analytics, and cloud deployment models. Together, these blue domains enable predictive, prescriptive, and cognitive analytics (and descriptive analytics as well, although that is not explicitly mentioned on the chart). The capabilities translate into key initiatives such as Smarter Commerce, Smarter Workforce, Smarter Analytics, and Smarter Cities, and provide key business value to the C-level stakeholders listed at the top of the chart. The business value is delivered for all industries, which is illustrated by the 12 small symbols at the top of the figure. So the key message of the chart is the illustration, or rather the transformation, of key technical domains such as BI Data Infrastructure and Big Data & Analytics capabilities into a broad, industry-relevant set of business values.
Following are the 3 key messages of the chart:
Clients are leveraging various types of analytics to solve real business challenges. They soon realize there is no single solution to address all of their analytics requirements. Businesses in different industries will have specific needs that applying analytics can address, but there is no one size fits all.
This is why IBM offers many different analytics offerings, including industry-specific solutions that address unique needs in major industries as well as optimized business and predictive analytics solutions. Cognitive computing, like IBM Watson, is another example.
IBM is delivering these solutions on Power Systems because of the platform’s design points and capabilities. Power was built from the ground up to handle data-related applications and analytics workloads.
The key point of the chart is to highlight the 4 Vs, which represent an essential way to characterize Big Data: veracity, variety, velocity, and volume.
Volume is about rising volumes of data in all of your systems, which presents a challenge both for scaling those systems and for the integration points among them.
Variety is about managing many types of data, and understanding and analyzing them in their native form.
Velocity is about ingesting data in real time and in motion.
Veracity deals with the certainty, or truthfulness of big data. Veracity is a big issue – and one that directly relates to confidence. In fact, as the complexity of big data rises (the first 3 Vs grow), it actually becomes harder to establish veracity.
The left part of the chart illustrates just 1 dimension (volume) in the context of increasing analytical complexity.
This chart puts into perspective three key areas that influence and drive the need of cognitive systems and analytics:
Big Data: highlight the 4 Vs again as a key driver, especially the need to analyze text, speech, video content, and other non-structured data, such as logs, call center transcripts, etc. Also highlight veracity (meaning trustworthiness) of the data, which requires reasoning and other cognitive analytical capabilities to put insight into context and provide contextual meaning.
Cloud: as a key deployment model, cloud is a driving force to also take cognitive analytics in the cloud into consideration. Highlight the need to provide analytics capabilities that can be deployed and leveraged in the cloud. As an example, point out IBM's Social Media Analytics (SMA) v1.2 (the former Cognos Consumer Insight), which is not only cloud-enabled but is offered as a cloud service by IBM.
SoLoMo: still a rather new term, SoLoMo (Social, Local, Mobile) is increasingly used to describe these three aspects as a combined area that characterizes today's consumer lifestyles. As such, all three aspects drive specific requirements and influence the technical and business capabilities that cognitive analytics needs to deliver. Social means, for instance, understanding contributions to social media networks and their meaning in context, and being able to analyze natural language and text in all languages and dialects, including the sometimes unique style of communication that takes place in social media networks. Local requires locality awareness, for instance to deliver location-based services, preferences, and culture awareness when running cognitive analytics. Mobile, in regard to cognitive analytics, requires the inclusion and understanding of the mobile lifestyle, mobility patterns, and preferences.
This chart focuses on the veracity and trustworthiness of big data, and introduces some dimensions of trustworthiness and veracity (right side of the chart). One of the key aspects of big data is the analysis of social media networks. Contributions via social media networks, however, need to be analyzed by taking the listed dimensions into consideration: for instance, what was the usage intention of a social media contribution, and what is its relevance for the analytics in scope or the use case scenario. The left side of the chart lists some of the challenges that require sophisticated, state-of-the-art cognitive analytics capabilities, for instance to understand whether a statement or contribution was made in a certain mood or emotional state, whether it was meant as a joke, whether it represents sarcasm, and so forth.
Watson, the computer system we developed to play Jeopardy!, is based on the DeepQA software architecture. Here is a look at the DeepQA architecture. This is like looking inside the brain of the Watson system from about 30,000 feet.
Remember, the intended meaning of natural language is ambiguous, tacit and highly contextual. The computer needs to consider many possible meanings, attempting to find the evidence and inference paths that are most confidently supported by the data.
So, the primary computational principle supported by the DeepQA architecture is to assume and pursue multiple interpretations of the question, to generate many plausible answers or hypotheses and to collect and evaluate many different competing evidence paths that might support or refute those hypotheses.
Each component in the system adds assumptions about what the question might mean, what the content means, what the answer might be, or why it might be correct.
DeepQA is implemented as an extensible architecture and was designed at the outset to support interoperability.
For this reason it was implemented using UIMA, a framework and OASIS standard for interoperable text and multi-modal analysis contributed by IBM to the open-source community.
Over 100 different algorithms, implemented as UIMA components, were integrated into this architecture to build Watson.
In the first step, Question and Category Analysis, parsing algorithms decompose the question into its grammatical components. Other algorithms identify and tag specific semantic entities like names, places, or dates. In particular, the type of thing being asked for, if it is indicated at all, will be identified. We call this the LAT, or Lexical Answer Type, such as this "FISH", this "CHARACTER", or this "COUNTRY".
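As a feel for LAT detection, here is a deliberately simple Python heuristic, not Watson's parsing algorithms: it just grabs the noun that follows a demonstrative like "this" or "these". Real clues often have multi-word or implicit LATs, which is exactly why Watson uses full grammatical parsing rather than a pattern like this.

import re

def guess_lat(clue):
    # Naive heuristic: the word after "this"/"these" is often the answer type.
    match = re.search(r"\b(?:this|these)\s+([a-z]+)", clue.lower())
    return match.group(1) if match else None

print(guess_lat("This country is home to the Great Barrier Reef"))       # country
print(guess_lat("These fish are known for swimming upstream to spawn"))  # fish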
In Query Decomposition, different assumptions are made about whether and how the question might be decomposed into sub-questions. The original question and each identified sub-part follow parallel paths through the system.
In Hypothesis Generation, DeepQA performs a variety of very broad searches for each of several interpretations of the question. Note that Watson, in order to compete on Jeopardy!, was not connected to the internet.
These searches are performed over a combination of unstructured data (natural language documents) and structured data (available databases and knowledge bases) fed to Watson during training.
The goal of this step is to generate possible answers to the question and/or its sub-parts. At this point there is very little confidence in these possible answers, since little intelligence has been applied to understanding the content that might relate to the question. The focus at this point is on generating a broad set of hypotheses, or, for this application, what we call "candidate answers".
To implement this step for Watson, we integrated and advanced multiple open-source text and knowledge-base search components.
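The following Python sketch illustrates the idea of broad primary search for candidate generation using TF-IDF retrieval over a toy, made-up corpus. It is only illustrative; Watson's actual candidate generation combines multiple dedicated open-source text and knowledge-base search engines, as noted above.

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Tiny illustrative "corpus": titles paired with short passages.
corpus = {
    "Oscar":  "The Academy Award, or Oscar, is given for achievement in film.",
    "Emmy":   "The Emmy Award recognizes excellence in television.",
    "Grammy": "The Grammy Award is presented for achievement in music.",
}

def generate_candidates(question, k=2):
    # Rank corpus entries by similarity to the question; titles become
    # candidate answers with a crude initial score.
    titles, texts = list(corpus), list(corpus.values())
    vec = TfidfVectorizer().fit(texts + [question])
    sims = cosine_similarity(vec.transform([question]), vec.transform(texts))[0]
    return sorted(zip(titles, sims), key=lambda x: -x[1])[:k]

print(generate_candidates("Jodie Foster took this home for her role in a 1991 film"))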
After candidate generation, DeepQA also performs Soft Filtering, where it makes parameterized judgments about which and how many candidate answers are most likely worth further computation, given specific constraints on time and available hardware. Based on a trained threshold for optimizing the tradeoff between accuracy and speed, Soft Filtering uses different lightweight algorithms to judge which candidates are worth gathering evidence for and which should get less attention and continue through the computation as-is. In contrast, if this were a hard filter, the candidates falling below the threshold would be eliminated from consideration entirely at this point.
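A minimal Python sketch of the soft-filtering idea follows; the threshold and scores are invented for illustration (in Watson the threshold is learned from training data, and the lightweight scorers are real algorithms rather than fixed numbers).

SOFT_FILTER_THRESHOLD = 0.30  # illustrative only; Watson learns this threshold

def soft_filter(candidates):
    # Candidates above the threshold get expensive evidence gathering;
    # the rest stay in the pipeline as-is instead of being discarded.
    promoted, deferred = [], []
    for answer, light_score in candidates:
        (promoted if light_score >= SOFT_FILTER_THRESHOLD else deferred).append(answer)
    return promoted, deferred

candidates = [("Oscar", 0.62), ("Emmy", 0.34), ("Golden Globe", 0.12)]
promoted, deferred = soft_filter(candidates)
print(promoted)  # gather deep evidence for these
print(deferred)  # kept in play, but with no further evidence gathering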
In Hypothesis & Evidence Scoring the candidate answers are first scored independently of any additional evidence by deeper analysis algorithms. This may for example include Typing Algorithms. These are algorithms that produce a score indicating how likely it is that a candidate answer is an instance of the Lexical Answer Type determined in the first step – for example Country, Agent, Character, City, Slogan, Book etc.
Many of these algorithms may fire using different resources and techniques to come up with a score. What is the likelihood that “Washington” for example, refers to a “General” or a “Capital” or a “State” or a “Mountain” or a “Father” or a “Founder”?
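Here is a toy type-coercion sketch in Python. The hand-built type map is fabricated for illustration; real typing algorithms consult many resources (ontologies, corpora, lexical databases) and return graded scores rather than a simple yes/no.

# Illustrative only: a tiny hand-built map of entities to possible types.
TYPE_MAP = {
    "washington": {"general", "capital", "state", "mountain", "father", "founder"},
    "oscar": {"award", "statuette", "name"},
}

def type_score(candidate, lat):
    # Score how likely the candidate is an instance of the Lexical Answer Type.
    types = TYPE_MAP.get(candidate.lower(), set())
    return 1.0 if lat.lower() in types else 0.0  # real scorers are graded, not binary

print(type_score("Washington", "state"))    # 1.0
print(type_score("Washington", "country"))  # 0.0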
For each candidate answer, many pieces of additional evidence are searched for. Each of these pieces of evidence is subjected to further algorithms that deeply analyze the evidentiary passages and score the likelihood that the passage supports or refutes the correctness of the candidate answer. These algorithms may consider variations in grammatical structure, word usage, and meaning.
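A toy passage-scoring sketch in Python follows. It only checks that the candidate appears in the passage and measures shared terms with the question; Watson's evidence scorers analyze grammatical structure, word usage, and meaning rather than raw overlap, so treat this purely as an illustration of the input/output shape of such a scorer.

import re

def passage_support(question, candidate, passage):
    # Toy score of how strongly an evidence passage supports a candidate answer.
    words = lambda s: set(re.findall(r"[a-z]+", s.lower()))
    if candidate.lower() not in passage.lower():
        return 0.0
    shared = words(question) & words(passage)
    return len(shared) / max(len(words(question)), 1)

q = "Jodie Foster took this home for her role in The Silence of the Lambs"
passage = "Jodie Foster won the Oscar for Best Actress for The Silence of the Lambs."
print(round(passage_support(q, "Oscar", passage), 2))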
In the Synthesis step, if the question had been decomposed into sub-parts, one or more synthesis algorithms will fire. They apply methods for inferring a coherent final answer from the constituent elements derived from the question's sub-parts.
Finally, the last step, Final Merging and Ranking, receives many possible answers, each paired with many pieces of evidence, and each of these scored by many algorithms to produce hundreds of feature scores, all giving some evidence for the correctness of each candidate answer.
Trained models are applied to weigh the relative importance of these feature scores. These models are trained with machine learning methods to predict, based on past performance, how best to combine all of these scores to produce a final, single confidence number for each candidate answer and to produce the final ranking of all candidates.
The answer with the strongest confidence would be Watson's final answer, and Watson would try to buzz in provided that the top answer's confidence was above a certain threshold.
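To illustrate final merging and ranking, the Python sketch below uses logistic regression to combine a handful of feature scores into one confidence per candidate and applies a buzz-in threshold. The training rows, feature values, and threshold are all invented; Watson's actual models are trained on thousands of questions and hundreds of features, but the shape of the computation is the same.

import numpy as np
from sklearn.linear_model import LogisticRegression

# Toy training data: rows of feature scores (e.g. type match, passage support,
# source reliability), labeled 1 if the candidate was the correct answer.
X_train = np.array([[0.9, 0.8, 0.7], [0.2, 0.1, 0.4], [0.8, 0.6, 0.9], [0.3, 0.2, 0.1]])
y_train = np.array([1, 0, 1, 0])
model = LogisticRegression().fit(X_train, y_train)

candidates = {"Oscar": [0.85, 0.75, 0.8], "Emmy": [0.25, 0.15, 0.3]}
confidences = {a: model.predict_proba([f])[0, 1] for a, f in candidates.items()}
best, conf = max(confidences.items(), key=lambda kv: kv[1])

BUZZ_THRESHOLD = 0.5  # illustrative value only
print(best, round(conf, 2), "buzz in" if conf > BUZZ_THRESHOLD else "stay silent")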
----
The DeepQA system defers commitments and carries possibilities through the entire process, while searching for increasingly broad contextual evidence and more credible inferences to support the most likely candidate answers.
All the algorithms used to interpret questions, generate candidate answers, score answers, collect evidence, and score evidence are loosely coupled but work holistically by virtue of DeepQA's pervasive machine learning infrastructure.
No one component could realize its impact on end-to-end performance without being integrated and trained with the other components, and they are all evolving simultaneously. In fact, what had a 10% impact on some metric one day might, a month later, contribute only 2% to overall performance due to evolving component algorithms and interactions. This is why the system, as it develops, is regularly trained and retrained.
DeepQA is a complex system architecture designed to extensibly deal with the challenges of natural language processing applications and to adapt to new domains of knowledge.
The Jeopardy! Challenge greatly inspired the design and implementation of the Watson system.
IBM Watson is the very embodiment of the new era of cognitive systems. It represents a new category of solutions that leverages deep content analysis and evidence-based reasoning to accelerate and improve decisions, reduce operational costs, and optimize outcomes. Cognitive Systems offer a whole new way of computing. Keeping pace with the demands of an increasingly complex business environment requires a paradigm shift in what we should expect from IT. We need an approach that recognizes today’s realities and treats them as opportunities rather than challenges.
Main point: At the core of what makes Watson different are three powerful technologies: natural language, hypothesis generation, and evidence-based learning. But Watson is more than the sum of its individual parts. Watson is about bringing these capabilities together in a way that has never been done before, resulting in a fundamental change in the way businesses look at quickly solving problems.
Further speaking points: Looking at these one by one, understanding natural language and the way we speak breaks down the communication barrier that has stood between people and their machines for so long. Hypothesis generation bypasses the historically deterministic way that computers function and recognizes that there are various probabilities of various outcomes rather than a single definitive 'right' response. And adaptation and learning help Watson continuously improve in the same way that humans learn: it keeps track of which of its responses were selected by users and which received positive feedback, thus improving future response generation.
Additional information: The result is a machine that functions alongside us as an assistant rather than something we wrestle with to get an adequate outcome.
This section introduces the Big Data analytics reference model and serves as an introduction to the use case scenarios, which illustrate the various stages of analytics.
Best in Breed Analytics Placed On Top
Fuel all decision-making with powerful analytics & analytic adoption without silos
Analyze all data wherever it lives
Accelerate business value with solutions that leverage all data types, with predictive insight to let you know what has happened, what is happening and what is likely to happen next
Delivering optimized decisions at point of impact through business applications
Empower end business users with the information to deliver the best decision every time
All touchpoints are managed in real time, via the appropriate channel
Feedback loop ensures all decisions are accurate, dynamic
Business rules integrated with analytics and optimization
All of these different technologies come together (“an integrated platform”) to create decision services for the different LOB areas (e.g. marketing)
The depicted Big Data analytics reference model serves as an introduction into this section, and illustrates the key capabilities. In presenting this chart, explain the capabilities in the order listed here:
Heterogeneous data sources
Data transformation and integration layer
Data persistency layer
Business analytics and application layer
Visualization and reporting layer
Infrastructure services
Highlight the message that these capability categories need to be addressed in every project. The focus, however, can vary depending on specific project requirements.
This figure describes the Big Data analytics reference model in a slightly different way and lists the different technical capabilities within the various layers. We are listing essentially the same components as on the previous chart, highlighting the breadth of technical capabilities that each component, or layer, comprises. Point out that not all capabilities need to be included in every project; the concrete subset of technical capabilities is derived from the concrete requirements and set of use cases for an individual project.
This chart contains a product mapping to the Big Data analytics reference model, which has been further customized for CSPs (Communication Service Providers). This chart and the two previous ones also serve as an introduction to the examples that are described in the following section. The presenter should become familiar with all products and tools referenced in this product mapping chart.
This section describes, at a very high level, sample projects for all four analytics areas: descriptive, predictive, prescriptive, and cognitive. All examples and corresponding use case scenarios in this section are derived from real customer engagements in Asia Pacific.
This is an example of a descriptive analytics project in which a telco service provider is interested in competitive analytics based on CDRs (Call Detail Records). Analysis of the CDRs was optionally enhanced with analytics from social media networks. The data sources are depicted on the left side of the chart. The component in the center of the chart comprises the core capabilities derived from BigInsights and BigInsights applications, such as Customer Modeler (an IBM Research asset). Analytical insight is derived from the combined components in this central box. The analytics can optionally be enhanced with SPSS to deliver predictive analytics; in the real customer project, however, this was not part of the use case scenario.
The left side of the chart illustrates the data warehouse and the descriptive BI analytics component, Cognos BI.
This example also illustrates that descriptive analytics is very much a part of Big Data. CDRs are very large in volume and semi-structured, and the combination with non-structured data from social media networking sites in particular makes descriptive analytics a very good Big Data use case scenario. It illustrates the changing role that descriptive analytics plays in Big Data.
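To give a flavor of the descriptive side of this use case, here is a minimal pandas sketch with hypothetical column names and data; in the actual project such aggregations run on BigInsights over very large CDR volumes rather than in memory.

import pandas as pd

# Hypothetical CDR sample: which network a subscriber calls, and for how long.
cdrs = pd.DataFrame({
    "caller_network": ["TelcoA", "TelcoA", "TelcoB", "TelcoA", "TelcoB"],
    "callee_network": ["TelcoB", "TelcoA", "TelcoA", "TelcoC", "TelcoC"],
    "duration_sec":   [120, 45, 300, 60, 15],
})

# Off-net traffic per competitor: how often and how long subscribers call
# numbers on rival networks, a simple descriptive competitive-analytics view.
offnet = cdrs[cdrs["caller_network"] != cdrs["callee_network"]]
summary = offnet.groupby("callee_network")["duration_sec"].agg(["count", "sum", "mean"])
print(summary)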
This sample project is geared towards determining demographic information for unknown pre-paid subscribers. The first main heading on the chart (gain analytical insight for pre-paid demographics) explains the logical flow and the main steps that need to be performed. The second main heading (required data sources) lists the data sources, such as voice and data CDRs, behavioral data, and so forth. This is also a very nice example illustrating that predictive analytics, as well as descriptive analytics, is part of Big Data, i.e., can be seen from a Big Data angle.
The main step in this analytics flow is to predict demographics information for pre-paid subscribers by correlating and mapping post-paid with pre-paid subscribers.
This sample project is further described on the following 2 charts with:
a contextual diagram and
an architecture overview diagram
This chart contains a high-level contextual diagram with the key components, such as the data sources on the left, the cloud-based analytics system that leverages the IBM SmartCloud at IBM Singapore, and the key products on the right of the chart.
The blue figure at the lower right corner is an illustration of the analytics and admin roles and responsibilities that exist in operating the environment.
The yellow figure at the upper left corner illustrates the LoB user using the system and deriving predictive insight.
This chart contains an architecture overview diagram that contains the key components and the component interaction at a high-level.
Public data sources: will be used in the scenario to gain analytical insight and to leverage existing categorization of, for instance, websites that are visited by subscribers.
Post-paid data sources: will be used to understand preferences, interests, websites visited, micro-segmentation, etc. for post-paid subscribers.
Pre-paid data sources: the same data sources will be used for pre-paid subscribers, where the same analytical insight is derived for this category of subscribers.
Post-paid demographics information: will be used and correlated with the analytical insight that is derived from post-paid data sources. This allows a comprehensive view on post-paid subscribers, which includes knowledge on demographics.
The analytics engine, depicted in the centre of the chart, is used to correlate post-paid with pre-paid segments, clusters, behavior, interests, etc., and to map known demographics for post-paid subscribers to corresponding pre-paid subscribers. This allows prediction of demographics for pre-paid subscribers, e.g. age, gender, income, and other demographic measures.
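The Python sketch below captures the core idea of this prediction step under invented, hypothetical features: train a classifier on post-paid subscribers whose demographics are known, then apply it to pre-paid subscribers with the same behavioral features. The real engagement used SPSS and BigInsights assets rather than this toy model.

import pandas as pd
from sklearn.ensemble import RandomForestClassifier

# Hypothetical behavioral features; post-paid demographics are known.
post_paid = pd.DataFrame({
    "avg_daily_calls":  [3, 12, 7, 1, 9],
    "night_data_mb":    [450, 80, 220, 600, 120],
    "sports_sites_pct": [0.10, 0.40, 0.20, 0.05, 0.50],
    "age_band":         ["18-25", "36-50", "26-35", "18-25", "36-50"],
})
pre_paid = pd.DataFrame({
    "avg_daily_calls":  [2, 10],
    "night_data_mb":    [500, 100],
    "sports_sites_pct": [0.08, 0.45],
})

features = ["avg_daily_calls", "night_data_mb", "sports_sites_pct"]
model = RandomForestClassifier(random_state=0).fit(post_paid[features], post_paid["age_band"])
print(model.predict(pre_paid[features]))  # predicted age bands for pre-paid subscribers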
Client Name:
XO Communications
Case study Link:
http://www-01.ibm.com/software/success/cssdb.nsf/CS/STRD-9E4L7Y
Pull Quote:
"We are only just starting to realize the true potential that IBM analytics holds across the business."
—Bill Helmrath, Director of Business Intelligence,
XO Communications
Company Background:
XO Communications is one of the United States’ largest communications service providers, offering a comprehensive portfolio of communications, network and hosted IT services through a 19,000-mile nationwide inter-city network and over 1,000 office locations. Priding itself on superior customer experience, the company is always looking for ways to raise the bar.
Solution components:
Software
• IBM® SPSS® Analytics Catalyst
• IBM SPSS Modeler
• IBM SPSS Modeler Server
• IBM SPSS Statistics
• IBM InfoSphere® BigInsights™
Business challenge:
XO Communications had already taken the first steps in identifying customer retention risks through analytics; now it wanted to seize the opportunity to put these insights into action more effectively.
The benefit:
142 percent estimated reduction in revenue erosion for customers at most risk of churning.
$10 million+ estimated savings per year from increased customer retention and reduced customer service costs
5 months to achieve full return on investment
Link to reference profile: http://w3-01.ibm.com/sales/ssi/cgi-bin/ssialias?infotype=CR&subtype=NA&htmlfid=0CRDD-8C53TV&appname=crmd
Solution synopsis
A global provider of information management and electronic commerce services for financial institutions in the United States anticipates increased revenue and increased competitive edge as it works with IBM Global Technology Services - Integrated Technology Services and IBM Software Services for Information Management to develop a powerful predictive analytics service for small to midsize banks, comprising IBM Power Systems technology and IBM Information Management software.
This chart describes at high-level a sentiment analytics project with ABS-CBN in the Philippines.
The objective of the project was to analyze social media about election candidates and the issues that impact them:
Buzz - candidates, topics, personalities, broadcasters: How much, and what, is being said about the candidates (ongoing and for key "events" like debates, advertisements, etc.), the different shows, and the news anchors? How does this change over time; what is trending?
Sentiment - popular opinion: What do voters like or dislike about the candidates, the parties, campaigns, constituents, etc.? How does this sentiment break down by the different groups (voters, political affiliation, news professionals, demographics, affinity groups, etc.)? Understand brand sentiment: whether ABS-CBN is perceived as unbiased and trusted, and how the different news personalities are perceived (credible, neutral, and fair).
Intent: What is the intent to act (support/vote) for each candidate? What election outcomes can be predicted (shifts in candidate sentiment, voter intent, etc.)? A toy sketch of the buzz and sentiment measures follows below.
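The Python sketch below is a toy illustration of the buzz and sentiment measures, not IBM Social Media Analytics: the posts, candidate names, and sentiment lexicon are all invented, and real sentiment analytics relies on full natural language processing rather than word lists.

from collections import defaultdict

# Tiny illustrative sentiment lexicon (hypothetical).
POSITIVE = {"great", "trusted", "credible", "support", "fair"}
NEGATIVE = {"biased", "corrupt", "dislike", "unfair"}

posts = [
    "Candidate A was great in the debate, very credible",
    "I dislike Candidate B, seems biased",
    "Candidate A has my support",
]

buzz, sentiment = defaultdict(int), defaultdict(int)
for post in posts:
    words = set(post.lower().split())
    for candidate in ("candidate a", "candidate b"):
        if candidate in post.lower():
            buzz[candidate] += 1
            sentiment[candidate] += len(words & POSITIVE) - len(words & NEGATIVE)

print(dict(buzz))       # mention counts ("buzz") per candidate
print(dict(sentiment))  # crude net sentiment per candidate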
Main point: How does Watson work? It's not a simple answer, but Watson solutions are built on a set of repeatable assets that draw from decades of market leadership, research, and best practices. Beginning at the bottom:
Watson solutions are implemented with customers through a full lifecycle: readiness preparation, building the solution itself, teaching Watson about the industry, use case, and data involved, and finally putting it into production, during which it continues to improve through experience and feedback loops.
The basic platform of Watson operation is built on a core of ingestible natural language content, tooling to train and utilize Watson's functionality, proven methods for successful lifecycle operation, algorithms for analytic parsing of language and identification of responses, and APIs that allow other modular functionality to interact with Watson.
Built on this platform of core function is a set of capabilities used across Watson solutions. These include natural language processing capabilities to understand human communication (both from a user interface perspective and, more importantly, as a source of information upon which to draw for evidence-based responses) and machine learning capabilities to learn from experience. Data is the fuel of Watson's engine, and a curated data corpus of structured and unstructured data is where Watson draws evidence for its responses. Watson draws on IBM's leadership in analytics (predictive, business, etc.) to find patterns and relationships invisible to the naked eye. Watson solutions use cloud-based delivery to help scale their reach, optimize utilization of the required infrastructure, and improve accessibility for users. With cloud-based delivery comes mobile accessibility, since processing requirements on the user interface device itself are minimized. And finally, Watson infrastructure is optimized for the unique workloads it requires, yet Watson runs on commercial off-the-shelf IBM p-series hardware.
Drawing upon these capabilities are the Ask, Discover, and Decide services discussed previously.
Actual Watson solutions are developed in close collaboration with industry and domain leaders. IBM has partnered with leaders in healthcare, financial services, and other areas to develop Watson Advisor Solutions that help professionals make better use of available information to improve outcomes. Early brainstorming has led to initial pilots, which have led to full-production Watson Advisor Solutions, which in turn are leading to expansion into new use cases, industries, and domains. The future of Watson and cognitive systems is as bold and compelling as the imagination itself.
This chart elaborates on an IBM research effort to use BigInsights as a platform for massive scale Social Network Analytics (SNA).
Further description of X-RIME can be found here: