random notes on big data
Chen Peng, Jianqiang Wang, Yang Huang
April 19, 2013
What is big data
● Volume: Gigabytes -> Terabytes -> Petabytes.
● Velocity: time sensitive, streaming, real-time.
Jet engine: 20TB/hr
GE: (minds + machines)
● Variety: structured/unstructured.
● Value: insights, analytical systems.
Challenges: collect, store, organize, analyze and share
External
> web sites (blogs/reviews)
> social media (Facebook, LinkedIn,
Google+, Twitter)
> images and videos
> ...
Internal
> transactions
> server logs
> machines and sensors
> emails
> ...
Variety
Value Hierarchy
Raw Data
Normalized
Insight
Recommendation
Transact
Data is now a strategic asset
Technology stack & corresponding firms
Google Cloud Platform
Google App Engine: Scalable application development and execution environment
Google Compute Engine: Virtual machines. Run arbitrary workloads at scale (e.g. Hadoop, scientific computing)
Google Cloud Storage: Storage. Connecting glue between each step of the data pipeline
Google BigQuery: Data analysis. Querying large datasets + third party apps for visualization (e.g. Tableau)
Big data analytics
Analytics is the scientific process of transforming data into insights for making better decisions.
Data -> Insight -> Decision
Data ("resource"): IT logs, cloud, social media, sensors, experiments, etc.
Insight ("product"): statistical & operations research modeling
Decision ("goal"): judgement, constraints, intuition
Predictive analytics extracts information from data and uses it to predict future trends and behavior patterns.
regression models
discrete choice models
time series models
classification models (decision tree, random forest, support vector machine,
neural network, etc.)
clustering models (k-means, density based, graph based, etc.)
association analysis
...
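As a rough illustration of the first family in the list above (regression models), here is a minimal sketch of fitting a straight line by ordinary least squares and using it to predict the next value. The function names `fit_line` and `predict` and the toy numbers are assumptions made up for this example; they are not from the slides.

```python
def fit_line(xs, ys):
    """Fit y = intercept + slope * x by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations of x, and of cross-deviations of x and y.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def predict(model, x):
    """Apply a fitted (intercept, slope) pair to a new x value."""
    intercept, slope = model
    return intercept + slope * x


# Hypothetical toy data: month index vs. units sold (invented for illustration).
months = [1, 2, 3, 4, 5]
units = [12.0, 14.0, 16.0, 18.0, 20.0]

model = fit_line(months, units)
forecast = predict(model, 6)  # extrapolate one month ahead
```

Real data would of course be noisy, and extrapolating a trend is only as good as the assumption that the pattern holds, which echoes the "all patterns are subject to change" caveat.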
Big data analytics
Descriptive Analytics
Predictive Analytics
Prescriptive Analytics
Always keep in mind...
> business objectives are the origin of every data mining solution
> data preparation is more than half of the data mining process
> all patterns are subject to change
> there will always be new knowledge
Always pause and ask yourself:
Does this work relate to the business question we are trying to answer?
Is the original business question still valid?
Industry Use-cases/Application
Healthcare
Utilities
Retail & marketing
Financial services
Telecom
Use cases by industry
Industry applications of big data analytics
Customer acquisition
Predict customers' buying habits in order to promote relevant products at multiple touch points.
http://www.youtube.com/watch?feature=player_embedded&v=3WspJ16Ubhw
Clinical decision support
Experts use predictive analysis in health care primarily to determine which
patients are at risk of developing certain conditions, like diabetes, asthma, heart
disease, and other lifetime illnesses.
Cross sale
Predictive analytics can help analyze customers' spending, usage and other behavior, leading to efficient cross sales, or selling additional products to current customers (beer & diapers).
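The beer-and-diaper story is usually told in terms of association analysis. A minimal sketch of the three standard measures (support, confidence, lift) might look like the following; the baskets are hypothetical toy data invented for illustration, and the function names are our own.

```python
def support(baskets, items):
    """Fraction of baskets that contain every item in `items`."""
    items = set(items)
    return sum(1 for basket in baskets if items <= basket) / len(baskets)


def confidence(baskets, antecedent, consequent):
    """Estimated P(consequent in basket | antecedent in basket)."""
    both = set(antecedent) | set(consequent)
    return support(baskets, both) / support(baskets, antecedent)


def lift(baskets, antecedent, consequent):
    """Confidence relative to the consequent's baseline support.

    A lift above 1 suggests the items co-occur more often than
    they would if they were bought independently.
    """
    return confidence(baskets, antecedent, consequent) / support(baskets, consequent)


# Hypothetical shopping baskets (made up for this sketch).
baskets = [
    {"beer", "diapers", "chips"},
    {"beer", "diapers"},
    {"diapers", "milk"},
    {"beer", "chips"},
    {"milk", "bread"},
]

rule_confidence = confidence(baskets, {"diapers"}, {"beer"})
rule_lift = lift(baskets, {"diapers"}, {"beer"})
```

With only five baskets these numbers mean little; the point is only to show how a cross-sale candidate would be scored.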
Ads targeting
http://www.slideshare.net/dennyglee/yahoo-tao-case-study-excerpt
Fraud detection
A predictive model can help weed out the "bads" and reduce a business's
exposure to fraud.
Image and Speech Recognition
http://static.googleusercontent.com/external_content/untrusted_dlcp/research.google.com/en/us/people/jeff/MIT_BigData_Sep2012.pdf
Operations
Jet Engine + Humans
http://www.youtube.com/watch?v=JHc4ZTTWKrQ
Industry applications of big data
analytics
Amazon warehouse operational efficiency: http://www.youtube.com/watch?v=Kafs9tZskuo
Beer and diaper
What are those startups doing?
Bloomreach
http://www.youtube.com/watch?feature=player_embedded&v=K12awAj4tW8
Datastax
http://www.nytimes.com/2013/02/25/business/media/for-house-of-cards-using-big-data-to-guarantee-its-popularity.html?pagewanted=all
Paraccel
http://www.paraccel.com/solutions/paraccel-solutions-big-data.php#.UXG207WG3Ct
Kaggle
http://www.kaggle.com/c/acm-sf-chapter-hackathon-big
VC funding for "Big Data"
Data from 71 start-ups. Funding is counted starting from 2004.
VC Funding Activity
Data from 71 start-ups. Funding is counted starting from 2004.
Interesting view points
"Special (domain) knowledge becomes less relevant; organizations should focus on collecting people who know how to extract value and insights from data."
"In God we trust. All others must bring data."
"The usefulness of a variable in a model is inversely related to the time you spend creating it."
"Noise is convex but information is concave."
"Big data is sexy but small data is beautiful."
[Figure: noise and information curves plotted against data size]
Interesting view points
"All models are wrong, but some are useful."
"Big data is like teenage sex: everyone talks about it, nobody really knows how to do it; everyone thinks everyone else is doing it, so they claim they are doing it."
"Statistics: The Art and Science of Learning from Data"
The danger of big data
Open discussion
Potential opportunities / challenges for entrepreneurs?
- visualization
- internet of things
- analytics as a service (a³s)
Standardization vs. customization
Human and data interaction
- data vs. intuition
Back-Up Slides
Data Science vs. OR
[Figure: chart with axes "Risk" and "Measurable of Objective", positioning risk management, strategic planning, predictive analytics, and optimization]
skill sets of data scientists
Big data types
● Web & social media: clickstream, web content, Amazon reviews, Facebook postings & 'like'...
● M2M: smart meters, oil rig sensor readings, GPS signals...
● Transaction: retail store, healthcare claims, utility billing...
● Biometrics: fingerprint, face, voice, handwriting...
● Human-generated data: call logs, emails, surveys...
Web & social media
● Transaction: orders, revenue, ...
● Conversion: click thru, convert to purchase,...
● Session: length, bounce rate
● Lifetime value: repeat, frequency,...
● Social interaction: intensity, influence,...
Shopping cart analysis
CTR prediction
Personalization
Retention/customer churn
A/B testing
Targeted ads
Lifetime value
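For the A/B testing item above, one common way to compare conversion rates between two variants is a two-proportion z-test. The sketch below assumes that approach (the slides do not say which test is meant) and uses invented visitor and conversion counts.

```python
import math


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Two-sided z-test for a difference in conversion rate (B minus A)."""
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    # Pooled rate under the null hypothesis that both variants convert equally.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / std_err
    # Two-sided p-value: 2 * (1 - Phi(|z|)) equals erfc(|z| / sqrt(2)).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical experiment: 1000 visitors per variant (numbers are made up).
z, p_value = two_proportion_z_test(conv_a=100, n_a=1000, conv_b=130, n_b=1000)
```

A small p-value suggests the difference is unlikely to be chance alone, but it does not say whether the lift matters for the business question being asked.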
Interesting data visualization projects
wind map
http://hint.fm/wind/gallery/oct-30.js.html
Some analytical problems people deal with at Google ...
● search ranking
Processing Pipeline
log, sensor, web, ... -> Hadoop MapReduce -> Structured Data -> SQL, Hive, Dremel ... -> Analytics
Note: Hadoop is an open-source software framework that supports data-intensive distributed applications, licensed under the Apache v2 license. It supports running applications on large clusters of commodity hardware. It was inspired by Google's MapReduce and further developed and promoted by Yahoo.
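To make the MapReduce step of the pipeline concrete, here is a single-process sketch of the map, shuffle and reduce phases that turns raw log lines into structured counts. It only imitates the programming model in plain Python; it does not use the Hadoop API, and the function names and sample log lines are assumptions for illustration.

```python
from collections import defaultdict


def map_phase(record):
    """Map: emit a (token, 1) pair for every token in one log line."""
    for token in record.split():
        yield token.lower(), 1


def shuffle(pairs):
    """Shuffle: group all emitted values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: collapse the values for one key into a single count."""
    return key, sum(values)


def run_job(records):
    """Run map, shuffle and reduce in sequence on an iterable of records."""
    pairs = (pair for record in records for pair in map_phase(record))
    grouped = shuffle(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


# Hypothetical web server log lines (made up for this sketch).
logs = ["GET /home", "GET /cart", "POST /cart"]
counts = run_job(logs)
```

In a real cluster the map and reduce functions run in parallel on many machines and the framework handles the shuffle; the resulting table of counts is the kind of structured data that SQL-style tools can then query.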
Big Data
Cloud
Computing
http://www.forbes.com/sites/davefeinleib/2012/06/19/the-big-data-landscape/
How big is big?
When your data set becomes so large that you have to start innovating around how to collect, store, organize, analyze and share it ...
External
> web sites (blogs/reviews)
> social media (Facebook,
LinkedIn, Google+, Twitter)
> images and videos
> ...
Internal
> transactions
> server logs
> machines and sensors
> emails
> ...
Data types: Web & Social media, M2M, Transaction, Biometrics, Human-generated
Health care: Sentiment analysis, Patient monitoring, Genetic Testing, Electronic Medical Records
Utilities: Smart Meters
Retail: Loyalty programs, RFID tags, Recommendation, market basket, Face recognition
Telcos: Customer churn, Location-based
IT: Machine log
Example of semantic graph
Call Data Record
What is Hadoop

Spearman's correlation,Formula,Advantages,Spearman's correlation,Formula,Advantages,
Spearman's correlation,Formula,Advantages,
 
Plagiarism,forms,understand about plagiarism,avoid plagiarism,key significanc...
Plagiarism,forms,understand about plagiarism,avoid plagiarism,key significanc...Plagiarism,forms,understand about plagiarism,avoid plagiarism,key significanc...
Plagiarism,forms,understand about plagiarism,avoid plagiarism,key significanc...
 
How to Uninstall a Module in Odoo 17 Using Command Line
How to Uninstall a Module in Odoo 17 Using Command LineHow to Uninstall a Module in Odoo 17 Using Command Line
How to Uninstall a Module in Odoo 17 Using Command Line
 
4.9.24 School Desegregation in Boston.pptx
4.9.24 School Desegregation in Boston.pptx4.9.24 School Desegregation in Boston.pptx
4.9.24 School Desegregation in Boston.pptx
 
Objectives n learning outcoms - MD 20240404.pptx
Objectives n learning outcoms - MD 20240404.pptxObjectives n learning outcoms - MD 20240404.pptx
Objectives n learning outcoms - MD 20240404.pptx
 
CARNAVAL COM MAGIA E EUFORIA _
CARNAVAL COM MAGIA E EUFORIA            _CARNAVAL COM MAGIA E EUFORIA            _
CARNAVAL COM MAGIA E EUFORIA _
 
Unraveling Hypertext_ Analyzing Postmodern Elements in Literature.pptx
Unraveling Hypertext_ Analyzing  Postmodern Elements in  Literature.pptxUnraveling Hypertext_ Analyzing  Postmodern Elements in  Literature.pptx
Unraveling Hypertext_ Analyzing Postmodern Elements in Literature.pptx
 
Man or Manufactured_ Redefining Humanity Through Biopunk Narratives.pptx
Man or Manufactured_ Redefining Humanity Through Biopunk Narratives.pptxMan or Manufactured_ Redefining Humanity Through Biopunk Narratives.pptx
Man or Manufactured_ Redefining Humanity Through Biopunk Narratives.pptx
 
Shark introduction Morphology and its behaviour characteristics
Shark introduction Morphology and its behaviour characteristicsShark introduction Morphology and its behaviour characteristics
Shark introduction Morphology and its behaviour characteristics
 

Big Data Analytics and Applications

  • 1. Random notes on big data. Chen Peng, Jianqiang Wang, Yang Huang. April 19, 2013
  • 2. What is big data
  • 3. ● Volume: gigabytes -> terabytes -> petabytes. ● Velocity: time-sensitive, streaming, real-time. Jet engine: 20 TB/hr. GE: (minds + machines). ● Variety: structured/unstructured. ● Value: insights, analytical systems.
  • 4. Variety. Challenges: collect, store, organize, analyze and share. External: > web sites (blogs/reviews) > social media (Facebook, LinkedIn, Google+, Twitter) > images and videos > ... Internal: > transactions > server logs > machines and sensors > emails > ...
  • 6. Technology stack & corresponding firms
  • 7. Google Cloud Platform. Google App Engine: scalable application development and execution environment. Google Compute Engine: virtual machines; run arbitrary workloads at scale (e.g. Hadoop, scientific computing). Google Cloud Storage: storage; connecting glue between each step of the data pipeline. Google BigQuery: data analysis; querying large datasets + third-party apps for visualization (e.g. Tableau).
  • 8. Big data analytics. Analytics is the scientific process of transforming data into insights for making better decisions. Data ("resource"): IT logs, cloud, social media, sensors, experiments, etc. -> Insight ("product"): statistical & operations research modeling. -> Decision ("goal"): judgement, constraints, intuition.
  • 9. Big data analytics: Descriptive Analytics, Predictive Analytics, Prescriptive Analytics. Predictive analytics extracts information from data and uses it to predict future trends and behavior patterns. Model families: regression models; discrete choice models; time series models; classification models (decision tree, random forest, support vector machine, neural network, etc.); clustering models (k-means, density based, graph based, etc.); association analysis; ...
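To make one of the clustering models named on this slide concrete, here is a minimal sketch of k-means (Lloyd's algorithm) in plain Python. The function names, the toy points, and the starting centroids are assumptions made for illustration; they do not come from the slides, and a real project would more likely use an established library.

```python
from math import dist  # Euclidean distance, available in Python 3.8+


def assign(points, centroids):
    """Group each point with the index of its nearest centroid."""
    clusters = [[] for _ in centroids]
    for p in points:
        nearest = min(range(len(centroids)), key=lambda i: dist(p, centroids[i]))
        clusters[nearest].append(p)
    return clusters


def kmeans(points, centroids, max_iterations=100):
    """Alternate assignment and mean-update steps until centroids stop moving."""
    centroids = [tuple(c) for c in centroids]
    for _ in range(max_iterations):
        clusters = assign(points, centroids)
        updated = []
        for old, members in zip(centroids, clusters):
            if members:
                # New centroid is the coordinate-wise mean of the members.
                updated.append(tuple(sum(axis) / len(members) for axis in zip(*members)))
            else:
                # Keep an empty cluster's centroid where it was.
                updated.append(old)
        if updated == centroids:
            break
        centroids = updated
    # Final assignment against the centroids that are returned.
    return centroids, assign(points, centroids)


# Hypothetical toy data: two visually separated groups of 2-D points.
toy_points = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5), (8.0, 8.0), (9.0, 8.5), (8.5, 9.0)]
toy_start = [(0.0, 0.0), (10.0, 10.0)]

if __name__ == "__main__":
    final_centroids, final_clusters = kmeans(toy_points, toy_start)
    print(final_centroids)
    print(final_clusters)
```

The starting centroids are passed in explicitly so the run is repeatable; random initialization is the more common choice and can converge to different clusters.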
  • 10. Always keep in mind... > business objectives are the origin of every data mining solution > data preparation is more than half of the data mining process > all patterns are subject to change > there will always be new knowledge Always pause and ask yourself: Does this work relate to the business question we try to answer? Is the original business question still valid?
  • 12. Industry applications of big data analytics. Customer acquisition: predict customers' buying habits in order to promote relevant products at multiple touch points. http://www.youtube.com/watch?feature=player_embedded&v=3WspJ16Ubhw Clinical decision support: experts use predictive analysis in health care primarily to determine which patients are at risk of developing certain conditions, like diabetes, asthma, heart disease, and other lifetime illnesses. Cross sale: predictive analytics can help analyze customers' spending, usage and other behavior, leading to efficient cross sales, i.e. selling additional products to current customers (beer & diaper). Ads targeting: http://www.slideshare.net/dennyglee/yahoo-tao-case-study-excerpt
  • 13. Industry applications of big data analytics. Fraud detection: a predictive model can help weed out the "bads" and reduce a business's exposure to fraud. Image and speech recognition: http://static.googleusercontent.com/external_content/untrusted_dlcp/research.google.com/en/us/people/jeff/MIT_BigData_Sep2012.pdf Operations: Jet Engine + Humans http://www.youtube.com/watch?v=JHc4ZTTWKrQ Amazon warehouse operational efficiency: http://www.youtube.com/watch?v=Kafs9tZskuo
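The "beer & diaper" cross-sale example is usually framed as association analysis. The sketch below computes the three standard rule metrics (support, confidence, lift) for a rule such as diapers -> beer. The baskets are invented toy data and the function names are hypothetical; nothing here is taken from the deck.

```python
def support(baskets, items):
    """Fraction of baskets that contain every item in `items`."""
    items = set(items)
    return sum(1 for basket in baskets if items <= basket) / len(baskets)


def rule_metrics(baskets, antecedent, consequent):
    """Support, confidence, and lift for the rule antecedent -> consequent."""
    both = support(baskets, set(antecedent) | set(consequent))
    confidence = both / support(baskets, antecedent)
    lift = confidence / support(baskets, consequent)
    return {"support": both, "confidence": confidence, "lift": lift}


# Invented toy transactions, for illustration only.
toy_baskets = [
    {"beer", "diapers", "chips"},
    {"diapers", "milk"},
    {"beer", "diapers"},
    {"milk", "bread"},
    {"beer", "chips"},
]

if __name__ == "__main__":
    print(rule_metrics(toy_baskets, {"diapers"}, {"beer"}))
```

A lift above 1 suggests the two items appear together more often than independence would predict; with so few baskets the number carries no statistical weight.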
  • 16. What are those startups doing? Bloomreach: http://www.youtube.com/watch?feature=player_embedded&v=K12awAj4tW8 Datastax: http://www.nytimes.com/2013/02/25/business/media/for-house-of-cards-using-big-data-to-guarantee-its-popularity.html?pagewanted=all Paraccel: http://www.paraccel.com/solutions/paraccel-solutions-big-data.php#.UXG207WG3Ct Kaggle: http://www.kaggle.com/c/acm-sf-chapter-hackathon-big
  • 17. VC funding for "Big Data" Data from 71 start-ups. Funding is counted starting from 2004.
  • 18. VC Funding Activity Data from 71 start-ups. Funding is counted starting from 2004.
  • 19. Interesting view points "Special (domain) knowledge becomes less relevant; organizations should focus on collecting people who know how to extract value and insights from data." "In god we trust. All others must bring data." "The usefulness of a variable in a model is inversely related to the time you spend creating it." "Noise is convex but information is concave." "Big data is sexy but small data is beautiful." (Figure: noise and information vs. data size.)
  • 20. Interesting view points "All models are wrong, but some are useful." "Big data is like teenage sex: everyone talks about it, nobody really knows how to do it; everyone thinks everyone else is doing it, so they claim they are doing it." "Statistics: The Art and Science of Learning from Data"
  • 21. The danger of big data
  • 22. Open discussion. Potential opportunities / challenges for entrepreneurs? - visualization - internet of things - analytics as a service (a3 s). Standardization vs. customization. Human and data interaction - data vs. intuition
  • 24. Data Science vs. OR: risk management, strategic planning, predictive analytics, optimization. Risk Measurable of Objective. Skill sets of data scientists
  • 26. Big data types ● Web & social media: clickstream, web content, Amazon reviews, Facebook postings & 'like'... ● M2M: smart meters, oil rig sensor reading, GPS signals... ● Transaction: retail store, healthcare claims, utility billing... ● Biometrics: fingerprint, face, voice, handwriting... ● Human-generated data: call logs, emails, surveys...
  • 27. Web & social media ● Transaction: orders, revenue, ... ● Conversion: click thru, convert to purchase, ... ● Session: length, bounce rate ● Lifetime value: repeat, frequency, ... ● Social interaction: intensity, influence, ... Shopping cart analysis; CTR prediction; personalization; retention/customer churn; A/B testing; targeted ads; lifetime value
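As a rough illustration of how the click-through metric and A/B testing from this slide fit together, the sketch below compares the CTR of two variants with a pooled two-proportion z-test. The slides do not say which test to use, so the choice of test, the function names, and the counts are all assumptions.

```python
from math import erf, sqrt


def ctr(clicks, impressions):
    """Click-through rate: clicks divided by impressions."""
    return clicks / impressions


def two_proportion_z_test(clicks_a, n_a, clicks_b, n_b):
    """Return (z, two-sided p-value) for the difference in CTR, B minus A."""
    p_a, p_b = ctr(clicks_a, n_a), ctr(clicks_b, n_b)
    pooled = (clicks_a + clicks_b) / (n_a + n_b)
    std_err = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / std_err
    # Standard normal CDF written with the error function.
    cdf = 0.5 * (1 + erf(abs(z) / sqrt(2)))
    return z, 2 * (1 - cdf)


if __name__ == "__main__":
    # Made-up counts: variant A gets 100 clicks, variant B gets 130,
    # each out of 1000 impressions.
    z, p_value = two_proportion_z_test(100, 1000, 130, 1000)
    print(f"z = {z:.3f}, p = {p_value:.4f}")
```

The normal approximation behind this test is only reasonable when each variant has a fair number of clicks and non-clicks; for small samples an exact test is the safer choice.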
  • 28. Interesting data visualization projects wind map http://hint.fm/wind/gallery/oct-30.js.html
  • 29. Some analytical problems people deal with at Google ... ● search ranking
  • 30. Processing Pipeline: log, sensor, web, ... -> Hadoop MapReduce -> Structured Data -> SQL, HIVE, Dremel, ... -> Analytics. Big Data, Cloud Computing. Note: Hadoop is an open-source software framework that supports data-intensive distributed applications, licensed under the Apache v2 license. It supports the running of applications on large clusters of commodity hardware. Originated from Google MapReduce and further developed/promoted by Yahoo. http://www.forbes.com/sites/davefeinleib/2012/06/19/the-big-data-landscape/
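The MapReduce step in this pipeline can be pictured with a single-process sketch of its three phases (map, shuffle, reduce), using word counting as the task. This is not the Hadoop API: the function names are hypothetical, and it runs in memory on one machine, whereas Hadoop distributes the same phases across a cluster.

```python
from collections import defaultdict


def mapper(line):
    """Map phase: emit a (word, 1) pair for every word in one input record."""
    for word in line.lower().split():
        yield word, 1


def shuffle(pairs):
    """Shuffle phase: group all emitted values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reducer(key, values):
    """Reduce phase: collapse the values for one key into a single count."""
    return key, sum(values)


def map_reduce(lines):
    """Run the three phases in sequence over an iterable of text lines."""
    pairs = (pair for line in lines for pair in mapper(line))
    return dict(reducer(key, values) for key, values in shuffle(pairs).items())


if __name__ == "__main__":
    # Invented sample records standing in for raw log or web text.
    print(map_reduce(["big data", "Big insights", "data pipeline"]))
```

Because each mapper call looks at one record and each reducer call looks at one key, both phases can run in parallel, which is the property that lets the real framework scale across commodity hardware.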
  • 31. How big is big? When your data set becomes so large that you have to start innovating around how to collect, store, organize, analyze and share it ... External > web sites (blogs/reviews) > social media (Facebook, LinkedIn, Google+, Twitter) > images and videos > ... Internal > transactions > server logs > machines and sensors > emails > ...
  • 32. Health care: Sentiment analysis, Patient monitoring, Genetic Testing, Electronic Medical Records. Utilities: Smart Meters. Retail: Loyalty programs, RFID tags, Recommendation, market basket, Face recognition. Telcos: Customer churn, Location-based. IT: Machine log. Data types: Web & Social media, M2M, Transaction, Biometrics, Human-generated