Text Mining of Social Network Data
for Business Applications
ANKIT SHARMA, DATA SCIENCE PRACTICES, IMPETUS
Content
• Data
• Unstructured data as business opportunity
• Text mining
• Learning from textual data
• Social media
• Learning from social media
• Sentiment analysis and opinion mining
• Topic modeling
• Tools for text mining
• Use Cases
• Hotel review demo
• Advertising campaign analysis
• Data Science Practices at Impetus
Thursday, August 7, 2014 DATA SCIENCE PRACTICES, IMPETUS - INDIA 2
Data
• Structured data: Tables, Records
• Semi-structured data: XML, JSON
• Unstructured data: Text, Audio, Video, conversations, Web, Wikis, Documents, Web logs…
• Social Media data: Tweets, Blogs, Facebook, other social platforms
Business opportunity
• Customer Interaction
• Marketing performance
• Customer Insight
• Better understanding of Customers and their behavior
• Measure marketing efforts
• Personalized performance for Customers
Unstructured data as business opportunity
• “Unstructured” data, such as natural language, is distinguished from the “structured” information found in conventional spreadsheets and databases.
• Unstructured data constitutes 80% of all enterprise data (Gartner Research)
• Unstructured text can contain business-critical information, untapped opportunities and latent risks
• Example:
◦ Consumers’ thoughts and opinions, found in communications such as emails, web pages, reports, surveys, contracts, blogs, and wikis.
• Whether it is customer complaints, employee feedback, analyst opinions, or competitors' intentions, this valuable and actionable information lies hidden in unstructured text repositories
Text mining
Text mining integrates innovative text analytics approaches, tools and solutions to leverage unstructured data.
Typical text mining tasks include:
• Text categorization
• Text clustering
• Concept/entity extraction
• Production of granular taxonomies
• Sentiment analysis
• Document summarization
• Entity relation modeling
For a company, the successful management of unstructured information may lead to more profitable decisions and business opportunities.
Learning from text mining
• Classification: Spam detection, Document organization
• Clustering: Trend analysis, Topic identification
• Web mining: Trend analysis, Ontology creation, Opinion mining
• Natural Language Processing: Text summarization, Question answering, Information extraction
Logical view of documents
Sentiment Analysis and Opinion Mining
• Opinion mining, sentiment analysis, and subjectivity analysis are the computational analysis of opinion, sentiment, and subjectivity in online text
• Subjectivity analysis, or subjectivity classification, automatically separates opinion-bearing text from objective text that presents factual information
• Sentiment analysis originated from machine learning (ML), information retrieval (IR) and natural language processing (NLP)
• Opinion mining originated from the Web search and IR community and involved processing search results for a given product, retrieving attributes and aggregating users’ opinions
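The two steps described on this slide, subjectivity classification followed by sentiment polarity, can be sketched roughly as follows. This is a minimal illustration, not the method used in the talk: the word lists and the function names (`tokenize`, `is_subjective`, `polarity`, `classify`) are hypothetical, and real systems typically rely on much larger lexicons or trained classifiers.

```python
import re

# Hypothetical, tiny opinion lexicons for illustration only.
POSITIVE = {"good", "great", "excellent", "love", "friendly"}
NEGATIVE = {"bad", "poor", "terrible", "hate", "rude"}


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def is_subjective(text):
    """Subjectivity classification: does the text contain any opinion word?"""
    return any(tok in POSITIVE or tok in NEGATIVE for tok in tokenize(text))


def polarity(text):
    """Sentiment analysis: positive-word count minus negative-word count."""
    tokens = tokenize(text)
    return sum(t in POSITIVE for t in tokens) - sum(t in NEGATIVE for t in tokens)


def classify(text):
    """Label text as objective, positive, negative, or neutral (mixed)."""
    if not is_subjective(text):
        return "objective"
    score = polarity(text)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"
```

A factual sentence with no lexicon words is labeled objective, while a sentence that mixes equal numbers of positive and negative words is labeled neutral.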
Topic Modeling
• A topic model is a type of statistical model for discovering the abstract "topics" that occur in a collection of documents
• Topic models are a suite of algorithms that uncover the hidden thematic structure in document collections
• Identification of emerging topics in communities, trending topics in social media, and hot topics in online discussion may be critical for businesses
• LDA (Latent Dirichlet Allocation) is a generative model that allows sets of observations to be explained by unobserved groups that explain why some parts of the data are similar. It was developed by David Blei, Andrew Ng, and Michael Jordan in 2003.
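To make the LDA idea concrete, here is a minimal sketch of fitting it by collapsed Gibbs sampling, one common inference method (the original 2003 paper used variational inference instead). The function name `lda_gibbs` and the default hyperparameters `alpha` and `beta` are assumptions made for this illustration; the slides do not say which implementation was used.

```python
import random


def lda_gibbs(docs, num_topics, iterations=200, alpha=0.1, beta=0.01, seed=0):
    """Fit LDA by collapsed Gibbs sampling.

    docs is a list of token lists. Returns (doc_topic, topic_word, vocab):
    doc_topic[d][k] is the share of topic k in document d, and
    topic_word[k][w] is the probability of vocab[w] under topic k.
    """
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    word_id = {w: i for i, w in enumerate(vocab)}
    V = len(vocab)

    # Count tables used by the sampler.
    n_dk = [[0] * num_topics for _ in docs]      # document-topic counts
    n_kw = [[0] * V for _ in range(num_topics)]  # topic-word counts
    n_k = [0] * num_topics                       # tokens per topic

    # Randomly assign an initial topic to every token.
    z = []
    for d, doc in enumerate(docs):
        z_d = []
        for w in doc:
            k = rng.randrange(num_topics)
            z_d.append(k)
            n_dk[d][k] += 1
            n_kw[k][word_id[w]] += 1
            n_k[k] += 1
        z.append(z_d)

    for _ in range(iterations):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                wid, k = word_id[w], z[d][i]
                # Remove the token's current assignment from the counts.
                n_dk[d][k] -= 1
                n_kw[k][wid] -= 1
                n_k[k] -= 1
                # p(z = t | rest) is proportional to
                # (n_dk + alpha) * (n_kw + beta) / (n_k + V * beta).
                weights = [
                    (n_dk[d][t] + alpha) * (n_kw[t][wid] + beta) / (n_k[t] + V * beta)
                    for t in range(num_topics)
                ]
                k = rng.choices(range(num_topics), weights=weights)[0]
                z[d][i] = k
                n_dk[d][k] += 1
                n_kw[k][wid] += 1
                n_k[k] += 1

    # Smoothed estimates of the document-topic and topic-word distributions.
    doc_topic = [
        [(n_dk[d][k] + alpha) / (len(doc) + num_topics * alpha) for k in range(num_topics)]
        for d, doc in enumerate(docs)
    ]
    topic_word = [
        [(n_kw[k][w] + beta) / (n_k[k] + V * beta) for w in range(V)]
        for k in range(num_topics)
    ]
    return doc_topic, topic_word, vocab
```

Calling `lda_gibbs(docs, num_topics=2)` on a list of tokenized documents returns, for each document, a topic mix that sums to 1. Topic indices are arbitrary, so topics have to be interpreted by inspecting their most probable words.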
Topic Modeling (contd.)
“I listen to Motorhead, Pink Floyd and Metallica whenever I’m travelling in my car.”
A topic model might describe this text as 75% about music and 25% about cars.
"dog" and "bone" will appear more often in documents about dogs, "cat" and "meow" will appear more often in documents about cats, and "the" and "is" will appear about equally in both.
Types of analysis LDA can perform:
◦ Topic identification
◦ Which topics are similar?
◦ Which documents are similar, based on topic allocations?
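The last question, which documents are similar based on their topic allocations, can be illustrated by comparing topic-proportion vectors. The sketch below uses cosine similarity, which is one reasonable choice among several; the document names and topic mixes are made up for the example, apart from the 75% / 25% split quoted on the slide.

```python
import math


def cosine_similarity(p, q):
    """Cosine similarity between two equal-length topic-proportion vectors."""
    dot = sum(a * b for a, b in zip(p, q))
    norm = math.sqrt(sum(a * a for a in p)) * math.sqrt(sum(b * b for b in q))
    return dot / norm if norm else 0.0


def most_similar(doc_topics, query):
    """Rank the other documents by similarity of topic mix to `query`."""
    scores = [
        (name, cosine_similarity(doc_topics[query], mix))
        for name, mix in doc_topics.items()
        if name != query
    ]
    return sorted(scores, key=lambda item: item[1], reverse=True)


# Hypothetical topic allocations over (music, cars); the first row is the
# 75% / 25% example sentence from the slide.
doc_topics = {
    "car_playlist": [0.75, 0.25],
    "concert_review": [0.95, 0.05],
    "engine_manual": [0.05, 0.95],
}
ranking = most_similar(doc_topics, "car_playlist")
```

Here the music-heavy document ranks closest to the example sentence, and the car-heavy document ranks last.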
Social Media based Sentiment Analysis
Data from Internet and Web 2.0
• Buzz Monitoring
• Sentiment Analysis
• Content Categorization
Trends Detection and Recommendation
• Brand Image Monitoring
• Sentiment trends in customer comments
• Discovering undercurrents & recommend adjustments
• Overall vs. Service Attributes based Sentiment Extraction
• Real-time monitoring of consumer perceptions
• Identification of Data sources (Twitter / Facebook / Discussion boards)
• Collection of consumer expressed text
Solution framework
• Data from Internet and Web 2.0
• Overall / Attribute level Sentiment Analysis
• Trends Detection and Recommendation
• Service Attributes Identification
• Buzz Monitoring and Summary Report
• Content Categorization
• Sentiment Trends Summary
• Classified Content/Topics*
* Not included in this work
Tools for Text mining
Use Case
- HOTEL REVIEW
- ADVERTISING CAMPAIGN
Hotel Review demo
Objective:
• Analyze hotel review text data
• Calculate each hotel’s rating from sentiment analysis of its reviews
• Visualize the data on maps in a web-based platform with features like zooming, clicking and hover
• Design a web-based user interface for larger data
Data:
254,574 Reviews
2,560 Hotels
6 Countries: UAE, CANADA, CHINA, INDIA, UK, USA
10 Cities: Beijing, Dubai, England, Illinois, Montreal, Nevada, New Delhi, New York City, Quebec, San Francisco, Shanghai
Data Science:
• LSI (Latent Semantic Indexing)
• Sentiment Analysis
• Part of speech tagging
• Feature extraction
• Feature based opinion mining
Open search: Apache Solr
Database: Apache Cassandra
Maps: Google Maps API
Sentiment score for each hotel, based on sentiment analysis of its reviews: the polarity of each review is calculated from its positive and negative words.
Hotel feature based opinion mining for the following features: Food, Room, Location, Service, Price, etc.
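A rough sketch of how such a hotel score and feature-level opinions could be computed is shown below. It is an illustration under stated assumptions, not the demo's actual pipeline: the word lists are hypothetical, the polarity formula `(pos - neg) / (pos + neg)` is one common convention, and features are matched by plain keyword lookup per sentence rather than by the part-of-speech tagging listed above.

```python
import re
from collections import defaultdict

# Hypothetical word lists; a real system would use a full opinion lexicon.
POSITIVE = {"good", "great", "clean", "friendly", "tasty", "cheap"}
NEGATIVE = {"bad", "dirty", "rude", "noisy", "bland", "expensive"}
FEATURES = ("food", "room", "location", "service", "price")


def polarity(text):
    """Return (pos - neg) / (pos + neg) in [-1, 1], or 0.0 with no opinion words."""
    tokens = re.findall(r"[a-z]+", text.lower())
    pos = sum(t in POSITIVE for t in tokens)
    neg = sum(t in NEGATIVE for t in tokens)
    return (pos - neg) / (pos + neg) if pos + neg else 0.0


def hotel_score(reviews):
    """Average review polarity for one hotel."""
    return sum(polarity(r) for r in reviews) / len(reviews) if reviews else 0.0


def feature_opinions(reviews):
    """Average polarity of the sentences that mention each feature word."""
    per_feature = defaultdict(list)
    for review in reviews:
        for sentence in re.split(r"[.!?]+", review):
            tokens = set(re.findall(r"[a-z]+", sentence.lower()))
            for feature in FEATURES:
                if feature in tokens:
                    per_feature[feature].append(polarity(sentence))
    return {f: sum(v) / len(v) for f, v in per_feature.items()}


# Made-up reviews for a single hotel.
reviews = [
    "The room was clean. The food was bland.",
    "Great location and friendly service!",
]
```

Scoring at sentence level lets one review contribute a positive opinion about the room and a negative one about the food, which a single review-level score would average away.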
Advertising campaign buzz monitoring
• Social Media Monitoring and Analysis for an advertising campaign
• Feature-level Buzz Summary for Company name, Campaign and other hidden features
• Blogs Analysis for campaign name
Data collected from:
• Tweets for 1 month
• 4000+ Tweets (including 1440 re-tweets)
• 43 Blogs and comments were crawled and analyzed
Features
Results for blogs
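[Figure: three pie charts of sentiment share (legend: Negative, Positive, Neutral). Values shown: Blogs 83%, 17%, 0%; Comments 33%, 43%, 24%; Blogs & Comments 39%, 40%, 21%.]
[Figure: bar chart "Feature-level Buzz Summary for 3 features"; y-axis: Number of Blogs (0 to 18); series: Neutral, Positive, Negative.]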
Insights
• The ad campaign has an overall negative sentiment associated with it
• People used hashtags such as #bad and #worstadsever to express negative sentiment
• The attribute/feature “********” was also viewed negatively by users
• There was a spike in associated tweets on one particular day, due to frequent advertisements
• This analysis is based on only 30 days of tweets
• In the long run, more visible trends can be monitored
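The hashtag observation above suggests a simple signal that can be computed directly from tweet text. The following sketch counts hashtags and measures how many tweets carry a negative one. The set `NEGATIVE_HASHTAGS` and the function names are hypothetical, seeded only with the two tags quoted on the slide; this is not the campaign analysis itself.

```python
import re
from collections import Counter

# Hypothetical list of hashtags treated as negative markers.
NEGATIVE_HASHTAGS = {"#bad", "#worstadsever"}


def hashtags(tweet):
    """Extract lowercase hashtags from a tweet."""
    return [tag.lower() for tag in re.findall(r"#\w+", tweet)]


def negative_share(tweets):
    """Fraction of tweets that carry at least one negative hashtag."""
    if not tweets:
        return 0.0
    flagged = sum(any(t in NEGATIVE_HASHTAGS for t in hashtags(tw)) for tw in tweets)
    return flagged / len(tweets)


def top_hashtags(tweets, n=3):
    """Most frequent hashtags across all tweets."""
    return Counter(tag for tw in tweets for tag in hashtags(tw)).most_common(n)
```

Tracking `negative_share` per day over a longer period would be one way to follow the longer-run trends mentioned in the last bullet.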
Data Science Practices at Impetus
DSP
• Statistical model development
• Text mining
• Financial data analysis
• Healthcare data analysis
• Manufacturing data analysis
• Web analytics
• Funnel Analysis
Thank you!
Text mining of Social Network Data for Business Intelligence - iLabs camp

  • 1. Text Mining of Social Network Data for Business Applications ANKIT SHARMA, DATA SCIENCE PRACTICES, IMPETUS
  • 2. Content Data Unstructured data as business opportunity Text mining Learning from textual data Social media Learning from social media Sentiment analysis and opinion mining Topic modeling Tools for text mining Use Cases Hotel review demo Advertising campaign analysis Data Science Practices at Impetus Thursday, August 7, 2014 DATA SCIENCE PRACTICES, IMPETUS - INDIA 2
  • 3. Data Structured data Tables, Records Semi-structured data XML, JSON Unstructured data Text, Audio, Video, conversations, Web, Wikis, Documents, Web logs… Social Media data Tweets, Blogs, Facebook, other social platforms Thursday, August 7, 2014 DATA SCIENCE PRACTICES, IMPETUS - INDIA 3
  • 4. Business opportunity Customer Interaction Marketing performance Customer Insight Thursday, August 7, 2014 DATA SCIENCE PRACTICES, IMPETUS - INDIA 4 Better understanding of Customers and their behavior Measure marketing efforts Personalized performance for Customers
  • 5. Unstructured data as business opportunity  “Unstructured” data such as natural language, which is distinguished from the “structured” information found in conventional spreadsheets and databases.  Unstructured data constitutes 80% of the whole enterprise data (Gartner Research)  Unstructured text can contain business critical information, untapped opportunities and latent risks  Example:  Consumer’s thoughts and opinions, found in communications such as emails, web pages, reports, surveys, contracts, blogs, wikis, and reports.  Whether it’s a customer complaints, employee feedback, analyst opinions, or competitors' intentions, this valuable and actionable information lies hidden in unstructured text repositories Thursday, August 7, 2014 DATA SCIENCE PRACTICES, IMPETUS - INDIA 5
  • 6. Text mining Text Mining integrates innovative text analytics approaches, tools and solutions to leverage the unstructured data Typical text mining tasks include-  Text categorization  Text clustering  Concept/entity extraction  Production of granular taxonomies  Sentiment analysis  Document summarization  Entity relation modeling For a company, the successful management of unstructured information may lead to more profitable decisions and business opportunities Thursday, August 7, 2014 DATA SCIENCE PRACTICES, IMPETUS - INDIA 6
  • 7. Learning from text mining Classification Spam detection Document organization Clustering Trend analysis Topic identification Web mining Trend analysis Ontology creation Opinion mining Natural Language Processing Text summarization Question answering Information extraction Thursday, August 7, 2014 DATA SCIENCE PRACTICES, IMPETUS - INDIA 8
  • 8. Logical view of documents
  • 9. Sentiment Analysis and Opinion Mining. Opinion mining, sentiment analysis, and subjectivity analysis refer to the computational analysis of opinion, sentiment, and subjectivity in online text. Subjectivity analysis, or subjectivity classification, automatically discriminates opinion-bearing text from objective text that presents factual information. Sentiment analysis originated in machine learning (ML), information retrieval (IR) and natural language processing (NLP). Opinion mining originated in the Web search and IR community and involved processing search results for a given product, retrieving its attributes and aggregating users’ opinions.
  • 10. Topic Modeling. A topic model is a type of statistical model for discovering the abstract "topics" that occur in a collection of documents. Topic models are a suite of algorithms that uncover the hidden thematic structure in document collections. Identifying emerging topics in communities, trending topics in social media, and hot topics in online discussion may be critical for businesses. LDA (Latent Dirichlet Allocation) is a generative model that allows sets of observations to be explained by unobserved groups that explain why some parts of the data are similar. It was developed by David Blei, Andrew Ng, and Michael Jordan in 2003.
  • 11. Topic Modeling (contd.). Given the sentence “I listen to Motorhead, Pink Floyd and Metallica whenever I’m travelling in my car.”, a topic model might describe the text as 75% about music and 25% about cars. Similarly, "dog" and "bone" will appear more often in documents about dogs, "cat" and "meow" will appear more often in documents about cats, and "the" and "is" will appear equally in both. Types of analysis LDA can perform: topic identification; which topics are similar; which documents are similar based on their topic allocations.
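The idea on the two topic modeling slides can be illustrated with a very small LDA sampler. The sketch below is not the tooling used in the talk: it is a minimal collapsed Gibbs sampler written with the Python standard library only, and the function name `lda_gibbs`, the toy documents, and the hyperparameter values (`alpha`, `beta`, `iters`) are all assumptions chosen for illustration.

```python
import random
from collections import Counter


def lda_gibbs(docs, num_topics=2, alpha=0.1, beta=0.01, iters=200, seed=0):
    """Tiny collapsed Gibbs sampler for LDA (illustrative, not optimized)."""
    rng = random.Random(seed)
    vocab_size = len({w for doc in docs for w in doc})
    # Count tables: topics per document, words per topic, tokens per topic.
    doc_topic = [[0] * num_topics for _ in docs]
    topic_word = [Counter() for _ in range(num_topics)]
    topic_total = [0] * num_topics
    # Start from a random topic assignment for every token.
    assign = []
    for d, doc in enumerate(docs):
        zs = []
        for w in doc:
            z = rng.randrange(num_topics)
            zs.append(z)
            doc_topic[d][z] += 1
            topic_word[z][w] += 1
            topic_total[z] += 1
        assign.append(zs)
    for _ in range(iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                z = assign[d][i]
                # Remove the token's current assignment from the counts.
                doc_topic[d][z] -= 1
                topic_word[z][w] -= 1
                topic_total[z] -= 1
                # Resample in proportion to (how much the document likes
                # topic k) * (how much topic k likes word w).
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][w] + beta)
                    / (topic_total[k] + vocab_size * beta)
                    for k in range(num_topics)
                ]
                z = rng.choices(range(num_topics), weights=weights)[0]
                assign[d][i] = z
                doc_topic[d][z] += 1
                topic_word[z][w] += 1
                topic_total[z] += 1
    # Smoothed per-document topic proportions (each row sums to 1).
    theta = [
        [(c + alpha) / (len(doc) + num_topics * alpha) for c in counts]
        for doc, counts in zip(docs, doc_topic)
    ]
    return theta, topic_word


# Toy corpus echoing the dog/cat example (made-up documents).
docs = [
    "dog bone dog bark bone".split(),
    "cat meow cat purr meow".split(),
    "dog bone cat meow".split(),
]
theta, topic_word = lda_gibbs(docs)
for row in theta:
    print([round(p, 2) for p in row])
```

Each row of `theta` is one document's topic mixture, which is the kind of "75% music, 25% cars" description mentioned above; comparing rows is one simple way to ask which documents are similar by topic allocation. On a corpus this small the result depends heavily on the random seed, so the numbers should be read as illustrative only.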
  • 12. Social Media based Sentiment Analysis. Data from Internet and Web 2.0. Buzz monitoring; sentiment analysis; content categorization. Trends detection and recommendation. Brand image monitoring; sentiment trends in customer comments; discovering undercurrents and recommending adjustments; overall vs. service-attribute-based sentiment extraction; real-time monitoring of consumer perceptions; identification of data sources (Twitter/Facebook/discussion boards); collection of consumer-expressed text.
  • 13. Solution framework. Data from Internet and Web 2.0; overall/attribute-level sentiment analysis; trends detection and recommendation; service attributes identification; buzz monitoring and summary report; content categorization; sentiment trends summary; classified content/topics*. (* Not included in this work.)
  • 14. Tools for Text mining
  • 15. Use cases: hotel review; advertising campaign
  • 16. Hotel Review demo. Objective: analyze hotel review text data; calculate each hotel’s rating based on sentiment analysis of its reviews; visualize the data on maps in a web-based platform with features such as zooming, clicking and hovering; design a web-based user interface for larger data. Data: 254,574 reviews; 2,560 hotels; 6 countries (UAE, Canada, China, India, UK, USA); 10 cities (Beijing, Dubai, England, Illinois, Montreal, Nevada, New Delhi, New York City, Quebec, San Francisco, Shanghai).
  • 17. Data science: LSI (Latent Semantic Indexing), sentiment analysis, part-of-speech tagging, feature extraction, feature-based opinion mining. Open search: Apache Solr. Database: Apache Cassandra. Maps: Google Maps API. The sentiment score for each hotel is based on sentiment analysis of its reviews, calculating the polarity of each review from its positive and negative words. Hotel feature-based opinion mining covers the following features: Food, Room, Location, Service, Price, etc.
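The scoring described on this slide (review polarity from positive and negative words, plus per-feature opinions) can be sketched as follows. This is a simplified illustration, not the demo's implementation: the word lists, the feature keyword lists, the polarity formula and the function names are all assumptions, and the part-of-speech tagging and LSI steps named on the slide are left out.

```python
import re
from collections import defaultdict

# Toy opinion lexicons (assumed; a real system would use a much larger list).
POSITIVE = {"good", "great", "clean", "friendly", "excellent", "comfortable"}
NEGATIVE = {"bad", "dirty", "rude", "noisy", "poor", "expensive"}

# Hypothetical keyword lists for the features named on the slide.
FEATURES = {
    "Food": {"food", "breakfast", "restaurant"},
    "Room": {"room", "bed", "bathroom"},
    "Location": {"location"},
    "Service": {"service", "staff"},
    "Price": {"price", "rate"},
}


def tokens(text):
    """Lowercase word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def polarity(text):
    """(positive - negative) / (positive + negative), in [-1, 1].

    Returns 0.0 when the text contains no opinion words.
    """
    words = tokens(text)
    pos = sum(w in POSITIVE for w in words)
    neg = sum(w in NEGATIVE for w in words)
    return (pos - neg) / (pos + neg) if pos + neg else 0.0


def hotel_score(reviews):
    """Mean review polarity for one hotel."""
    return sum(polarity(r) for r in reviews) / len(reviews) if reviews else 0.0


def feature_scores(reviews):
    """Mean polarity of the sentences that mention each feature."""
    per_feature = defaultdict(list)
    for review in reviews:
        for sentence in re.split(r"[.!?]+", review):
            words = set(tokens(sentence))
            for feature, keywords in FEATURES.items():
                if words & keywords:
                    per_feature[feature].append(polarity(sentence))
    return {f: sum(v) / len(v) for f, v in per_feature.items()}


# Made-up reviews for a single hotel.
reviews = [
    "The room was clean and the staff were friendly. The food was bad.",
    "Great location. The room was noisy.",
]
print(hotel_score(reviews))
print(feature_scores(reviews))
```

Scoring at sentence level keeps a complaint about one feature from being averaged into praise for another within the same review. Mapping the resulting score onto a star rating would need a further rule that the slides do not specify.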
  • 19. Advertising campaign buzz monitoring. Social media monitoring and analysis for an advertising campaign; feature-level buzz summary for company name, campaign and other hidden features; blog analysis for the campaign name. Data collected: tweets for 1 month; 4000+ tweets (including 1440 re-tweets); 43 blogs and comments were crawled and analyzed.
  • 20. Results for blogs. Pie charts of sentiment shares (legend: Negative, Positive, Neutral): Blogs 83%, 17%, 0%; Comments 33%, 43%, 24%; Blogs & Comments 39%, 40%, 21%. Bar chart: Feature-level Buzz Summary for 3 features (y-axis: number of blogs; series: Neutral, Positive, Negative).
  • 21. Insights. The ad campaign has an overall negative sentiment associated with it. People used hashtags such as #bad and #worstadsever to express negative sentiment. The attribute/feature “********” was also viewed negatively by users. There was a spike in associated tweets on a particular day due to frequent advertisements. This analysis is based on only 30 days of tweets. In the long run, more visible trends can be monitored.
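The hashtag observation above suggests a simple check that can be run over collected tweets. The sketch below counts occurrences of known negative hashtags; the tag set contains only the two tags quoted on the slide, while the tweets and the function names are made up for illustration.

```python
import re
from collections import Counter

# The two negative hashtags quoted on the slide; a real list would be longer.
NEGATIVE_TAGS = {"bad", "worstadsever"}


def hashtags(tweet):
    """Hashtags in a tweet, lowercased and without the leading '#'."""
    return [tag.lower() for tag in re.findall(r"#(\w+)", tweet)]


def negative_tag_counts(tweets):
    """Count how often each known negative hashtag appears."""
    counts = Counter()
    for tweet in tweets:
        counts.update(t for t in hashtags(tweet) if t in NEGATIVE_TAGS)
    return counts


# Made-up tweets for illustration.
tweets = [
    "Not again, this commercial #bad #WorstAdsEver",
    "Saw the new ad during the match #ads",
    "Muting the TV #worstadsever",
]
counts = negative_tag_counts(tweets)
print(counts)
```

Lowercasing the tags means that variants such as #WorstAdsEver and #worstadsever are counted together.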
  • 22. Data Science Practices at Impetus. DSP: statistical model development; text mining; financial data analysis; healthcare data analysis; manufacturing data analysis; web analytics; funnel analysis.
  • 23. Thank you!