2. Overview of Social Media Analytics
Social media analytics is the process of collecting and
analyzing audience data shared on social networks to
improve an organization's strategic business decisions.
It is the ability to gather and find meaning in data from
social channels to support business decisions, and to
measure the performance of actions based on those
decisions through social media.
Social media analytics uses specifically designed
software platforms that work similarly to web search
engines. Data about keywords or topics is retrieved through
search queries or web 'crawlers' that span channels.
Fragments of text are returned, loaded into a database,
categorized, and analyzed to derive meaningful insights.
2 Asst. Prof. Rushikesh Chikane, MIT
4. Seven Layers of Social Media Analytics
Social media, at a minimum, has seven layers of data.
Each layer carries potentially valuable information
and insights that can be harvested for business
intelligence. Of the seven layers, some are visible or easily
identifiable (e.g., text and actions) and others are
invisible (e.g., social media and hyperlink networks).
6. LAYER ONE: TEXT
Social media text analytics deals with the extraction and
analysis of business insights from textual elements of
social media content, such as comments, tweets, blog
posts, and Facebook status updates. Text analytics is
mostly used to understand social media users’ sentiments
or identify emerging themes and topics.
LAYER TWO: NETWORKS
Social media network analytics extracts, analyzes, and
interprets personal and professional social networks, for
example, Facebook friendship networks and Twitter
follower networks. Network analytics seeks to identify
influential nodes (e.g., people and organizations) and
their position in the network.
7. LAYER THREE: ACTIONS
Social media actions analytics deals with extracting,
analyzing, and interpreting the actions performed by
social media users, including likes, dislikes, shares,
mentions, and endorsements. Actions analytics is mostly
used to measure popularity, influence, and prediction in
social media.
LAYER FOUR: MOBILE
Mobile analytics is the next frontier in the social business
landscape. Mobile analytics deals with measuring and
optimizing user engagement with mobile applications (or
apps for short), analyzing and understanding in-app
purchases, customer engagement, and mobile user
demographics.
8. LAYER FIVE: HYPERLINKS
Hyperlink analytics is about extracting, analyzing, and
interpreting social media hyperlinks (e.g., in-links and
out-links). Hyperlink analysis can reveal, for example, Internet traffic
patterns and sources of incoming or outgoing traffic to
and from a source.
LAYER SIX: LOCATION
Location analytics, also known as spatial analysis or
geospatial analytics, is concerned with mining and
mapping the locations of social media users,
contents, and data.
9. LAYER SEVEN: SEARCH ENGINES
Search engine analytics focuses on analyzing historical
search data to gain valuable insights into a range of
areas, including trends analysis, keyword monitoring,
search result and advertisement history, and
advertisement spending statistics.
10. Accessing Social Media Data
Social media data is any type of data that can be
gathered through social media. In general, the term
refers to social media metrics and
demographics collected through the analytics tools
built into social platforms.
Social media data can also refer to data collected
from content people post publicly on social media.
This type of social media data for marketing can be
collected through social listening tools.
11. Social Network Analysis
Social network analysis (SNA) is the process of
investigating social structures through the use
of networks and graph theory. It characterizes
networked structures in terms of nodes (individual
actors, people, or things within the network) and
the ties, edges, or links (relationships or interactions)
that connect them.
SNA is the practice of representing networks of
people as graphs and then exploring these graphs. A
typical social network representation has nodes for
people, and edges connecting two nodes to
represent one or more relationships between them.
The resulting graph can reveal patterns of
connection among people. Small networks can be
represented visually; these visualizations are
intuitive, may make patterns of connection apparent,
and can reveal nodes that are highly connected or
that play a critical role in connecting different parts
of the network.
Social network analysis (SNA) is a process of
quantitative and qualitative analysis of a social
network. SNA measures and maps the flow of
relationships and relationship changes between
entities. These entities range from simple to complex
and include websites, computers, animals, humans,
groups, and organizations.
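The node-and-edge idea above can be sketched in plain Python. The names and friendships below are made up for illustration; degree centrality (the fraction of other nodes a person is directly connected to) is one of the simplest ways to find highly connected nodes.

```python
# Toy undirected social network as an adjacency dict (hypothetical people).
network = {
    "Alice": {"Bob", "Carol", "Dave"},
    "Bob":   {"Alice", "Carol"},
    "Carol": {"Alice", "Bob"},
    "Dave":  {"Alice", "Eve"},
    "Eve":   {"Dave"},
}

def degree_centrality(graph):
    """Fraction of the other nodes each node is directly connected to."""
    n = len(graph)
    return {node: len(neigh) / (n - 1) for node, neigh in graph.items()}

centrality = degree_centrality(network)
# Alice touches 3 of the 4 other nodes, so she is the most central.
most_central = max(centrality, key=centrality.get)
print(most_central, centrality[most_central])
```

Libraries such as NetworkX provide the same measure (and many richer ones) for real analyses.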
13. The benefits of social network analysis:
Helps you understand your audience better
Used for customer segmentation
Used to design recommendation systems
Detects fake news, among other things
Link prediction is one of the most important research
topics in the field of graphs and networks. The objective
of link prediction is to identify pairs of nodes that will
either form a link or not in the future.
Link prediction has many real-world uses:
Predict which customers are likely to buy which products
on online marketplaces like Amazon; this can help in making
better product recommendations
Suggest interactions or collaborations between
employees in an organization
Extract vital insights from terrorist networks
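A minimal link-prediction sketch, assuming a toy graph (the node names and edges below are invented): the Jaccard coefficient scores every unlinked pair of nodes by how much their neighborhoods overlap, and the highest-scoring pair is predicted to form a link.

```python
from itertools import combinations

# Hypothetical undirected graph as an adjacency dict.
graph = {
    "A": {"B", "C"},
    "B": {"A", "C", "D"},
    "C": {"A", "B", "D"},
    "D": {"B", "C", "E"},
    "E": {"D"},
}

def jaccard(u, v):
    """Neighborhood overlap: |N(u) & N(v)| / |N(u) | N(v)|."""
    return len(graph[u] & graph[v]) / len(graph[u] | graph[v])

# Score every pair of nodes that is not yet linked.
candidates = [(u, v) for u, v in combinations(sorted(graph), 2)
              if v not in graph[u]]
scores = {(u, v): jaccard(u, v) for u, v in candidates}
best_pair = max(scores, key=scores.get)
print(best_pair, scores[best_pair])  # A and D share the most neighbors
```

NetworkX ships this exact measure as `jaccard_coefficient`, alongside other predictors such as Adamic-Adar.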
16. Introduction to Natural Language Processing
Natural Language Processing is a branch of Computer
Science that deals with the understanding and
processing of natural language, e.g., text or speech.
The goal is for a machine to be able to communicate with
humans in the same way that humans have been
communicating with each other for centuries.
Learning a new language is not easy for us humans
either and requires a lot of time and perseverance.
When a machine wants to learn a natural language, it is
even more difficult. Therefore, some sub-areas have
emerged within Natural Language Processing that are
necessary for language to be completely understood.
17. Text Analytics
Bag of words
Word weighting: TF-IDF
Stemming and Lemmatization
Synonyms and Part of speech tagging
The text is cut into pieces called “tokens” or “terms.”
These tokens are the most basic unit of information you’ll
use for your model.
The terms are often words but this isn’t a necessity.
Entire sentences can be used for analysis.
We’ll use unigrams: terms consisting of one word.
Often, however, it’s useful to include bigrams (two words
per token) or trigrams (three words per token) to capture
extra meaning and increase the performance of your
model. This does come at a cost, though, because you’re
building bigger term vectors by including bigrams and/or
trigrams in the equation.
19. Bag of words
To build our classification model we’ll go with the bag of
words approach.
Bag of words is the simplest way of structuring textual
data: every document is turned into a word vector.
If a certain word is present in the vector it’s labeled
“True”; the others are labeled “False”. Figure shows a
simplified example of this, in case there are only two
documents: one about the television show Game of
Thrones and one about data science.
The two word vectors together form the document-term
The document-term matrix holds a column for every term
and a row for every document
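As a sketch of the boolean bag-of-words idea, assuming two tiny made-up documents (one about Game of Thrones, one about data science, echoing the figure's example):

```python
# Two hypothetical documents.
docs = {
    "doc1": "game of thrones season finale",
    "doc2": "data science with python",
}

# Vocabulary = every unique term across the corpus.
vocab = sorted({term for text in docs.values() for term in text.split()})

# Document-term matrix: one row per document, one boolean column per term.
dtm = {name: {term: term in text.split() for term in vocab}
       for name, text in docs.items()}

print(dtm["doc1"]["thrones"])  # True
print(dtm["doc2"]["thrones"])  # False
```

In practice a library vectorizer (e.g. scikit-learn's `CountVectorizer`) builds the same matrix, usually with counts rather than booleans.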
21. Word weighting: TF-IDF
Term Frequency-Inverse Document Frequency (TF-IDF)
is a widely used statistical method in natural
language processing and information retrieval. It
measures how important a term is within a document
relative to a collection of documents (i.e., relative to
a corpus). Words within a text document are
transformed into importance numbers by a text
vectorization process. There are many different text
vectorization scoring schemes, with TF-IDF being
one of the most common.
22. As its name implies, TF-IDF vectorizes/scores a
word by multiplying the word’s Term Frequency (TF)
with the Inverse Document Frequency (IDF).
Term Frequency: the TF of a term or word is the
number of times the term appears in a document
divided by the total number of words in the document:
TF(t) = (occurrences of t in the document) / (total words in the document).
23. Inverse Document Frequency: the IDF of a term
reflects the proportion of documents in the corpus
that contain the term:
IDF(t) = log(total documents / documents containing t).
Words unique to a small percentage of documents
(e.g., technical jargon terms) receive higher
importance values than words common across all
documents (e.g., a, the, and).
24. The TF-IDF of a term is calculated by multiplying TF
and IDF scores.
TF-IDF is useful in many natural language
processing applications. For example, Search
Engines use TF-IDF to rank the relevance of a
document for a query. TF-IDF is also employed in
text classification, text summarization, and topic
Imagine the term ’t’ appears 20 times in a document
that contains a total of 100 words.
The Term Frequency (TF) of ’t’ is 20/100 = 0.2.
Assume a collection of related documents contains
10,000 documents. If 100 documents out of 10,000
documents contain the term ’t’, the Inverse Document
Frequency (IDF) of ’t’ is log10(10,000/100) = log10(100) = 2.
26. Using these two quantities, we can calculate the TF-IDF
score of the term ’t’ for the document: 0.2 × 2 = 0.4.
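The worked example above can be checked in a couple of lines (using the base-10 logarithm, as in the example; other bases are also common in practice):

```python
import math

tf = 20 / 100                   # term appears 20 times in a 100-word document
idf = math.log10(10_000 / 100)  # 100 of 10,000 documents contain the term
tf_idf = tf * idf
print(tf, idf, tf_idf)          # 0.2 2.0 0.4
```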
N-gram can be defined as the contiguous sequence
of n items from a given sample of text or speech.
The items can be letters, words, or base pairs
according to the application. N-grams are typically
collected from a text or speech corpus (a long,
structured collection of text).
N-grams of texts are extensively used in text mining
and natural language processing tasks. They are
basically a set of co-occurring words within a given
window and when computing the n-grams you
typically move one word forward (although you can
move X words forward in more advanced scenarios).
28. For example, for the sentence
“I reside in Bengaluru”.
Sl. No.  Type of n-gram  Generated n-grams
1        Unigram         ["I", "reside", "in", "Bengaluru"]
2        Bigram          ["I reside", "reside in", "in Bengaluru"]
3        Trigram         ["I reside in", "reside in Bengaluru"]
29. When N=1, this is referred to as unigrams and this is
essentially the individual words in a sentence.
When N=2, this is called bigrams and
when N=3 this is called trigrams.
When N>3 this is usually referred to as four grams or five
grams and so on.
How many N-grams are in a sentence?
If X = the number of words in a given sentence K, the number of
n-grams for sentence K would be: X - (N - 1).
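The sliding-window definition and the count formula above can be sketched directly, using the "I reside in Bengaluru" example from the table:

```python
def ngrams(text, n):
    """Slide a window of n words across the token list."""
    tokens = text.split()
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

sentence = "I reside in Bengaluru"
print(ngrams(sentence, 1))  # ['I', 'reside', 'in', 'Bengaluru']
print(ngrams(sentence, 2))  # ['I reside', 'reside in', 'in Bengaluru']
print(ngrams(sentence, 3))  # ['I reside in', 'reside in Bengaluru']

# A sentence with X words yields X - (N - 1) n-grams.
assert len(ngrams(sentence, 2)) == 4 - (2 - 1)
```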
30. Stop word
Stop words are a set of commonly used words in a
language. Examples of stop words in English are “a,”
“the,” “is,” “are,” etc.
Stop-word removal is commonly used in Text Mining and
Natural Language Processing (NLP) to eliminate words
that are so widely used that they carry very little useful
information.
When to remove stop words?
For a task such as text classification or sentiment analysis,
we should remove stop words, as they provide little
information to our model; removing them keeps unwanted
words out of our corpus. For a task such as language
translation, however, stop words are useful, as they have to be
translated along with the other words.
Stop words are often removed from the text before
training deep learning and machine learning models, since
stop words occur in abundance and hence provide little to
no unique information that can be used for classification.
On removing stop words, the dataset size decreases, and the
time to train the model also decreases, without a huge
impact on the accuracy of the model.
Stop-word removal can potentially help in improving
performance, as fewer and only significant tokens are
left. Thus, the classification accuracy could be improved.
Improper selection and removal of stop words can change
the meaning of our text. So we have to be careful in
choosing our stop words.
Ex: “This movie is not good.”
If we remove “not” in the pre-processing step, the sentence
(“this movie is good”) indicates a positive sentiment, which is
incorrect.
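A sketch of stop-word removal and the "not" pitfall above. The stop-word list here is a tiny made-up one (real lists, e.g. NLTK's, are much larger); deliberately including "not" in it shows how polarity is lost, and how whitelisting negations avoids that:

```python
# Small illustrative stop-word list; note that it (unwisely) contains "not".
STOP_WORDS = {"a", "an", "the", "is", "are", "this", "not"}

def remove_stop_words(text, keep=frozenset()):
    """Drop stop words, except those explicitly whitelisted in `keep`."""
    return [t for t in text.lower().split()
            if t not in STOP_WORDS or t in keep]

review = "This movie is not good"
print(remove_stop_words(review))                 # ['movie', 'good'] - polarity lost!
print(remove_stop_words(review, keep={"not"}))   # ['movie', 'not', 'good']
```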
33. Stemming and Lemmatization
What is Stemming?
Stemming is a technique used to extract the base form of
the words by removing affixes from them. It is just like
cutting down the branches of a tree to its stems. For
example, the stem of the words eating, eats,
eaten is eat.
Search engines use stemming for indexing words.
Rather than storing all forms of a word, a
search engine can store only the stems. In this way,
stemming reduces the size of the index and increases
retrieval accuracy.
34. What is Lemmatization?
Lemmatization is a development of Stemming and
describes the process of grouping together the
different inflected forms of a word so they can be
analyzed as a single item.
Lemmatization is similar to Stemming but it brings
context to the words. So it links words with similar
meanings to one word.
Lemmatization algorithms usually also take the word’s
part of speech as input, such as whether the word is an
adjective, noun, or verb.
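The difference can be sketched with a toy suffix-stripping stemmer and a tiny lemma dictionary. These are deliberately simplistic stand-ins (real systems such as NLTK's PorterStemmer and WordNetLemmatizer use far richer rules), but they show why lemmatization can recover irregular forms that stemming cannot:

```python
# Toy stemmer: strip a few common suffixes, keeping a minimum stem length.
SUFFIXES = ("ing", "en", "s")

def stem(word):
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word

# Toy lemma dictionary for irregular forms that suffix rules cannot handle.
LEMMAS = {"ate": "eat", "better": "good", "mice": "mouse"}

def lemmatize(word):
    return LEMMAS.get(word, stem(word))

print([stem(w) for w in ("eating", "eats", "eaten")])  # ['eat', 'eat', 'eat']
print(lemmatize("ate"))  # 'eat' - suffix stripping alone could not recover this
```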
35. Synonyms and Part of speech tagging
Part-of-speech (POS) tagging is a process in natural
language processing (NLP) where each word in a text is
labeled with its corresponding part of speech. This can
include nouns, verbs, adjectives, and other grammatical
categories.
POS tagging is useful for a variety of NLP tasks, such as
information extraction, named entity recognition, and
machine translation. It can also be used to identify the
grammatical structure of a sentence and to disambiguate
words that have multiple meanings.
POS tagging is typically performed using machine
learning algorithms, which are trained on a large
annotated corpus of text. The algorithm learns to predict
the correct POS tag for a given word based on the
context in which it appears.
36. Why POS tagging?
POS tagging is an important part of NLP because it
works as a prerequisite for further NLP analysis, such as
grammar analysis and word-sense disambiguation.
37. Tagging a list of sentences
Rather than tagging a single sentence, NLTK’s
TaggerI class also provides a tag_sents() method,
with the help of which we can tag a list of sentences.
Un-tagging a sentence
We can also un-tag a sentence. NLTK provides the
nltk.tag.untag() method for this purpose. It takes a
tagged sentence as input and returns a list of words.
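To illustrate the tag/untag shapes without requiring NLTK's trained tagger data, here is a toy dictionary-based tagger in plain Python. The lexicon is invented; the tag names follow the Penn Treebank convention, and `untag`/`tag_sents` mirror the behavior of `nltk.tag.untag` and `TaggerI.tag_sents` described above:

```python
# Hypothetical mini-lexicon; unknown words default to NN (noun).
LEXICON = {"the": "DT", "dog": "NN", "barks": "VBZ", "loudly": "RB"}

def tag(sentence):
    """Tag each word with a (word, tag) pair, like nltk.pos_tag's output."""
    return [(w, LEXICON.get(w.lower(), "NN")) for w in sentence.split()]

def tag_sents(sentences):
    """Mirror of TaggerI.tag_sents: tag a whole list of sentences."""
    return [tag(s) for s in sentences]

def untag(tagged):
    """Mirror of nltk.tag.untag: drop the tags, keep the words."""
    return [word for word, _ in tagged]

tagged = tag("the dog barks loudly")
print(tagged)        # [('the', 'DT'), ('dog', 'NN'), ('barks', 'VBZ'), ('loudly', 'RB')]
print(untag(tagged)) # ['the', 'dog', 'barks', 'loudly']
```

A real tagger is trained on an annotated corpus and uses context, as the text notes; the input/output shapes are the same.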
38. Use of Parts of Speech Tagging in NLP
To understand the grammatical structure of a sentence:
By labeling each word with its POS, we can better understand the syntax
and structure of a sentence. This is useful for tasks such as machine
translation and information extraction, where it is important to know how
words relate to each other in the sentence.
To disambiguate words with multiple meanings:
Some words, such as “bank,” can have multiple meanings depending on
the context in which they are used. By labeling each word with its POS,
we can disambiguate these words and better understand their intended
meaning.
To improve the accuracy of NLP tasks:
POS tagging can help improve the performance of various NLP tasks,
such as named entity recognition and text classification. By providing
additional context and information about the words in a text, we can build
more accurate and sophisticated algorithms.
To facilitate research in linguistics:
POS tagging can also be used to study the patterns and characteristics of
language use and to gain insights into the structure and function of
different parts of speech.
39. Application of POS Tagging
Information extraction:
POS tagging can be used to identify specific types of information in a text, such as
names, locations, and organizations. This is useful for tasks such as extracting
data from news articles or building knowledge bases for artificial intelligence
systems.
Named entity recognition:
POS tagging can be used to identify and classify named entities in a text, such as
people, places, and organizations. This is useful for tasks such as building
customer profiles or identifying key figures in a news story.
Text classification:
POS tagging can be used to help classify texts into different categories, such as
spam emails or sentiment analysis. By analyzing the POS tags of the words in a
text, algorithms can better understand the content and tone of the text.
Machine translation:
POS tagging can be used to help translate texts from one language to another by
identifying the grammatical structure and relationships between words in the
source language and mapping them to the target language.
Natural language generation:
POS tagging can be used to generate natural-sounding text by selecting
appropriate words and constructing grammatically correct sentences. This is useful
for tasks such as chatbots and virtual assistants.
40. Sentiment Analysis
Sentiment analysis is the process of classifying
whether a block of text is positive, negative, or
neutral. Sentiment analysis is a subset of natural language
processing (NLP) that uses machine learning to
analyze and classify the emotional tone of text data.
The goal of sentiment analysis is to analyze people’s
opinions in a way that can help businesses grow.
It focuses not only on polarity (positive, negative, and
neutral) but also on emotions (happy, sad, angry,
etc.) as well as intentions to buy.
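A minimal lexicon-based polarity scorer, as a sketch of the idea (the word scores and negation handling below are illustrative inventions, not a production method; real systems use trained models or curated lexicons such as VADER):

```python
# Hypothetical sentiment lexicon: word -> polarity score.
LEXICON = {"good": 1, "great": 2, "bad": -1, "terrible": -2, "love": 2}
NEGATIONS = {"not", "never", "no"}

def sentiment(text):
    tokens = text.lower().replace(".", "").split()
    score, negate = 0, False
    for token in tokens:
        if token in NEGATIONS:
            negate = True  # flip the next sentiment-bearing word
        elif token in LEXICON:
            score += -LEXICON[token] if negate else LEXICON[token]
            negate = False
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"

print(sentiment("This movie is great"))     # positive
print(sentiment("This movie is not good"))  # negative
print(sentiment("The plot happened"))       # neutral (no lexicon hits)
```

Note how the negation check addresses the "not good" pitfall discussed in the stop-words section.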
41. Why Use Sentiment Analysis?
Sentiment analysis captures the contextual meaning of
words that indicates the social sentiment of a
brand, and it helps a business determine
whether the product it is manufacturing is
going to be in demand in the market or not.
Businesses can use insights from sentiment
analysis to improve their products, fine-tune
marketing messages, correct misconceptions,
and identify positive influencers.
It is very helpful in helping businesses gain
insights, understand customers, predict and
enhance the customer experience, tailor
marketing campaigns, and aid in decision-making.
42. Types of Sentiment Analysis
Fine-grained sentiment analysis:
• This depends on the polarity base. This category can be designed as very positive,
positive, neutral, negative, or very negative. The rating is done on a scale of 1 to 5: if
the rating is 5 it is very positive, 2 is negative, and 3 is neutral.
Emotion detection:
• The sentiments happy, sad, angry, upset, jolly, pleasant, and so on come under
emotion detection. It is also known as the lexicon method of sentiment analysis.
Aspect-based sentiment analysis:
• It focuses on a particular aspect. For instance, if a person wants to check a feature of
a cell phone, such as the battery, screen, or camera quality, then aspect-based
analysis is used.
Multilingual sentiment analysis:
• Multilingual analysis covers different languages, where the classification needs to be
done as positive, negative, or neutral. This is highly challenging and comparatively
difficult.
• If, for instance, we take the comments on a social media site such as
Instagram, all the reviews are analyzed and
categorized as positive, negative, or neutral.
• In the Play Store, all the comments, in the form of 1-to-5
ratings, are processed with the help of sentiment analysis.
• In the marketing area where a particular product
needs to be reviewed as good or bad.
• All the reviewers will have a look at the comments and
will check and give the overall review of the product.
44. Document or text summarization
Text summarization is a very useful and important
part of Natural Language Processing (NLP).
We can summarize our text in a few lines by
removing unimportant text and converting the same
text into smaller semantic text form.
In this approach we build algorithms or programs
which will reduce the text size and create a summary
of our text data. This is called automatic text
summarization in machine learning.
Text summarization is the process of creating shorter
text without removing the semantic structure of text.
Text summarization is the practice of breaking down long
publications into manageable paragraphs or sentences.
The procedure extracts important information while also
ensuring that the paragraph's sense is preserved. This
shortens the time it takes to comprehend long materials,
like research articles, without omitting critical
information.
Text summarization presents a number of issues, including
text identification, interpretation, and summary
generation, as well as analysis of the resulting summary.
Identifying important phrases in the document and
exploiting them to uncover relevant information to add to
the summary are critical jobs in extraction-based
summarization.
Extraction-based summarization
The extractive text summarization approach entails
extracting essential sentences and phrases from a source
document and combining them to create a summary.
Without making any modifications to the texts, the
extraction is done according to the given measure.
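One common "given measure" is word frequency: score each sentence by the summed frequency of its words and keep the top-scoring ones verbatim. A self-contained sketch (the input text is invented for illustration; real extractive summarizers use richer scoring such as TF-IDF or TextRank):

```python
import re
from collections import Counter

def summarize(text, n_sentences=1):
    """Score each sentence by summed word frequency; keep the top ones."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    freq = Counter(re.findall(r"[a-z']+", text.lower()))
    top = sorted(sentences,
                 key=lambda s: sum(freq[w] for w in re.findall(r"[a-z']+", s.lower())),
                 reverse=True)[:n_sentences]
    # Re-emit the chosen sentences in their original order.
    return " ".join(s for s in sentences if s in top)

text = ("Text summarization shortens long documents. "
        "Extractive summarization selects important sentences from the text. "
        "The weather was pleasant yesterday.")
print(summarize(text))  # keeps the sentence densest in frequent words
```

Note that the output is always made of unmodified sentences from the input, which is exactly what distinguishes extraction from abstraction.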
Another way of text summarization is abstractive
summarization. We create new sentences from the
original content in this step.
This is in contrast to our previous extractive technique, in
which we only utilized the phrases that were present. It's
possible that the phrases formed by abstractive
summarization aren't present in the original text.
When abstraction is used for text summarization in deep
learning problems, it can overcome the grammar
inconsistencies of the extractive method.
Abstraction is more efficient than extraction. The text
summarization algorithms required for abstraction, on the
other hand, are more complex to build, which is why
extraction is still widely used.
50. Trend Analytics
Trend analysis, also known as technical analysis,
is used to monitor metrics and their development
over time. As such, the technique relies on effective
data collection.
Trend analysis is a methodology used in research to
gather and study data in order to make predictions
about future consumer behavior, based on the
analysis of observed and recorded data from past
and ongoing trends.
It helps determine the main characteristics of the
stock market and the consumers associated with it.
Trend analysis is the practice that gives us the ability
to look at data over time for a long-running survey.
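Monitoring a metric's development over time can be sketched by fitting a least-squares line to the series and reading off the slope. The weekly brand-mention counts below are hypothetical:

```python
def trend_slope(values):
    """Slope of the best-fit line through (0, v0), (1, v1), ... (least squares)."""
    n = len(values)
    xs = range(n)
    mean_x = sum(xs) / n
    mean_y = sum(values) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, values))
    var = sum((x - mean_x) ** 2 for x in xs)
    return cov / var

weekly_mentions = [120, 135, 150, 160, 180]  # hypothetical metric over 5 weeks
slope = trend_slope(weekly_mentions)
print("upward" if slope > 0 else "downward")  # a positive slope = rising trend
```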
This type of methodology is used to analyze patterns
and trends of a given group of relevant data or
objects of study over a specific period of time, as well
as their change in that period.
A clear example of this type of study is longitudinal
studies with the clear intention of detecting and
analyzing trends that arise from historical trends.
It is mainly used in ethnographic research and other
types of event-focused studies. The great
disadvantage of this type of trend analysis is that it is
exposed to many variables that could affect the final
result of the study.
The geographic method of trend analysis is generally
easy and reliable; it can be the means to identify
commonalities and differences between user groups
belonging to the same or different geographies.
The main purpose of the geographic method is the
analysis of market trends that develop in groups of
users identified by their geographic location.
The downside of the geographic method is
consequently the geographic limitation for data
analysis, which can be influenced by factors such as
culture and traditions that are specific to the
geographic location user groups.
The intuitive method is a type of trend analysis
implemented to analyze trends within groups of users
based on logical explanations, behavioral patterns, or
other elements perceived by a futurist.
This market trend analysis is helpful for making
predictions without the need for large amounts of statistical
data. However, one issue with the methodology is
overreliance on the knowledge and logic provided by
futurists and researchers, which makes it prone to
researcher bias.
The intuitive method is the most difficult type of trend
analysis and might not be as precise.
55. Challenges to Social media analytics
• cleaning unstructured textual data (e.g., normalizing text),
especially high-frequency streamed real-time data, still
presents numerous problems and research challenges.
• although social media data is accessible through APIs, due to
the commercial value of the data, most of the major sources
such as Facebook and Google are making it increasingly
difficult for academics to obtain comprehensive access to their
‘raw’ data; very few social data sources provide affordable data
offerings to academia and researchers. News services such as
Thomson Reuters and Bloomberg typically charge a premium
for access to their data.
• In contrast, Twitter has recently announced the Twitter
Data Grants program, where researchers can apply to get
access to Twitter’s public tweets and historical data in
order to get insights from its massive set of data (Twitter
has more than 500 million tweets a day).
• once you have created a ‘big data’ resource, the data
needs to be secured, ownership and IP issues resolved
(i.e., storing scraped data is against most of the
publishers’ terms of service), and users provided with
different levels of access; otherwise, users may attempt to
‘suck’ all the valuable data from the database.
57. Holistic data sources
• researchers are increasingly bringing together and
combining novel data sources: social media data, real-
time market and customer data, and geospatial data for
analysis.
Data visualization
• visual representation of data, whereby information that
has been abstracted in some schematic form with the
goal of communicating information clearly and
effectively through graphical means. Given the
magnitude of the data involved, visualization is
becoming increasingly important.
58. Analytics dashboards
• many social media platforms require users to write
API calls to access feeds or to program analytics models
in a programming language, such as Java.
• While reasonable for computer scientists, these
skills are typically beyond most (social science)
researchers.
• Non-programming interfaces are required for giving
what might be referred to as ‘deep’ access to ‘raw’
data, for example, configuring APIs, merging social
media feeds, combining holistic sources and
developing analytical models.