Data Mining of Informational Stream in Social Networks
1. Data Mining of Informational Stream
in Social Networks
Forecasting of Social, Market
and Financial Trends
Bohdan Pavlyshenko
e-mail: b.pavlyshenko@gmail.com
blog: bpavlyshenko.blogspot.com
2. Technologies used: R, Python, Java, Hadoop/MapReduce/Pig/Hive
The prototypes of data mining systems are based on the theory of
formal concept analysis and on the theory of frequent itemsets. Using a
model of a semantic concept lattice makes it possible to analyze
semantically related sets of words and to construct association rules.
Quantitative characteristics of informational streams are used for
marketing trend forecasting and for analyzing users’ attitudes towards
different goods and services (opinion mining).
Detecting the predictive potential of association rules in informational
streams and using these rules in autoregressive models (ARIMA, VAR)
to predict, in particular, financial trends on stock markets. Such a
model takes into account both the past behavior of a company's financial
time series and the time dynamics of the quantitative characteristics of
the association rules.
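The slide does not define the "quantitative characteristics" of association rules. A minimal sketch, assuming they are the per-day support of a keyword set and the confidence of a rule; the function names and the toy messages are hypothetical and only illustrate the idea.

```python
from collections import defaultdict

def daily_support(messages, itemset):
    """Return {day: share of that day's messages containing every word of itemset}."""
    itemset = set(itemset)
    totals = defaultdict(int)
    hits = defaultdict(int)
    for day, text in messages:
        words = set(text.lower().split())
        totals[day] += 1
        if itemset <= words:
            hits[day] += 1
    return {day: hits[day] / totals[day] for day in sorted(totals)}

def rule_confidence(messages, antecedent, consequent):
    """Confidence of the rule antecedent -> consequent: supp(A and C) / supp(A)."""
    a = set(antecedent)
    both = a | set(consequent)
    n_a = n_both = 0
    for _, text in messages:
        words = set(text.lower().split())
        if a <= words:
            n_a += 1
            if both <= words:
                n_both += 1
    return n_both / n_a if n_a else 0.0

# Hypothetical toy data: (day, message text)
messages = [
    ("2013-01-01", "apple stock up today"),
    ("2013-01-01", "apple iphone sales strong"),
    ("2013-01-02", "apple stock down"),
    ("2013-01-02", "apple stock buy signal"),
]
series = daily_support(messages, {"apple", "stock"})
conf = rule_confidence(messages, {"apple"}, {"stock"})
```

A daily series such as `series` could then be supplied alongside the price history as an additional regressor in an autoregressive model.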
3. The analysis of communities and of the leaders who shape the analyzed
trends in social networks. The detection of manipulative shaping of
users’ attitudes towards a particular commodity or economic trend.
Causality analysis based on Granger tests to single out the
principal and subordinate time series, particularly for informational streams,
economic indicators, etc.
The creation of a recommendation subsystem for users. For example,
in an online store, this system analyzes users’ behavior, their purchases, and
their feedback on goods or services. Based on a user’s activity, one can
build that user's semantic profile and then make various offers to the user,
taking into account both this activity and the decisions of users with similar
profiles. Such an approach may significantly shorten the time the user spends
searching for goods and services, and may surface relevant offers the user did
not know about, revealed on the basis of similar users’ activity.
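The recommendation idea above can be sketched as follows. This is an illustration under assumptions, not the author's system: a "semantic profile" is modeled here as a simple count of items a user interacted with, similarity is cosine similarity, and all names and data are hypothetical.

```python
import math
from collections import Counter

def cosine(p, q):
    """Cosine similarity between two sparse count profiles."""
    dot = sum(p[k] * q[k] for k in p if k in q)
    norm = (math.sqrt(sum(v * v for v in p.values()))
            * math.sqrt(sum(v * v for v in q.values())))
    return dot / norm if norm else 0.0

def recommend(target, profiles, top_n=2):
    """Suggest items the target has not seen, weighted by similar users' activity."""
    scores = Counter()
    own = profiles[target]
    for user, profile in profiles.items():
        if user == target:
            continue
        sim = cosine(own, profile)
        if sim <= 0:
            continue  # ignore users with nothing in common
        for item, weight in profile.items():
            if item not in own:
                scores[item] += sim * weight
    return [item for item, _ in scores.most_common(top_n)]

# Hypothetical profiles: item -> number of purchases or positive reviews
profiles = {
    "alice": Counter({"tent": 2, "backpack": 1}),
    "bob": Counter({"tent": 1, "backpack": 1, "stove": 2}),
    "carol": Counter({"novel": 3, "lamp": 1}),
}
suggestions = recommend("alice", profiles)
```

Here only "bob" shares items with "alice", so the one item suggested comes from his profile; "carol" has no overlap and contributes nothing.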
4. The analysis of financial tweets
The package “Tweet Miner for Stock Market”
5. The analysis of financial tweets
The formation of frequent keyword sets with the largest support values
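A minimal sketch of ranking keyword sets by support, assuming support is the share of tweets that contain every word of the set. It enumerates all word combinations of a fixed size by brute force rather than using a pruning algorithm such as Apriori, and the tweets are hypothetical.

```python
from collections import Counter
from itertools import combinations

def top_frequent_sets(tweets, size=2, min_support=0.0, top_n=5):
    """Return up to top_n (keyword set, support) pairs, highest support first."""
    counts = Counter()
    for text in tweets:
        words = sorted(set(text.lower().split()))
        for combo in combinations(words, size):
            counts[combo] += 1
    n = len(tweets)
    ranked = [(combo, c / n) for combo, c in counts.items() if c / n >= min_support]
    # Sort by descending support, then alphabetically for a stable order
    ranked.sort(key=lambda pair: (-pair[1], pair[0]))
    return ranked[:top_n]

# Hypothetical toy tweets
tweets = [
    "aapl earnings beat",
    "aapl earnings miss",
    "aapl iphone launch",
    "aapl earnings call",
]
top = top_frequent_sets(tweets, size=2, min_support=0.5)
```

In practice the text would first be cleaned (stop words, punctuation, links), which this sketch omits.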
6. The analysis of financial tweets
The analysis of the causal relationship between frequent sets in tweets and
Apple stock prices.
The results obtained show that it is possible to predict stock prices on the
basis of data mining of informational streams in social networks.
7. The analysis of financial tweets
Forecasting based on ARIMA model
Granger causality test between quantitative characteristics
of tweets and Apple stock prices.
test 1
Granger causality test
Model 1: V3 ~ Lags(V3, 1:1) + Lags(V2, 1:1)
Model 2: V3 ~ Lags(V3, 1:1)
  Res.Df  Df      F    Pr(>F)
1     87
2     88  -1  10.05  0.002103 **
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
test 2
Granger causality test
Model 1: V2 ~ Lags(V2, 1:1) + Lags(V3, 1:1)
Model 2: V2 ~ Lags(V2, 1:1)
  Res.Df  Df       F  Pr(>F)
1     87
2     88  -1  0.3261  0.5694
Forecasting based on VAR model
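The Granger tests above compare a model that uses only a series' own lag (Model 2) with one that also uses the lag of the other series (Model 1). A minimal pure-Python sketch of the underlying F statistic is given below. It is not the R routine that produced the output, it does not compute the p-value, and it assumes the regressors are not collinear. The synthetic series are hypothetical; they have 91 points only so that the larger model has 87 residual degrees of freedom, as in the output above.

```python
import random

def ols_rss(X, y):
    """Residual sum of squares of a least-squares fit, via the normal equations."""
    k = len(X[0])
    # Augmented matrix [X'X | X'y]
    A = [[sum(row[i] * row[j] for row in X) for j in range(k)]
         + [sum(row[i] * yi for row, yi in zip(X, y))] for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                f = A[r][col] / A[col][col]
                A[r] = [a - f * b for a, b in zip(A[r], A[col])]
    beta = [A[i][k] / A[i][i] for i in range(k)]
    return sum((yi - sum(b * x for b, x in zip(beta, row))) ** 2
               for row, yi in zip(X, y))

def granger_f(target, driver, lag=1):
    """F statistic and residual df for 'driver Granger-causes target'."""
    rows = range(lag, len(target))
    y = [target[t] for t in rows]
    # Restricted model: intercept plus the target's own lags
    own = [[1.0] + [target[t - j] for j in range(1, lag + 1)] for t in rows]
    # Unrestricted model: additionally the driver's lags
    full = [r + [driver[t - j] for j in range(1, lag + 1)]
            for r, t in zip(own, rows)]
    rss_r, rss_f = ols_rss(own, y), ols_rss(full, y)
    df_resid = len(y) - len(full[0])
    return ((rss_r - rss_f) / lag) / (rss_f / df_resid), df_resid

# Hypothetical synthetic data: y depends on the previous value of x, not vice versa
rng = random.Random(0)
x = [rng.gauss(0, 1) for _ in range(91)]
y = [0.0]
for t in range(1, 91):
    y.append(0.5 * y[t - 1] + 0.8 * x[t - 1] + rng.gauss(0, 0.5))

f_xy, df = granger_f(y, x)   # does x help predict y?
f_yx, _ = granger_f(x, y)    # does y help predict x?
```

A large F in one direction and a small F in the other, as in test 1 versus test 2, is the pattern read as one-directional Granger causality.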
8. The examples of the studies of semantic concepts in
Twitter messages
9. The examples of the studies of semantic concepts in
Twitter messages
The Final Olympic Tennis Tournament (2012)
10. The examples of test studies of semantic
concepts in Twitter messages
The prediction of Eurovision 2013 favorites
Before the Eurovision 2013 final, we published our
forecast of the winner and the favorites on our blog. It later
proved to be correct.
11. The examples of test studies of semantic concepts
in Twitter messages
Travel trends
The analysis of travel trends
12. The examples of test studies of semantic
concepts in Twitter messages
Travel trends
The analysis of travel trends
13. The examples of test studies of
semantic concepts in Twitter
messages
Market analysis of iPhone concept
14. The examples of test studies of semantic
concepts in Twitter messages
Market analysis of iPhone concept
15. The examples of test studies of semantic concepts in
Twitter messages
The prediction of Royal baby’s name
In this work, we analyze whether there is a correlation between the
public opinion of Twitter users and the decision-making of persons who are
influential in society. We carry out this analysis using the example of the
discussion of the probable name of the British royal baby born in July 2013.
In our study, we use methods of quantitative natural language
processing, the theory of frequent sets, and algorithms for visualizing
user communities. We also analyzed the time dynamics of keyword
frequencies. The analysis showed that the main predicted name
dominated the spectrum of names before the official announcement.
Using the theory of frequent sets, we showed that the full name,
consisting of three component names, was in the top 5 by support
value. It was also revealed that the structure of the dynamically formed user
communities participating in the discussion is determined by only a few
leaders who significantly influence the viewpoints of other users.
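The text does not say how community leaders were identified. One simple proxy, shown here purely as an assumption, is to rank accounts by how many distinct other users mention or retweet them; the account names and tweets are hypothetical.

```python
import re

MENTION = re.compile(r"@(\w+)")

def leaders(tweets, top_n=3):
    """Rank accounts by the number of distinct other users who mention them."""
    mentioned_by = {}
    for author, text in tweets:
        author = author.lower()
        for name in MENTION.findall(text.lower()):
            if name != author:  # ignore self-mentions
                mentioned_by.setdefault(name, set()).add(author)
    ranked = sorted(mentioned_by.items(), key=lambda kv: (-len(kv[1]), kv[0]))
    return [(name, len(authors)) for name, authors in ranked[:top_n]]

# Hypothetical toy data: (author, tweet text)
tweets = [
    ("ann", "RT @royalnews: bets are on George"),
    ("ben", "@royalnews George or James?"),
    ("cat", "RT @royalnews: George leads"),
    ("dan", "@ann I think Alexander"),
]
top = leaders(tweets)
```

Accounts at the top of such a ranking are candidates for the "few leaders" described above; a fuller analysis would work on the whole mention graph rather than on counts alone.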
16. The examples of test studies of semantic concepts in Twitter messages
Royal baby’s name forecasting
The name George
dominated the spectrum of
names before the official
announcement.
17. The examples of test studies of semantic concepts in Twitter messages
Royal baby’s name forecasting
The first 10 frequent sets were
formed from five names,
three of which are the
components of the Prince’s
full name, George
Alexander Louis.
18. The examples of test studies of semantic concepts in Twitter
messages
The Royal baby’s name forecasting
User communities that shaped the discussion trends.
19. More test examples and studies are on my blog
http://bpavlyshenko.blogspot.com
Thank you for your attention!
Bohdan Pavlyshenko,
Ph.D., e-mail: b.pavlyshenko@gmail.com