2. INTRODUCTION
#PAPUA
Why #PAPUA?
DIGITAL TALENT SCHOLARSHIP 2019
3. ROADMAP
Crawling
Preprocessing
Modelling
Visualization
Result
4. DATA COLLECTION
LIBRARY: Tweepy
• Python library for collecting data from Twitter
HASHTAG: #PAPUA
• Search query
LANGUAGE: EN
• Get only English-language tweets
CRAWL: RUN
• Start crawling data
* Code runs on the Amazon EC2 service
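The collection settings on this slide (Tweepy, the #PAPUA query, English only) could be wired together roughly as follows. This is a minimal sketch, not the deck's actual script: it assumes `api` is an already authenticated `tweepy.API` object and uses `search_tweets`, the Tweepy 4.x name for the search call (older releases name it `search`). The function name `collect_tweets`, the `count` value and the returned fields are illustrative choices.

```python
def collect_tweets(api, query="#PAPUA", lang="en", count=100):
    """Return one page of tweets matching `query` as plain dicts.

    `api` is assumed to be an authenticated tweepy.API instance
    (an assumption; the deck does not show its setup code).
    """
    # Ask the search endpoint for tweets in the requested language only.
    statuses = api.search_tweets(q=query, lang=lang, count=count)

    # Keep a few attributes per tweet; which fields the deck kept is not shown.
    return [
        {
            "id": status.id,
            "created_at": status.created_at,
            "user": status.user.screen_name,
            "text": status.text,
        }
        for status in statuses
    ]
```

Returning plain dicts keeps the later cleaning step independent of Tweepy objects, so the rows can be written to CSV or JSON on the crawling machine and processed elsewhere.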
5. DATA PREPROCESSING
Cleaning
• Clean unused attributes
Remove # and @
• Using the `re` Python library
Remove URL
• Using the `re` Python library
Correct Words
• Correct short words (replaced using a dict)
Remove Stopwords
• Using the `nltk.corpus` Python library
Rawdata Output
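The cleaning steps on this slide can be sketched as one small function. The sketch rests on several assumptions: the `SHORTWORDS` dictionary and the tiny `STOPWORDS` set are hypothetical stand-ins (the deck's real dictionary is not shown), "Remove # and @" is read literally as stripping the two characters rather than whole hashtags or mentions, and tokens are lowercased ASCII words.

```python
import re

# Hypothetical abbreviation dictionary; the deck's real dictionary is not shown.
SHORTWORDS = {"u": "you", "r": "are", "pls": "please", "ppl": "people"}

# Tiny stand-in stopword set. With NLTK installed and its "stopwords" corpus
# downloaded, nltk.corpus.stopwords.words("english") could be passed instead.
STOPWORDS = {"a", "an", "and", "are", "in", "is", "of", "the", "to", "you"}

URL_RE = re.compile(r"https?://\S+|www\.\S+")
SYMBOL_RE = re.compile(r"[#@]")
TOKEN_RE = re.compile(r"[a-z']+")


def preprocess(text, shortwords=SHORTWORDS, stopwords=STOPWORDS):
    """Clean one tweet and return its remaining tokens."""
    text = URL_RE.sub(" ", text)            # remove URLs
    text = SYMBOL_RE.sub("", text)          # remove the # and @ characters
    tokens = TOKEN_RE.findall(text.lower())  # lowercase and tokenise
    tokens = [shortwords.get(tok, tok) for tok in tokens]   # correct short words
    return [tok for tok in tokens if tok not in stopwords]  # remove stopwords
```

URLs are removed before the symbols so that characters inside a link are never mistaken for hashtag or mention markers.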
6. MODELLING
• Latent Dirichlet Allocation
• Social Network Analysis
• Clustering
• Sentiment
7. SENTIMENT
TEXTBLOB
• Positive Content
• Negative Content
• Neutral Content
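A three-way split like the one on this slide can be derived from TextBlob's polarity score, which ranges from -1.0 to 1.0. The sketch below assumes a cut at exactly zero; the thresholds the deck used are not stated, and the names `label_polarity` and `classify` are illustrative.

```python
def label_polarity(polarity):
    """Map a polarity score in [-1.0, 1.0] to one of three labels.

    The zero cut-off is an assumption, not taken from the slides.
    """
    if polarity > 0:
        return "positive"
    if polarity < 0:
        return "negative"
    return "neutral"


def classify(text):
    """Label one tweet using TextBlob's default sentiment analyzer."""
    from textblob import TextBlob  # third-party; imported only when needed

    return label_polarity(TextBlob(text).sentiment.polarity)
```

Keeping the thresholding in its own function makes it easy to widen the neutral band later, for example by treating small absolute scores as neutral.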
8. CLUSTERING
Hierarchical clustering (Ward D)
Euclidean Distance
Sparse = 0.95
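One way to read these settings is: drop very rare terms, build a term-document matrix, then cluster the terms. The sketch below is an interpretation under stated assumptions, not the deck's code. It assumes "Sparse = 0.95" means a term is kept only if it appears in more than 5% of the tweets, that terms (not tweets) are clustered, and it swaps in SciPy's `ward` linkage, which may not match the "Ward D" variant named on the slide exactly. All function names are illustrative.

```python
def remove_sparse_terms(docs, sparse=0.95):
    """Keep terms that occur in more than (1 - sparse) of the documents.

    `docs` is a list of token lists, one per tweet.
    """
    n_docs = len(docs)
    doc_freq = {}
    for tokens in docs:
        for term in set(tokens):  # count each term once per document
            doc_freq[term] = doc_freq.get(term, 0) + 1
    return sorted(t for t, df in doc_freq.items() if df > n_docs * (1 - sparse))


def term_document_matrix(docs, terms):
    """Rows are terms, columns are documents, cells are raw counts."""
    return [[tokens.count(term) for tokens in docs] for term in terms]


def ward_clusters(matrix, n_clusters):
    """Cut a Ward-linkage tree over Euclidean distances into n_clusters groups."""
    from scipy.cluster.hierarchy import fcluster, linkage  # third-party

    tree = linkage(matrix, method="ward", metric="euclidean")
    return fcluster(tree, t=n_clusters, criterion="maxclust")
```

Filtering sparse terms first keeps the matrix small enough for a readable dendrogram, since every surviving term becomes one leaf.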
9. SOCIAL NETWORK ANALYSIS
Plot Type: Stars
Sparse = 0.95
10. LATENT DIRICHLET ALLOCATION
8 Topics, 4 Terms
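A topic model with 8 topics, reported by its 4 strongest terms per topic, could be fitted along these lines. The deck does not name the LDA implementation it used, so scikit-learn is swapped in here as an assumption; `fit_topics`, `top_terms` and the `seed` parameter are illustrative names.

```python
def top_terms(topic_weights, vocabulary, n_terms=4):
    """Return the n_terms highest-weighted terms of one topic."""
    order = sorted(
        range(len(vocabulary)),
        key=lambda i: topic_weights[i],
        reverse=True,
    )
    return [vocabulary[i] for i in order[:n_terms]]


def fit_topics(texts, n_topics=8, n_terms=4, seed=0):
    """Fit LDA on raw text strings and list the top terms of each topic."""
    # Third-party imports, loaded only when the model is actually fitted.
    from sklearn.decomposition import LatentDirichletAllocation
    from sklearn.feature_extraction.text import CountVectorizer

    vectorizer = CountVectorizer()
    counts = vectorizer.fit_transform(texts)  # document-term count matrix
    lda = LatentDirichletAllocation(n_components=n_topics, random_state=seed)
    lda.fit(counts)

    vocabulary = vectorizer.get_feature_names_out()
    # lda.components_ holds one row of term weights per topic.
    return [top_terms(row, vocabulary, n_terms) for row in lda.components_]
```

Fixing the random seed makes the topic lists repeatable between runs, which helps when comparing topics across the positive, negative and neutral subsets.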
11. VISUALIZATION
12. TERM FREQUENCY
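The counts behind a term-frequency chart can be produced with the standard library alone. This is a minimal sketch assuming the tweets have already been cleaned and tokenised into lists of terms; the name `term_frequency` and the `top_n` default are illustrative.

```python
from collections import Counter


def term_frequency(docs, top_n=10):
    """Count terms across all documents and return the top_n most frequent.

    `docs` is a list of token lists, one per tweet.
    """
    counts = Counter(term for tokens in docs for term in tokens)
    return counts.most_common(top_n)
```

Passing only the positive, negative or neutral tweets to the same function would give the per-sentiment frequency tables.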
13. TERM FREQUENCY – POSITIVE DATA
14. TERM FREQUENCY – NEGATIVE DATA
15. TERM FREQUENCY – NEUTRAL DATA
16. CLUSTERING
17. CLUSTERING – POSITIVE DATA
18. CLUSTERING – NEGATIVE DATA
19. CLUSTERING – NEUTRAL DATA
20. SOCIAL NETWORK ANALYSIS
21. SOCIAL NETWORK ANALYSIS – POSITIVE DATA
22. SOCIAL NETWORK ANALYSIS – NEGATIVE DATA
23. SOCIAL NETWORK ANALYSIS – NEUTRAL DATA
24. LATENT DIRICHLET ALLOCATION
25. LATENT DIRICHLET ALLOCATION – POSITIVE DATA
26. LATENT DIRICHLET ALLOCATION – NEGATIVE DATA
27. LATENT DIRICHLET ALLOCATION – NEUTRAL DATA
28. VISUALIZATION
29. VISUALIZATION