Recent research has shown that digital online geolocation traces, e.g., check-ins via FourSquare or geolocation information in Flickr images, are a new and valuable source for predicting social interactions between users. Interestingly, related work in this area has not yet studied the extent to which social interactions between users can be predicted when more than one location-based knowledge source is taken into account. To contribute to this field of research, we collected social interaction data of users in an online social network called My Second Life, together with three related location-based knowledge sources for these users (monitored locations, shared locations, and favored locations), to show the extent to which social interactions between users can be predicted. Using supervised and unsupervised machine learning techniques, we find, on the one hand, that the same location-based features (e.g., common regions and common observations) perform well across the three sources. On the other hand, we find that shared location information is better suited to predicting social interactions between users than monitored or favored location information.
Predicting Social Interactions from Different Sources of Location-based Knowledge
1. Predicting Social Interactions
Based on Different Sources of Location-based Knowledge
Michael Steurer - msteurer@iicm.edu, Graz University of Technology
Christoph Trattner - ctrattner@know-center.at, Graz University of Technology
Denis Helic - dhelic@tugraz.at, Graz University of Technology
2. What Did We Do?
Predict Interactions in Social Network
Postings, Comments, Loves
Similar to Facebook, G+
Three Different Sources of Location Data
Monitored Locations
Shared Locations
Favoured Locations
Predicting Social Interactions Using Location-Based Sources
6. Collected Data
Online Social Data
152,509 Unique Users
1,084,002 Postings (Text Messages, Snapshots)
459,734 Comments
1,631,568 Loves
Groups and Interests
285,528 Unique Groups
15.51 Groups per User on Average
6.5 Interests per User on Average
9. Collected Data
Event Data
12 Months Starting in March 2012
Working Hours 24/7
Location-based Social Data
4,105 Unique Locations
19 Million Data Samples
410,619 Unique Users
11. Collected Data
Compare to Check-ins, e.g. FourSquare
Shared from "In-World"
Harvested Locations
45,835 User Profiles
496,912 Snapshots
13,583 Unique Locations
13. Collected Data
Extracted from Profiles
Limitations
10 per User
Enhanced with Picture and Text
Favoured Locations
191,610 User Profiles
811,386 Picks
25,311 Unique Locations
15. Network Setup
Create Networks from Data
Online Social Network
Enhance with Location-based Data
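The network setup above can be sketched as follows. This is only an illustration under stated assumptions: the record layout, the user names, and the helpers `interaction_edges` and `labelled_pairs` are hypothetical and not taken from the slides. The sketch treats any comment or love on another user's posting as an undirected interaction edge, then labels every pair of users that has location data.

```python
from itertools import combinations

# Hypothetical interaction records: (acting user, owner of the posting, kind).
interactions = [
    ("alice", "bob", "comment"),
    ("bob", "alice", "love"),
    ("carol", "alice", "comment"),
    ("dave", "dave", "comment"),  # self-interaction, ignored below
]

# Hypothetical location data from one source: user -> set of locations.
locations = {
    "alice": {"plaza", "club"},
    "bob": {"plaza"},
    "carol": {"beach"},
}

def interaction_edges(records):
    """Return the undirected user pairs with at least one interaction."""
    return {
        frozenset((actor, owner))
        for actor, owner, _kind in records
        if actor != owner
    }

def labelled_pairs(edges, locations):
    """Label each pair of users with location data: 1 if they interacted, else 0."""
    users = sorted(locations)
    return {
        (u, v): int(frozenset((u, v)) in edges)
        for u, v in combinations(users, 2)
    }

edges = interaction_edges(interactions)
pairs = labelled_pairs(edges, locations)
```

Restricting the pairs to users with location data is a design choice of this sketch, so that every labelled pair can later be given location-based features.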
16. Feature Modeling
Compute User Relation
Common and Total Locations (Jaccard's Coefficient)
Entropy Common Locations
User Count Common Locations
Frequency Common Locations
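A minimal sketch of how the features listed above might be computed for one user pair. The exact definitions used in the study are not given on the slide, so the following are assumptions: a location's entropy is the Shannon entropy of its observations over users, its user count is the number of distinct users seen there, its frequency is the total number of observations, and the per-pair value is the mean over the pair's common locations. The toy data and the names `location_stats` and `pair_features` are hypothetical.

```python
import math
from collections import Counter, defaultdict

# Hypothetical toy data: user -> list of locations, one entry per observation.
visits = {
    "alice": ["plaza", "plaza", "club"],
    "bob": ["plaza", "club", "beach"],
    "carol": ["plaza"],
}

def location_stats(visits):
    """Per-location entropy, distinct-user count, and observation frequency."""
    per_location = defaultdict(Counter)
    for user, locs in visits.items():
        for loc in locs:
            per_location[loc][user] += 1
    stats = {}
    for loc, counts in per_location.items():
        total = sum(counts.values())
        entropy = -sum((c / total) * math.log2(c / total) for c in counts.values())
        stats[loc] = {
            "entropy": entropy,
            "user_count": len(counts),
            "frequency": total,
        }
    return stats

def pair_features(visits, stats, u, v):
    """Location-based features for the user pair (u, v)."""
    a, b = set(visits[u]), set(visits[v])
    common, total = a & b, a | b
    features = {
        "common_locations": len(common),
        "total_locations": len(total),
        # Jaccard's coefficient: common locations divided by total locations.
        "jaccard": len(common) / len(total) if total else 0.0,
    }
    # Assumption: aggregate location properties as the mean over common locations.
    for name in ("entropy", "user_count", "frequency"):
        values = [stats[loc][name] for loc in common]
        features["mean_" + name] = sum(values) / len(values) if values else 0.0
    return features

stats = location_stats(visits)
features = pair_features(visits, stats, "alice", "bob")
```

In this toy data, alice and bob share two of three locations, so their Jaccard coefficient is 2/3; the busy "plaza" raises the mean entropy of their common locations.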
17. Experiment Setup
Feature Modeling
Unsupervised Learning with Ranked Lists
Information Gain for Single Features
Prediction Task
Binary Classification Problem
Supervised Learning Algorithms
Logistic Regression, Random Forest, SVM
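The single-feature part of this setup can be illustrated with a short sketch. It does not reproduce the supervised classifiers named above (logistic regression, random forest, SVM). It only shows, under assumptions, how one feature could be scored: by the information gain of a threshold split, and by how well ranking user pairs on that feature separates interacting from non-interacting pairs (measured here as AUC, which is a choice of this sketch). The toy values, the threshold, and the function names are hypothetical.

```python
import math

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    result = 0.0
    for cls in set(labels):
        p = labels.count(cls) / n
        result -= p * math.log2(p)
    return result

def information_gain(values, labels, threshold):
    """Information gain of the binary split `value > threshold`."""
    above = [y for x, y in zip(values, labels) if x > threshold]
    below = [y for x, y in zip(values, labels) if x <= threshold]
    n = len(labels)
    remainder = len(above) / n * entropy(above) + len(below) / n * entropy(below)
    return entropy(labels) - remainder

def ranking_auc(scores, labels):
    """Chance that a random positive pair outranks a random negative pair.

    Ties count as half a win.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in positives for q in negatives)
    return wins / (len(positives) * len(negatives))

# Hypothetical toy data: one Jaccard value and one interaction label per user pair.
jaccard = [0.8, 0.6, 0.5, 0.1, 0.0, 0.3]
interacted = [1, 1, 0, 0, 0, 1]

gain = information_gain(jaccard, interacted, threshold=0.4)
auc = ranking_auc(jaccard, interacted)
```

Because the ranked-list evaluation needs no training, it fits the unsupervised part of the setup; the binary labels are used only to score the ranking.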
22. Conclusions
User-Pairs with Interactions
More Common and Total Locations
Locations have Less Entropy, Frequency, and User-Count
Results of the Prediction
Common Locations, Jaccard
Shared > Monitored > Favoured
Same Characteristics Among Algorithms
Logistic Regression was Best