In this paper we propose an approach that exploits social data associated with a Web resource to measure its a priori relevance. We show how the interaction traces that users leave on resources, in the form of social signals such as the number of likes and shares, can be exploited to quantify social properties such as popularity and reputation. We propose to model these properties as a priori probabilities that we integrate into a language model. We evaluated the effectiveness of our approach on an IMDb dataset containing 167,438 resources and their social signals, collected from several social networks. Our experimental results are statistically significant and show the benefit of integrating social properties into a search model to enhance information retrieval.
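The idea of integrating an a priori probability into a language model can be illustrated with a small sketch. It is only an illustration under stated assumptions, not the exact model of this work: it assumes query-likelihood scoring with Dirichlet smoothing (the smoothing method and the value of `mu` are our choice), and it takes the prior P(D) as a given positive number. The function names `query_log_likelihood` and `score_with_prior`, the toy documents, and the prior values are all hypothetical.

```python
import math
from collections import Counter

def query_log_likelihood(query, doc, collection_tf, collection_len, mu=2000.0):
    """log P(Q|D) for a unigram model with Dirichlet smoothing (assumed here)."""
    tf = Counter(doc)
    doc_len = len(doc)
    score = 0.0
    for term in query:
        p_coll = collection_tf.get(term, 0) / collection_len
        p_term = (tf[term] + mu * p_coll) / (doc_len + mu)
        if p_term == 0.0:  # term never seen in the collection
            return float("-inf")
        score += math.log(p_term)
    return score

def score_with_prior(query, doc, prior, collection_tf, collection_len, mu=2000.0):
    """log P(D) + log P(Q|D), rank-equivalent to P(D) * P(Q|D). Needs prior > 0."""
    return math.log(prior) + query_log_likelihood(
        query, doc, collection_tf, collection_len, mu
    )

# Toy collection: both documents match the query equally well.
docs = {
    "d1": "space adventure film with robots".split(),
    "d2": "space adventure film with pirates".split(),
}
collection_tf = Counter(t for d in docs.values() for t in d)
collection_len = sum(collection_tf.values())
priors = {"d1": 0.2, "d2": 0.8}  # hypothetical social priors
query = "space adventure".split()

ranking = sorted(
    docs,
    key=lambda d: score_with_prior(
        query, docs[d], priors[d], collection_tf, collection_len
    ),
    reverse=True,
)
print(ranking)
```

In this toy case the topical scores of the two documents are identical, so the ordering is decided by the prior alone, which is the effect a document prior is meant to have when content evidence does not discriminate.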
3. 1.1 Emergence of the Social Web
[Figure: chart labeled "Number of Internet users" and "Number of active users 2013"; years 2011 to 2014; plotted values 1.2, 1.4, 1.7, 2.4.]
Social content per minute on Facebook: 41,000 publications, 1.8 million likes, ~350 GB of data.
Sources: blogdumoderateur.com, quantcast.com, semiocast.com
1. Introduction
2. Related Work
3. Approach of SIR
4. Experimental Results
5. Conclusion
4. Fig. 1. Global presentation of our work
Web resources: video, photo, Web page, other resources.
Interactions on social networks: bookmark, comment, share/recommend, motion/vote, like/+1.
Social signals (source of evidence) feed the extraction and quantification of social properties: popularity, reputation, freshness.
These properties are integrated into the information retrieval model (ranking), which takes a query and returns results.
5. 1.2 Example of Social Signals
6. 1.3 Research Issues
(1) How can social signals be translated into social properties?
(2) Which signals and properties are most useful for evaluating the a priori relevance (importance) of a resource?
(3) Which theoretical model can combine the a priori relevance of a resource with its topical relevance?
(4) What is the impact of social properties on IR system performance?
(5) Which signals and properties do attribute selection algorithms favor, and which are most correlated with document relevance?
7. 2.1 Related Work
Sources of evidence (social features) | Properties | Models | Authors
Number of clicks, votes, records and recommendations | Popularity, importance | Linear combination | (Karweg et al., 2011)
Number of likes, dislikes and comments on YouTube; play count (number of times a user listens to a track on Last.fm); presence of a URL in a tweet | Importance | Machine learning and linear combination | (Chelaru et al., 2012), (Khodaei et al., 2012), (Alonso et al., 2010)
Number of retweets; number of annotations (tags) | Popularity | Machine learning | (Yang et al., 2012), (Hong et al., 2011), (Pantel et al., 2012)
Social approval votes | Importance | Machine learning | (Kazai and Milic-Frayling, 2009)
16. 4.2 Description of Dataset
• Textual content: 167,438 documents from INEX IMDb.
Field | Description | Status
ID | Identifier of the film (document) | -
Title | Film's title | indexed
Year | Year of the film's release | indexed
Rated | Film classification by content type | -
Released | Date of making the film | indexed
Runtime | Length of the film | indexed
Genre | Film genre (Action, Drama, etc.) | indexed
Director | Director of the film | indexed
Writer | Writers of the film | indexed
Actors | Main actors of the film | indexed
Plot | Text summary of the film | indexed
Poster | URL of the poster | -
url | URL of the Web source document | -
UGC | Social data recovered | -
17. 4.2 Description of Dataset
• Social content: 8 social signals from 5 social networks.
-Facebook: Like, Share, Comment, Date of last action.
-Twitter: Tweet.
-Google+: +1.
-LinkedIn: Share.
-Delicious: Bookmark.
• Queries and relevance judgments: from INEX IMDb.
-30 queries (topics) and their qrels from the INEX IMDb set.
-Top 1000 documents returned for each topic.
18. 4.3 Quantifying Social Properties
Social property | Social signal | Criterion | Social network
Popularity (P) | Number of "Comment" | C1 | Facebook
Popularity (P) | Number of "Tweet" | C2 | Twitter
Popularity (P) | Number of "Share" | C3 | LinkedIn
Popularity (P) | Number of "Share" | C4 | Facebook
Reputation (R) | Number of "Like" | C5 | Facebook
Reputation (R) | Number of "+1" | C6 | Google+
Reputation (R) | Number of "Bookmark" | C7 | Delicious
Freshness (F) | Dates of last actions | C8 | Facebook
• Each social property is quantified from social signals according to their nature and meaning.
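One possible way to turn grouped signal counts into a prior per property is sketched below. This is a hedged illustration, not the estimation formula of this work: it assumes that a property count is the plain sum of its signal counts, and that the prior is an add-one smoothed share of the collection total. The signal keys (`comment`, `tweet`, and so on), the function names, and the toy counts are hypothetical.

```python
# Signal groups following the criteria C1-C4 (popularity) and C5-C7 (reputation).
POPULARITY = ("comment", "tweet", "share_linkedin", "share_facebook")
REPUTATION = ("like", "plus_one", "bookmark")

def property_count(signals, group):
    """Sum the counts of the signals in one property group (missing = 0)."""
    return sum(signals.get(name, 0) for name in group)

def property_priors(collection, group):
    """Add-one smoothed prior per document (an assumed estimator).

    P(D) = (count(D) + 1) / (total count + number of documents),
    so every document gets a non-zero prior and the priors sum to 1.
    """
    counts = {doc_id: property_count(sig, group) for doc_id, sig in collection.items()}
    total = sum(counts.values())
    n_docs = len(counts)
    return {doc_id: (c + 1) / (total + n_docs) for doc_id, c in counts.items()}

# Toy social data for three films (made-up numbers).
collection = {
    "film_a": {"comment": 30, "tweet": 10, "like": 120},
    "film_b": {"comment": 5, "like": 300, "bookmark": 2},
    "film_c": {},
}

popularity = property_priors(collection, POPULARITY)
reputation = property_priors(collection, REPUTATION)
print(popularity)
print(reputation)
```

Smoothing matters here because many resources have no social activity at all; without it their prior would be zero and they could never be retrieved once the prior is multiplied into the score.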
19. 4.4 Results: Single Priors and Combination Priors
[Figure: three bar charts reporting P@10, P@20, nDCG and MAP.]
-Baselines (topical models): Lucene Solr, ML.Hiemstra.
-Results of individual integration of social signals: Like, Share, Comment, Tweet, Mention +1, Bookmark, Share (LIn); group label "Facebook signals".
-Different combinations of social signals (social properties): Popularity, Reputation, All Criteria, All Properties.
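For readers unfamiliar with the measures named in these charts, the sketch below computes P@k and MAP from their standard definitions with binary relevance. It is not the evaluation script used in the experiments; the function names and the toy ranking are hypothetical, and nDCG is left out to keep the example short.

```python
def precision_at_k(ranked, relevant, k):
    """Fraction of the top-k ranked documents that are relevant."""
    return sum(1 for doc_id in ranked[:k] if doc_id in relevant) / k

def average_precision(ranked, relevant):
    """Mean of the precision values at each rank where a relevant document appears,
    divided by the total number of relevant documents for the query."""
    hits = 0
    total = 0.0
    for rank, doc_id in enumerate(ranked, start=1):
        if doc_id in relevant:
            hits += 1
            total += hits / rank
    return total / len(relevant) if relevant else 0.0

def mean_average_precision(runs):
    """MAP over a list of (ranked list, relevant set) pairs, one per query."""
    return sum(average_precision(r, rel) for r, rel in runs) / len(runs)

# Toy query: two of the three relevant documents are retrieved.
ranked = ["d3", "d1", "d7", "d2", "d5"]
relevant = {"d1", "d2", "d9"}

p_at_2 = precision_at_k(ranked, relevant, 2)
ap = average_precision(ranked, relevant)
print(p_at_2, ap)
```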
20. 4.4 Results: Impact of the Freshness
[Figure: three bar charts reporting P@10, P@20, nDCG and MAP.]
-Baselines (topical models): Lucene Solr, ML.Hiemstra.
-Without integration of freshness: Share, Comment, Share+Comment, Popularity, All Criteria, All Properties.
-With integration of freshness (bars marked F): Share, Comment, Share+Comment, Popularity, All Criteria, All Properties.
22. 4.6 Results: Ranking Correlation Analysis
Fig. 1. Spearman correlation between social signals and relevance
Fig. 2. Spearman correlation between social properties and relevance
23. 4.6 Results: Ranking Correlation Analysis
Fig. 3. Spearman's rho correlation values for the social signal pairs
The social signal pairs (tweet, share (LIn)), (bookmark, tweet) and (mention +1, bookmark) are highly correlated, i.e., the correlation values of these pairs are higher than 0.70.
Bookmark and share (LIn) are the least important criteria, followed by mention +1.
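Spearman's rho, the statistic behind these correlation values, is the Pearson correlation computed on ranks. A self-contained sketch follows; it is not the analysis code of this work, the function names are hypothetical, and the toy vectors are made up. Tied values receive the average of their ranks, and the function is undefined (division by zero) when one input is constant.

```python
def average_ranks(values):
    """1-based ranks in ascending order; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def spearman_rho(x, y):
    """Pearson correlation of the rank vectors of x and y."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)

# Toy example: tweet counts against bookmark counts for five resources.
tweets = [12, 40, 3, 150, 27]
bookmarks = [1, 9, 0, 30, 4]
print(spearman_rho(tweets, bookmarks))
```

The same function can be applied to a signal vector and a vector of relevance labels to reproduce the kind of signal-to-relevance comparison shown in the figures.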
24. 5. Conclusion
•Social Information Retrieval based on Language Model
-Topical relevance (retrieval model based on content only).
-Social relevance (retrieval model based on content and social features).
•Experimental Evaluation
-Superiority of the proposed approach compared to textual models (baselines).
-Positive ranking correlation between social signals and relevance.
-Attribute selection algorithms.
•Perspectives
-Integration of other social features.
-Further study on the impact of the temporal property.
-Comparison of the proposed models with other social models.
-Experimental evaluation on other types of datasets.