47. Ranking Schemes
Proximity between keyword nodes
EASE:
XRank:
w is the smallest text window in n that contains all search keywords
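As an illustration of the window w defined above, the following is a minimal sketch that finds the shortest span of tokens containing every search keyword, using a sliding window. The function name `smallest_window` and the token-list representation of a node's text are assumptions made for this example; how the window size is turned into a proximity score is left to the individual ranking scheme and is not shown.

```python
from collections import Counter


def smallest_window(tokens, keywords):
    """Return (start, end) token indices of the shortest span that
    contains every keyword at least once, or None if no span exists."""
    need = set(keywords)
    if not need:
        return None
    have = Counter()   # keyword occurrences inside the current window
    covered = 0        # number of distinct keywords currently in the window
    best = None
    left = 0
    for right, tok in enumerate(tokens):
        if tok in need:
            have[tok] += 1
            if have[tok] == 1:
                covered += 1
        # Shrink from the left while the window still covers all keywords.
        while covered == len(need):
            if best is None or right - left < best[1] - best[0]:
                best = (left, right)
            dropped = tokens[left]
            if dropped in need:
                have[dropped] -= 1
                if have[dropped] == 0:
                    covered -= 1
            left += 1
    return best


# Hypothetical node text, for illustration only.
text = "xml keyword search ranks xml nodes by keyword proximity".split()
window = smallest_window(text, {"xml", "keyword"})
```

Each token enters and leaves the window at most once, so the scan is linear in the length of the node's text.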
2012-4-10
SIGMOD09 Tutorial 50
48. Ranking Schemes
Based on graph structure
BANKS
Nodes:
Edges:
PageRank-like methods
XRank [Guo et al, SIGMOD03]
ObjectRank [Balmin et al, VLDB04]: considers both Global ObjectRank and Keyword-specific ObjectRank
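The difference between a global and a keyword-specific PageRank-like score can be sketched as a power iteration whose random jumps land only in a chosen base set: all nodes for the global score, or only the nodes containing the keyword for the keyword-specific score. This is a simplified illustration, not the ObjectRank implementation: the function name `object_rank`, the damping value 0.85, and the uniform split of authority across outgoing edges are assumptions (ObjectRank itself assigns authority transfer rates per edge type).

```python
def object_rank(nodes, edges, base_set, damping=0.85, iterations=50):
    """Power iteration where random jumps go only to nodes in base_set.

    Simplifying assumption: each node splits its authority evenly over
    its outgoing edges; nodes without outgoing edges pass nothing on.
    """
    base_set = set(base_set)
    if not base_set:
        raise ValueError("base_set must not be empty")
    out = {n: [] for n in nodes}
    for src, dst in edges:
        out[src].append(dst)
    jump = {n: (1.0 / len(base_set) if n in base_set else 0.0) for n in nodes}
    rank = dict(jump)
    for _ in range(iterations):
        nxt = {n: (1.0 - damping) * jump[n] for n in nodes}
        for src in nodes:
            if out[src]:
                share = damping * rank[src] / len(out[src])
                for dst in out[src]:
                    nxt[dst] += share
        rank = nxt
    return rank


# Hypothetical toy graph: two papers p1 and p3 both point to p2.
nodes = ["p1", "p2", "p3"]
edges = [("p1", "p2"), ("p3", "p2")]

global_rank = object_rank(nodes, edges, base_set=nodes)    # jumps go anywhere
keyword_rank = object_rank(nodes, edges, base_set={"p1"})  # only p1 contains the keyword
```

In the toy graph, p3 receives no score in the keyword-specific run because it neither contains the keyword nor is reachable from a node that does, while p2 still collects authority from p1.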
49. Ranking Schemes
$$\mathrm{Score}(n, Q) = \sum_{w \in Q \cap n} \frac{1 + \ln(1 + \ln(tf))}{(1 - s) + s \cdot dl / avdl} \cdot \ln \frac{N + 1}{df}$$
TF*IDF based:
Discover/EASE
[Liu et al, SIGMOD06]
SPARK
but not at the node level
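The TF*IDF score on this slide can be sketched in a few lines. The slide does not define its symbols, so the usual reading is assumed here: tf is the frequency of keyword w in node n, dl is the length of n, avdl is the average node length, N is the total number of nodes, df is the number of nodes containing w, and s is a normalization constant. The function name `tfidf_score`, its parameter names, and the default s = 0.2 are choices made for this example.

```python
import math


def tfidf_score(query, node_tf, dl, avdl, df, n_docs, s=0.2):
    """Sum, over query keywords that occur in the node, of
    normalized term frequency times inverse document frequency."""
    score = 0.0
    for w in set(query):
        tf = node_tf.get(w, 0)
        if tf == 0 or df.get(w, 0) == 0:
            continue  # only keywords in both Q and n contribute
        ntf = 1.0 + math.log(1.0 + math.log(tf))   # dampened term frequency
        ndl = (1.0 - s) + s * dl / avdl            # length normalization
        idf = math.log((n_docs + 1) / df[w])       # inverse document frequency
        score += ntf / ndl * idf
    return score


# Hypothetical statistics, for illustration only.
score = tfidf_score(
    query=["xml", "ranking"],
    node_tf={"xml": 1},      # "ranking" does not occur in this node
    dl=100, avdl=100,        # node of exactly average length
    df={"xml": 1, "ranking": 3},
    n_docs=9,
)
```

With tf = 1 and dl = avdl, both the term-frequency and length factors equal 1, so the score reduces to ln((N + 1) / df) for the single matching keyword.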
50. Relevance Models
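[Figure: a relevance model generates the query words q1 = israeli, q2 = palestinian, q3 = raids through models M; the next word w is unknown (w = ???) and is estimated from P(q | w) and P(w).]
Sample probabilities:
P(w|Q)   w
.077     palestinian
.055     israel
.034     jerusalem
.033     protest
.027     raid
.011     clash
.010     bank
.010     west
.010     troop
$$P(w \mid q_1 \dots q_k) = \frac{P(w)}{P(q_1 \dots q_k)} \underbrace{\prod_{q} \sum_{M} P(q \mid M)\, P(M \mid w)}_{P(q_1 \dots q_k \mid w)}$$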
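The estimate P(w | q1...qk) can be illustrated with a small sketch that multiplies P(w) by the product over query words of the sum over models of P(q | M) P(M | w), then normalizes over the vocabulary (which plays the role of dividing by P(q1...qk)). Several things here are assumptions for the example: the function name `relevance_model`, the representation of each model M as a word-probability dictionary, a uniform prior over models, P(M | w) obtained by Bayes' rule, and the absence of smoothing. The toy models are invented and do not reproduce the probabilities on the slide.

```python
def relevance_model(query, models):
    """Estimate P(w | q1..qk) for every word w in the models' vocabulary.

    models maps a model name to a dict {word: P(word | M)}.
    Assumes a uniform prior P(M) and no smoothing.
    """
    names = list(models)
    prior = 1.0 / len(names)
    vocab = set()
    for m in names:
        vocab.update(models[m])
    joint = {}
    for w in vocab:
        # P(w) = sum over M of P(w | M) P(M)
        p_w = sum(models[m].get(w, 0.0) * prior for m in names)
        if p_w == 0.0:
            continue
        value = p_w
        for q in query:
            # P(q | w) = sum over M of P(q | M) P(M | w), with
            # P(M | w) = P(w | M) P(M) / P(w) by Bayes' rule.
            value *= sum(
                models[m].get(q, 0.0) * models[m].get(w, 0.0) * prior / p_w
                for m in names
            )
        joint[w] = value
    total = sum(joint.values())  # equals P(q1..qk) under this model
    if total == 0.0:
        return {}
    return {w: v / total for w, v in joint.items()}


# Hypothetical toy models, for illustration only.
toy_models = {
    "M1": {"raids": 0.5, "clash": 0.5},
    "M2": {"bank": 0.5, "loan": 0.5},
}
estimate = relevance_model(["raids"], toy_models)
```

In the toy example, "clash" receives probability mass even though it is not a query word, because it co-occurs with "raids" in the same model; "bank" and "loan" receive none.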
Editor's notes
Top-K queries are a long-studied topic in the database and information retrieval communities. The main objective of these queries is to return the K highest-ranked answers quickly and efficiently. A Top-K query returns the subset of most relevant answers, instead of ALL answers, for two reasons:
i) to minimize the cost associated with retrieving all answers (e.g., disk, network, etc.);
ii) to maximize the quality of the answer set, so that the user is not overwhelmed with irrelevant results.