1. Moving from Description to Prediction for Information Searching Jim Jansen College of Information Sciences and Technology The Pennsylvania State University [email_address] Information searching: actions (behavioral, affective, and cognitive) employed by people when interacting with an information system Information Searching
2.
3. What is the information context that we are facing?
9. Analysis of Search Marketplace Holding fairly stable over the last year or so
10. Analysis of Online Traffic Long tail for online traffic (i.e., a few sites with a lot of traffic and a whole bunch will little traffic)
11.
12.
13.
14. Illustration of Probabilistic User Modeling Using n-grams Given these states … … how accurately can we predict these? AC 5 A 4 ABCDE 3 ABCDE 2 ABCF 1 Search State Transitions User 40% D C 60% B A 100% E CD 66% D BC 1OO% C AB Accuracy Next State? Predictive Pattern
15.
16. Search engine logs as an information stream (voluminous, temporal, and multi-dimensional) Server Server Server Search Engines Servers Search Attributes Time 0 th period 1 st period … period n th period General Idea Information searching is a temporal stream (i.e., stateless)
17. Search Engine Logs – viewed as a temporal stream (i.e., stateless, with volume, mass, momentum, and acceleration) Search Engines Servers Search Attributes Time 0 th period 1 st period … period n th period General Idea What if, based on what has happened in the past in the temporal stream, … we could predict what is going to happen in the future?
18.
19.
20.
21.
22. Thank you! (open for questions and further discussion) Jim Jansen College of Information Sciences and Technology The Pennsylvania State University [email_address]
23. Search Logs has some common fields, such as time, queries, results, etc. We can enrich the log with additional fields. Back