An Introduction to Information Retrieval and Applications
2. Homework assignments and programming exercises: ~40%
Mid-term exam: ~25%
Term project: ~35%
Including proposal, presentation, and final report
3. About 3 programming exercises
Team-based (at most 2 persons per team)
You can either write your own code or reuse existing open-source code
The term project
Either team-based system development (the same as for the programming exercises)
Or academic paper presentation
Only one person per team allowed
A proposal is *required* before the midterm (Apr. 11, 2014)
4. The score you get depends on the functions, difficulty, and quality of your project
For system development:
System functions and correctness
For academic paper presentation
Quality and your presentation of the paper
Major methods/experimental results *must* be presented
Papers from top conferences are strongly suggested
E.g. SIGIR, WWW, CIKM, WSDM, JCDL, ICMR, …
Proposals are *required* for each team and will be counted in the score
5. Submission instructions
Programs, project proposals, and project reports must be submitted to the TA as electronic files online at:
Submissions website: (TBD)
Before submission:
User name: Your student ID
Please change your default password at your first login
6. This course will NOT tell you
The tips and tricks of using search engines, although power users might have better ideas on how to improve them
There are plenty of books and websites on that…
How to find books in libraries, although it is somewhat related to the basic IR concepts
How to make money on the Web, although the currently largest search engine has done so
7. Things that you have been doing all day!
Searching for something interesting: Web, news, e-mail, image, video, …
Asking for advice
…
User interests are changing all the time…
2011: New Zealand Earthquake
2012: Jeremy Lin
2013: Meteor in Russia
2014: ? (next slide)
19. Meteor (流星)
Comet (彗星)
Meteorite (隕石)
Russia (俄羅斯)
Earth (地球)
…
And other languages…
And other search engines…
And social websites…
27. “Information retrieval is a field concerned with the structure, analysis, organization, storage, searching, and retrieval of information.” (Salton, 1968)
28. Information retrieval (IR): a research field that aims at searching for information in text and multimedia documents effectively and efficiently
In this course, we will introduce the basic text and query models in IR, retrieval evaluation, indexing and searching, and applications of IR
31. Text IR
Indexing and searching
Query languages and operations
Retrieval evaluation
Modeling
Boolean model
Vector space model
Probabilistic model
Applications for IR
Multimedia IR
Web search
Digital libraries
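To make the vector space model listed above more concrete, here is a minimal sketch in Python. It is an illustration only, not the course's reference implementation: the whitespace tokenizer, the `log(N/df)` IDF variant, and the function names `build_tfidf`, `cosine`, and `rank` are all assumptions made for this example.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: lowercase + whitespace split stands in for real tokenization.
    return text.lower().split()


def build_tfidf(docs):
    """Return (idf, vectors): an IDF table and one sparse TF-IDF dict per document."""
    tokenized = [tokenize(d) for d in docs]
    n = len(docs)
    df = Counter()
    for tokens in tokenized:
        df.update(set(tokens))  # count each term once per document
    idf = {t: math.log(n / df[t]) for t in df}
    vectors = [
        {t: tf * idf[t] for t, tf in Counter(tokens).items()}
        for tokens in tokenized
    ]
    return idf, vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank(query, docs):
    """Return ids of documents with non-zero similarity, best match first."""
    idf, vectors = build_tfidf(docs)
    q = {t: tf * idf.get(t, 0.0) for t, tf in Counter(tokenize(query)).items()}
    scored = [(cosine(q, v), i) for i, v in enumerate(vectors)]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [i for score, i in scored if score > 0.0]


docs = [
    "meteor hits russia",
    "jeremy lin scores again",
    "russia meteor video meteor",
]
print(rank("meteor russia", docs))  # [2, 0]
```

Document 2 ranks above document 0 here because it mentions "meteor" twice, while the unrelated document 1 scores zero and is dropped.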
32. Basics in IR (focus)
Inverted indexes for boolean queries (Ch.1-5)
Term weighting and vector space model (Ch. 6-7)
Evaluation in IR (Ch. 8)
Advanced Topics
Relevance feedback (Ch. 9)
XML retrieval (Ch. 10)
Probabilistic IR (Ch. 11)
Language models (Ch. 12)
Machine learning in IR (useful)
Text classification (Ch. 13-15)
Document clustering (Ch. 16-18)
Web Search
Web crawling and indexes (Ch. 19-20)
Link analysis (Ch. 21)
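As a preview of the first topic in this outline, inverted indexes for Boolean queries, the following Python sketch builds a tiny index and answers an AND query by merging sorted postings lists. It is a simplified illustration under stated assumptions: documents are plain strings, tokenization is a lowercase whitespace split, and the names `build_index`, `intersect`, and `boolean_and` are hypothetical helpers invented for this example.

```python
from collections import defaultdict


def build_index(docs):
    """Map each term to a sorted list of ids of the documents containing it."""
    index = defaultdict(set)
    for doc_id, text in enumerate(docs):
        for term in text.lower().split():
            index[term].add(doc_id)
    return {term: sorted(ids) for term, ids in index.items()}


def intersect(p1, p2):
    """Merge two sorted postings lists, keeping only ids present in both."""
    i = j = 0
    result = []
    while i < len(p1) and j < len(p2):
        if p1[i] == p2[j]:
            result.append(p1[i])
            i += 1
            j += 1
        elif p1[i] < p2[j]:
            i += 1
        else:
            j += 1
    return result


def boolean_and(index, terms):
    """Answer `t1 AND t2 AND ...`, processing the shortest postings lists first."""
    postings = sorted((index.get(t.lower(), []) for t in terms), key=len)
    if not postings:
        return []
    result = postings[0]
    for p in postings[1:]:
        result = intersect(result, p)
    return result


docs = [
    "new zealand earthquake",
    "jeremy lin",
    "meteor russia",
    "russia earthquake news",
]
index = build_index(docs)
print(boolean_and(index, ["russia", "earthquake"]))  # [3]
```

Processing the shortest postings list first keeps the intermediate result small, which is a common heuristic for conjunctive queries.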
33. Text mining
Machine Learning
Natural Language Processing
Social Network Analysis
…
34. Cross-language IR
Image, video, and multimedia IR
Speech retrieval
Music retrieval
User interfaces
Parallel, distributed, and P2P IR
Digital libraries
Information science perspective
Logic-based approaches to IR
Natural language processing techniques
…
35. Before midterm
Boolean retrieval (1 wk)
Indexing (2 wks)
Vector space model and evaluation (2 wks)
Relevance feedback (1 wk)
Probabilistic IR (2 wks)
After midterm
Text classification (1-2 wks)
Document clustering (1-2 wks)
Web search (2 wks)
Advanced topics: CLIR, IE, … (2 wks)
Term Project Presentation (3 wks)
36. Wikipedia page on Information Retrieval: http://en.wikipedia.org/wiki/Information_retrieval
Information Retrieval Resources: http://www-csli.stanford.edu/~hinrich/information-retrieval.html
37. Journals
ACM TOIS: Transactions on Information Systems
JASIST: Journal of the American Society for Information Science and Technology
IP&M: Information Processing and Management
IEEE TKDE: Transactions on Knowledge and Data Engineering
Conferences
ACM SIGIR: International ACM SIGIR Conference on Research and Development in Information Retrieval
WWW: World Wide Web Conference
ACM CIKM: Conference on Information and Knowledge Management
JCDL: ACM/IEEE Joint Conference on Digital Libraries
ACM WSDM: International Conference on Web Search and Data Mining
TREC: Text Retrieval Conference
38. Slides and lectures will be offered mainly in English
To help domestic students follow along, important concepts will be briefly summarized in Chinese