By
SATHISHKUMAR G
(sathishsak111@gmail.com)
- Homework assignments and programming exercises: ~40%
- Mid-term exam: ~25%
- Term project: ~35%
  - Including proposal, presentation, and final report
- About 3 programming exercises
  - Team-based (at most 2 persons per team)
  - You can either write your own code or reuse existing open source code
- The term project
  - Either team-based system development (the same as programming exercises)
  - Or academic paper presentation
    - Only one person per team allowed
  - A proposal is *required* before midterm (Apr. 11, 2014)
- The score you get depends on the functions, difficulty, and quality of your project
  - For system development:
    - System functions and correctness
  - For academic paper presentation:
    - Quality of the paper and of your presentation of it
    - Major methods/experimental results *must* be presented
    - Papers from top conferences are strongly suggested
      - E.g. SIGIR, WWW, CIKM, WSDM, JCDL, ICMR, …
- Proposals are *required* for each team, and will be counted in the score
- Submission instructions
  - Programs, project proposals, and project reports must be submitted as electronic files to the TA online at:
    - Submissions website: (TBD)
  - Before submission:
    - User name: your student ID
    - Please change your default password at your first login
- This course will NOT tell you
  - The tips and tricks of using search engines, although power users might have better ideas on how to improve them
    - There are plenty of books and websites on that…
  - How to find books in libraries, although it is somewhat related to the basic IR concepts
  - How to make money on the Web, although the currently largest search engine did it
- Things that you have been doing all day!
  - Searching for something interesting: Web, news, e-mail, image, video, …
  - Asking for advice
  - …
- User interests are changing all the time…
  - 2011: New Zealand Earthquake
  - 2012: Jeremy Lin
  - 2013: Meteor Russia
  - 2014: ? (next slide)
An Introduction to Information Retrieval and Applications
- Blast
- Explosion
- Chelyabinsk
- Asteroid 2012 DA14
- …
- 流星 (meteor)
- 彗星 (comet)
- 隕石 (meteorite)
- 俄羅斯 (Russia)
- 地球 (Earth)
- …
- And other languages…
- And other search engines…
- And social websites…
- “Information retrieval is a field concerned with the structure, analysis, organization, storage, searching, and retrieval of information.” (Salton, 1968)
- Information retrieval (IR): a research field that aims to search information in text and multimedia documents effectively and efficiently
- In this course, we will introduce the basic text and query models in IR, retrieval evaluation, indexing and searching, and applications of IR
[Figure: architecture of an IR system. Components: User Interface, Text Operations, Query Expansion, Indexing, Retrieval, Ranking, Inverted Index, Document Collection. Labeled flows: user need, user feedback, text, query, logical view, doc representation, inverted file, retrieved docs, ranked docs.]
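
To make the components in the diagram concrete, here is a minimal sketch of the text operations, indexing, and retrieval steps. The toy documents, the simple tokenizer, and the function names are assumptions made for illustration; ranking, query expansion, and user feedback are left out.

```python
import re
from collections import defaultdict

def text_operations(text):
    """Lowercase the text and split it into alphanumeric tokens (a very simple tokenizer)."""
    return re.findall(r"[a-z0-9]+", text.lower())

def build_inverted_index(docs):
    """Indexing: map each term to the sorted list of IDs of documents containing it."""
    postings = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text_operations(text):
            postings[term].add(doc_id)
    return {term: sorted(ids) for term, ids in postings.items()}

def boolean_and(index, query):
    """Retrieval: return the IDs of documents that contain every query term."""
    terms = text_operations(query)
    if not terms:
        return []
    result = set(index.get(terms[0], []))
    for term in terms[1:]:
        result &= set(index.get(term, []))
    return sorted(result)

# Hypothetical toy document collection, for illustration only.
docs = {
    1: "Meteor explodes over Chelyabinsk, Russia",
    2: "Asteroid 2012 DA14 passes close to Earth",
    3: "Russia meteor blast injures hundreds",
}
index = build_inverted_index(docs)
print(boolean_and(index, "meteor Russia"))  # → [1, 3]
```

The query passes through the same text operations as the documents, so "Russia" in the query matches "russia" in the index.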
- Text IR
  - Indexing and searching
  - Query languages and operations
  - Retrieval evaluation
  - Modeling
    - Boolean model
    - Vector space model
    - Probabilistic model
- Applications for IR
  - Multimedia IR
  - Web search
  - Digital libraries
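
As a rough preview of the vector space model listed above, the sketch below ranks documents by the cosine similarity between tf-idf vectors. The weighting scheme (raw term frequency times log(N/df)), the whitespace tokenizer, and the toy documents are assumptions chosen for brevity; the course may use other variants.

```python
import math
from collections import Counter

def tokenize(text):
    """Lowercase and split on whitespace."""
    return text.lower().split()

def tfidf_vectors(docs):
    """Return (vectors, idf): one sparse tf-idf dict per document, plus the idf table."""
    tokenized = [tokenize(d) for d in docs]
    n = len(docs)
    # Document frequency: number of documents in which each term appears.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    idf = {term: math.log(n / count) for term, count in df.items()}
    vectors = [
        {term: tf * idf[term] for term, tf in Counter(tokens).items()}
        for tokens in tokenized
    ]
    return vectors, idf

def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)

def rank(query, docs):
    """Return document positions ordered from most to least similar to the query."""
    vectors, idf = tfidf_vectors(docs)
    q = {t: tf * idf.get(t, 0.0) for t, tf in Counter(tokenize(query)).items()}
    scores = [(cosine(q, vec), i) for i, vec in enumerate(vectors)]
    return [i for _, i in sorted(scores, key=lambda s: (-s[0], s[1]))]

# Hypothetical toy collection, for illustration only.
docs = [
    "meteor blast in russia",
    "asteroid passes earth",
    "russia meteor meteor explosion",
]
print(rank("meteor russia", docs))  # → [2, 0, 1]
```

Unlike the Boolean model, which only says whether a document matches, this produces a graded ordering: the third document ranks first because the query terms carry a larger share of its vector.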
- Basics in IR (focus)
  - Inverted indexes for Boolean queries (Ch. 1-5)
  - Term weighting and vector space model (Ch. 6-7)
  - Evaluation in IR (Ch. 8)
- Advanced topics
  - Relevance feedback (Ch. 9)
  - XML retrieval (Ch. 10)
  - Probabilistic IR (Ch. 11)
  - Language models (Ch. 12)
- Machine learning in IR (useful)
  - Text classification (Ch. 13-15)
  - Document clustering (Ch. 16-18)
- Web search
  - Web crawling and indexes (Ch. 19-20)
  - Link analysis (Ch. 21)
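
For the evaluation topic above, a minimal sketch of the two most common set-based measures, precision and recall, may help. The document IDs and relevance judgments are hypothetical, and ranked-list measures are not covered here.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for a set of retrieved and a set of relevant document IDs."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)  # relevant documents that were actually retrieved
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall

# Hypothetical system output and relevance judgments, for illustration only.
p, r = precision_recall(retrieved=[1, 2, 3, 4], relevant=[2, 4, 5])
print(f"precision={p:.2f} recall={r:.2f}")  # → precision=0.50 recall=0.67
```

Here two of the four retrieved documents are relevant (precision 2/4), and two of the three relevant documents were found (recall 2/3).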
- Text mining
- Machine learning
- Natural language processing
- Social network analysis
- …
- Cross-language IR
- Image, video, and multimedia IR
- Speech retrieval
- Music retrieval
- User interfaces
- Parallel, distributed, and P2P IR
- Digital libraries
- Information science perspective
- Logic-based approaches to IR
- Natural language processing techniques
- …
- Before midterm
  - Boolean retrieval (1 wk)
  - Indexing (2 wks)
  - Vector space model and evaluation (2 wks)
  - Relevance feedback (1 wk)
  - Probabilistic IR (2 wks)
- After midterm
  - Text classification (1-2 wks)
  - Document clustering (1-2 wks)
  - Web search (2 wks)
  - Advanced topics: CLIR, IE, … (2 wks)
  - Term project presentation (3 wks)
- Wikipedia page on Information Retrieval: http://en.wikipedia.org/wiki/Information_retrieval
- Information Retrieval Resources: http://www-csli.stanford.edu/~hinrich/information-retrieval.html
- Journals
  - ACM TOIS: Transactions on Information Systems
  - JASIST: Journal of the American Society for Information Science and Technology
  - IP&M: Information Processing and Management
  - IEEE TKDE: Transactions on Knowledge and Data Engineering
- Conferences
  - ACM SIGIR: International Conference on Research and Development in Information Retrieval
  - WWW: World Wide Web Conference
  - ACM CIKM: Conference on Information and Knowledge Management
  - JCDL: ACM/IEEE Joint Conference on Digital Libraries
  - ACM WSDM: International Conference on Web Search and Data Mining
  - TREC: Text REtrieval Conference
- Slides and lectures will be offered mainly in English
- To help domestic students follow along, important concepts will be briefly summarized in Chinese