Term Project

CS 359
Document Indexing and Retrieval
IR System
• Spider (Nat)
• Tokenization (Klang)
• GUI (Ploy)
• Searching/Scoring (Job)
Spider
• CyberNeko HTML
• Groovy
Spider (cont.)

• Link Gathering (Link Collection)
• HTML Unescape
• Download Page
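
The link gathering and HTML unescape steps listed above can be sketched roughly as follows. This is only an illustration, not the project's spider: the slides name CyberNeko HTML as the parser, whereas this sketch substitutes a simplified regular expression so that it stays self-contained. It is written in plain Java, and the names LinkGatherer, gatherLinks, and unescapeHtml are hypothetical. The download step is left out because it needs network access.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper; a regex stands in for a real HTML parser such as CyberNeko.
class LinkGatherer {
    // Matches href="..." or href='...' inside an <a> tag (simplified, not full HTML).
    private static final Pattern HREF = Pattern.compile(
        "<a\\b[^>]*?\\bhref\\s*=\\s*([\"'])(.*?)\\1",
        Pattern.CASE_INSENSITIVE | Pattern.DOTALL);

    // Replaces a handful of common entities; a real spider would cover many more.
    static String unescapeHtml(String s) {
        return s.replace("&lt;", "<")
                .replace("&gt;", ">")
                .replace("&quot;", "\"")
                .replace("&#39;", "'")
                .replace("&amp;", "&"); // last, so "&amp;lt;" becomes "&lt;" and not "<"
    }

    // Collects unique href values in document order, unescaping each one.
    static List<String> gatherLinks(String html) {
        List<String> links = new ArrayList<>();
        Matcher m = HREF.matcher(html);
        while (m.find()) {
            String link = unescapeHtml(m.group(2).trim());
            if (!link.isEmpty() && !links.contains(link)) {
                links.add(link);
            }
        }
        return links;
    }

    public static void main(String[] args) {
        String html = "<a href=\"http://example.com/a?x=1&amp;y=2\">A</a>"
                    + "<A HREF='/b'>B</A><a href=\"/b\">dup</a>";
        System.out.println(gatherLinks(html)); // [http://example.com/a?x=1&y=2, /b]
    }
}
```

Unescaping matters here because URLs inside HTML attributes usually carry "&amp;" where the real URL has "&"; following the raw attribute value would request the wrong address.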
Link Gathering
Tokenization
• http://sansarn.com/lexto/
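
The slide gives only the LexTo link and does not say how the tokenizer works. Thai is written without spaces between words, so a common approach is dictionary-based longest matching. The sketch below illustrates that general idea under the assumption that it applies here. It is not LexTo's API: the class LongestMatchTokenizer and the toy English dictionary are hypothetical, and ASCII text is used only to keep the example easy to read.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Hypothetical illustration of greedy longest-match word segmentation.
class LongestMatchTokenizer {
    private final Set<String> dictionary;
    private final int maxWordLength;

    LongestMatchTokenizer(Set<String> dictionary) {
        this.dictionary = dictionary;
        // The longest dictionary entry bounds how far ahead each scan looks.
        this.maxWordLength = Math.max(1,
            dictionary.stream().mapToInt(String::length).max().orElse(1));
    }

    List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        int pos = 0;
        while (pos < text.length()) {
            int end = Math.min(text.length(), pos + maxWordLength);
            // Shrink the candidate until it is a dictionary word or one character long.
            while (end > pos + 1 && !dictionary.contains(text.substring(pos, end))) {
                end--;
            }
            tokens.add(text.substring(pos, end)); // unknown text falls back to one character
            pos = end;
        }
        return tokens;
    }

    public static void main(String[] args) {
        Set<String> dict = Set.of("the", "cat", "cats", "sat", "at");
        System.out.println(new LongestMatchTokenizer(dict).tokenize("thecatsat"));
        // [the, cats, at]
    }
}
```

The example output shows the known weakness of the greedy strategy: it picks "cats" + "at" even though "cat" + "sat" may be what the writer meant.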
GUI
Why Groovy?

• Nearly a superset of Java (most Java code is also valid Groovy)
• More concise than Java
  .....
Groovy Example

• println 0..6
    ===> [0,1,2,3,4,5,6]
• [0,1,2,3,4,5,6,7,8,9].findAll{it%2==0}
    ===> [0, 2, 4, 6, 8]
• println "http://www.google.com".toURL().text
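
For comparison, here is roughly what the same three operations look like in plain Java. This is a sketch to show the difference in length, not code from the project. The class name GroovyVsJava and its method names are hypothetical, and fetch assumes the page is UTF-8 encoded. The main method does not call fetch, so the example runs without a network connection.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.URI;
import java.nio.charset.StandardCharsets;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

// Hypothetical Java counterparts of the Groovy one-liners above.
class GroovyVsJava {
    // Counterpart of the Groovy range 0..6 (inclusive at both ends).
    static List<Integer> range(int from, int to) {
        return IntStream.rangeClosed(from, to).boxed().collect(Collectors.toList());
    }

    // Counterpart of list.findAll{it%2==0}.
    static List<Integer> evens(List<Integer> numbers) {
        return numbers.stream().filter(n -> n % 2 == 0).collect(Collectors.toList());
    }

    // Counterpart of "http://...".toURL().text; assumes UTF-8 content.
    static String fetch(String url) throws IOException {
        try (InputStream in = URI.create(url).toURL().openStream()) {
            return new String(in.readAllBytes(), StandardCharsets.UTF_8);
        }
    }

    public static void main(String[] args) {
        System.out.println(range(0, 6));           // [0, 1, 2, 3, 4, 5, 6]
        System.out.println(evens(range(0, 9)));    // [0, 2, 4, 6, 8]
    }
}
```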