Blog Post and Comment Extraction Using Information Quantity of Web Format
4th Asia Information Retrieval Symposium, AIRS 2008
Reporter: Che-Min Liao
Framework of Blog Extraction
Locating Main Text
Important Features of the Main Text
Locating Main Text Algorithm
Example
Finding Separator
Finding Separator Algorithm
Experiment
Corpus Distribution
Experiment Result
Performance comparison of three algorithms for locating main text
Performance of the finding separator algorithm

20081009 meeting