SlideShare una empresa de Scribd logo
1 de 18
CS-690 Fall-2009 Querying Capability Modelling and Construction of Deep Web Sources Paper by: Liangcai Shu, Weiyi Meng, Hai Fe, and Clement Yu Presentation by: Fabian Alenius
Problem definition ,[object Object]
Information in Deep Web is 400-550 times larger than the surface web
Crawlers need to extract
information
Problem definition ,[object Object]
A form submission consists of a set of values, one for every field of the form
Querying Capability ,[object Object]
An attribute is a field in a form
A query is a set of attributes
A set query instances are a set of attribute and value pairs, e.g. {<Author, Tolkien>, <Title, Bilbo>}
Four types of attributes Functional Attribute Range Attribute Categorical Attribute Value-Infinite Attribute
Conditional Dependency ,[object Object]
Conditional dependency – value of  C  insignificant if  A  is null
A -> C
Attribute Unions ,[object Object]
T is not attribute type, but value type (e.g. phone, zip code, etc.)
Valid queries  – accepted;  Invalid queries  – rejected

Más contenido relacionado

Similar a Querying Capability Modeling

Exploiting web search engines to search structured
Exploiting web search engines to search structuredExploiting web search engines to search structured
Exploiting web search engines to search structured
Nita Pawar
 
Ado.Net Data Services (Astoria)
Ado.Net Data Services (Astoria)Ado.Net Data Services (Astoria)
Ado.Net Data Services (Astoria)
Igor Moochnick
 
Breaking down data silos with the open data protocol
Breaking down data silos with the open data protocolBreaking down data silos with the open data protocol
Breaking down data silos with the open data protocol
Woodruff Solutions LLC
 
Letting In the Light: Using Solr as an External Search Component
Letting In the Light: Using Solr as an External Search ComponentLetting In the Light: Using Solr as an External Search Component
Letting In the Light: Using Solr as an External Search Component
Jay Luker
 
Linq Sanjay Vyas
Linq   Sanjay VyasLinq   Sanjay Vyas
Linq Sanjay Vyas
rsnarayanan
 
Hibernate Session 4
Hibernate Session 4Hibernate Session 4
Hibernate Session 4
b_kathir
 

Similar a Querying Capability Modeling (20)

Table Recognition
Table RecognitionTable Recognition
Table Recognition
 
Web clustring engine
Web clustring engineWeb clustring engine
Web clustring engine
 
Fasterflect
FasterflectFasterflect
Fasterflect
 
Exploiting web search engines to search structured
Exploiting web search engines to search structuredExploiting web search engines to search structured
Exploiting web search engines to search structured
 
Linq
LinqLinq
Linq
 
Linq
LinqLinq
Linq
 
Building Semantic Web Portals with WebML
Building Semantic Web Portals with WebMLBuilding Semantic Web Portals with WebML
Building Semantic Web Portals with WebML
 
Ado.Net Data Services (Astoria)
Ado.Net Data Services (Astoria)Ado.Net Data Services (Astoria)
Ado.Net Data Services (Astoria)
 
Topic Maps: Theory & Practice
Topic Maps: Theory & PracticeTopic Maps: Theory & Practice
Topic Maps: Theory & Practice
 
Breaking down data silos with the open data protocol
Breaking down data silos with the open data protocolBreaking down data silos with the open data protocol
Breaking down data silos with the open data protocol
 
Letting In the Light: Using Solr as an External Search Component
Letting In the Light: Using Solr as an External Search ComponentLetting In the Light: Using Solr as an External Search Component
Letting In the Light: Using Solr as an External Search Component
 
Ellerslie User Group - ReST Presentation
Ellerslie User Group - ReST PresentationEllerslie User Group - ReST Presentation
Ellerslie User Group - ReST Presentation
 
Introducing AWS AppSync: serverless data driven apps with real-time and offli...
Introducing AWS AppSync: serverless data driven apps with real-time and offli...Introducing AWS AppSync: serverless data driven apps with real-time and offli...
Introducing AWS AppSync: serverless data driven apps with real-time and offli...
 
Lerman Vvs14 Ef Tips And Tricks
Lerman Vvs14  Ef Tips And TricksLerman Vvs14  Ef Tips And Tricks
Lerman Vvs14 Ef Tips And Tricks
 
Structured Document Search and Retrieval
Structured Document Search and RetrievalStructured Document Search and Retrieval
Structured Document Search and Retrieval
 
Understanding the Entity API Module
Understanding the Entity API ModuleUnderstanding the Entity API Module
Understanding the Entity API Module
 
Simple Data Binding
Simple Data BindingSimple Data Binding
Simple Data Binding
 
devLink - What's New in C# 4?
devLink - What's New in C# 4?devLink - What's New in C# 4?
devLink - What's New in C# 4?
 
Linq Sanjay Vyas
Linq   Sanjay VyasLinq   Sanjay Vyas
Linq Sanjay Vyas
 
Hibernate Session 4
Hibernate Session 4Hibernate Session 4
Hibernate Session 4
 

Último

Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Safe Software
 
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
?#DUbAI#??##{{(☎️+971_581248768%)**%*]'#abortion pills for sale in dubai@
 
Architecting Cloud Native Applications
Architecting Cloud Native ApplicationsArchitecting Cloud Native Applications
Architecting Cloud Native Applications
WSO2
 

Último (20)

Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodPolkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
 
Apidays Singapore 2024 - Scalable LLM APIs for AI and Generative AI Applicati...
Apidays Singapore 2024 - Scalable LLM APIs for AI and Generative AI Applicati...Apidays Singapore 2024 - Scalable LLM APIs for AI and Generative AI Applicati...
Apidays Singapore 2024 - Scalable LLM APIs for AI and Generative AI Applicati...
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
 
Navi Mumbai Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Navi Mumbai Call Girls 🥰 8617370543 Service Offer VIP Hot ModelNavi Mumbai Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Navi Mumbai Call Girls 🥰 8617370543 Service Offer VIP Hot Model
 
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
 
Architecting Cloud Native Applications
Architecting Cloud Native ApplicationsArchitecting Cloud Native Applications
Architecting Cloud Native Applications
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
 
MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024
 
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ..."I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
A Beginners Guide to Building a RAG App Using Open Source Milvus
A Beginners Guide to Building a RAG App Using Open Source MilvusA Beginners Guide to Building a RAG App Using Open Source Milvus
A Beginners Guide to Building a RAG App Using Open Source Milvus
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
 
Ransomware_Q4_2023. The report. [EN].pdf
Ransomware_Q4_2023. The report. [EN].pdfRansomware_Q4_2023. The report. [EN].pdf
Ransomware_Q4_2023. The report. [EN].pdf
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?
 

Querying Capability Modeling

Notas del editor

  1. blabla
  2. Functional Attribute – only affects the display of results Range Attribute – specifies a range, for example price or year Categorical Attribute – fixed set of distinct value, usually selections lists Value-Infinite Attribute – Infinite number of possible values, typically textboxes Value-Infinite can be categorized into more types, like phone-number, zip code, etc. Not addressed in this paper
  3. Enough to fill in one field
  4. A query is deemed accepted even if it returns no results Goal is find the types of queries that are accepted
  5. First extract HTML form and output XML Feed into Query Capability Analyzer Submit Queries and get result Output is a set of atomic queries
  6. Recall is lower than precision Method is conservative, if no strong evidence the system treats as rejected