Qedia – Natural Language Queries on DBPedia

       Andreea-Georgiana Zbranca, Diana Andreea Gorea, Lucian Bentea

        Faculty of Computer Science, “A.I. Cuza” University, Iași, Romania



       Abstract. In this paper we present an application that allows users to
       query DBPedia through natural language, which is more intuitive than
       plain SPARQL.


1     Introduction
We present an application that translates natural language phrases, which
conform to a certain basic grammar, into SPARQL queries that are then run on
the DBPedia knowledge base. For tagging the parts of speech of the phrase, we
have used a lexical analyzer implemented by Ian Barber and available at
http://phpir.com/part-of-speech-tagging. The syntactic analysis is performed
with respect to a basic grammar that we describe in the following section. The
resulting parse tree can also be interpreted as the RDF graph corresponding to
the given phrase. Furthermore, we allow three types of phrases to be used as a
natural language query, which we also describe below: one which is missing the
subject, one which is missing the object and one which is missing both. Based
on these categories of phrases, we automatically generate the corresponding
SPARQL queries, which we run on the DBPedia end-point. A further SPARQL query
is used to obtain statistics about the nouns in the phrase.
    To increase flexibility, the graphical interface has been implemented in two
versions: a Web page version using the Zend framework, which requires Apache
or a similar local server to be running, and a Desktop version using the
PHP-GTK 2 library. To run the queries from within PHP, we used the ARC
library, which is freely available to download from http://arc.semsol.org/.
The results returned by each query are displayed both in tabular and in text
form, along with other statistics. The main RDF vocabularies used by DBPedia
are automatically included with each SPARQL query.
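
Including the vocabularies with each query amounts to prepending PREFIX
declarations. The following is a minimal sketch in Python (the application
itself is written in PHP). The dbpedia2, dbpedia and skos prefixes are the ones
that appear in the ARC example of Section 2.4; the default ":" prefix is an
assumption here, taken to point at DBPedia resources, and the name
with_prefixes is hypothetical.

```python
# Prefixes from the ARC example in this paper, plus the default ":" prefix,
# which is ASSUMED here to be the DBPedia resource namespace.
DEFAULT_PREFIXES = {
    "": "http://dbpedia.org/resource/",
    "dbpedia2": "http://dbpedia.org/property/",
    "dbpedia": "http://dbpedia.org/",
    "skos": "http://www.w3.org/2004/02/skos/core#",
}


def with_prefixes(query, prefixes=DEFAULT_PREFIXES):
    """Return the query with one PREFIX declaration per vocabulary prepended."""
    header = "\n".join(
        f"PREFIX {name}: <{iri}>" for name, iri in prefixes.items()
    )
    return header + "\n\n" + query.strip()


print(with_prefixes(
    "SELECT ?abstract WHERE { :Guitar dbpedia2:abstract ?abstract }"
))
```

Keeping the prefixes in a single table means that the generated queries can stay
short and that a vocabulary can be added in one place.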


2     Parsing a Phrase
2.1   Algorithm
The query is given in natural language, and the sentence is transformed into
RDF triples of the form Subject-Predicate-Object. Once the parts of the sentence
are identified, the natural language query can be transformed into a SPARQL
query. As input we get a phrase and as output we obtain three arrays: nouns
(which also holds adjectives and adverbs), parents (corresponding to the parse
tree of the grammar) and verbs that connect the nouns. The first step is to
obtain the parts of speech of the phrase; the second is to determine the part
of sentence of each word and build the three arrays.
    To build this parser we first used an algorithm already implemented by Ian
Barber. This system uses a corpus with words hand-tagged for part of speech.
Some examples of tags are: NN for noun, VB for verb, VBD for verb past
tense, JJ for adjective. In his code we removed some words that are unnecessary
in the following steps. For example, we removed the word “the”, which is a
determiner for nouns. The output of this algorithm is the phrase tagged with
its parts of speech, e.g.
Input: The quick brown fox jumped over the lazy dog.

Output: The/DT quick/JJ brown/JJ fox/NN
        jumped/VBD over/IN the/DT lazy/JJ dog/NN.
According to the algorithm, the tagger was trained by analysing a corpus and
noting the frequencies of the different tags for a given word. More information,
along with the algorithm that we used for this step, can be found at
http://phpir.com/part-of-speech-tagging.
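
To illustrate the idea of choosing tags by frequency, the sketch below trains a
very small tagger in Python. It is not Ian Barber's implementation: it only
keeps the most frequent tag seen for each word and falls back to NN for unknown
words. The toy corpus, the function names and the NN fallback are all
assumptions made for this example.

```python
from collections import Counter, defaultdict


def train(tagged_corpus):
    """Map each word to the tag it carries most often in the corpus."""
    counts = defaultdict(Counter)
    for word, tag in tagged_corpus:
        counts[word.lower()][tag] += 1
    return {word: tags.most_common(1)[0][0] for word, tags in counts.items()}


def tag(phrase, lexicon, default="NN"):
    """Tag each word of the phrase; unknown words get the default tag."""
    words = phrase.rstrip(".?!").split()
    return [(word, lexicon.get(word.lower(), default)) for word in words]


# Hypothetical hand-tagged corpus; "brown" occurs as JJ twice and NN once.
CORPUS = [
    ("the", "DT"), ("quick", "JJ"), ("brown", "JJ"), ("fox", "NN"),
    ("jumped", "VBD"), ("over", "IN"), ("lazy", "JJ"), ("dog", "NN"),
    ("brown", "NN"), ("brown", "JJ"),
]

lexicon = train(CORPUS)
tagged = tag("The quick brown fox jumped over the lazy dog.", lexicon)
print(" ".join(f"{word}/{t}" for word, t in tagged))
```

A lookup of this kind ignores context, so a word that can be both a noun and a
verb always receives its more frequent tag.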
    In the next step we take as input the phrase tagged by Ian Barber's
algorithm and we produce the three arrays described above. To parse the phrase
we used a simple grammar and built the parse tree of the phrase. As a general
structure, all our valid phrases must conform to the following basic grammar:
Prop = Beg S P C
Beg = What | What does | What do
S = noun | S P.Atr
C = noun | adjective | adverb | C P.Atr
P.Atr = that P C
P = verb
where the terminals are What, What does, What do, noun, adjective, adverb, verb
and everything else is a non-terminal. An example of a phrase that conforms to
this grammar is the following:
What animal that has the color that is gray eats leaves
that belong to the species that is Eucalyptus?
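
Because the only recursion in this grammar is the repeated “that verb
complement” attribute, conformance can be checked with a regular expression
over word categories. The Python sketch below does this under a few
assumptions: tags follow the Penn style used by the tagger (NN*, JJ*, RB*,
VB*), words of any other category such as “the” or “to” are dropped first, and
the one-letter category codes and function names are hypothetical.

```python
import re

# B = Beg, N = noun, A = adjective or adverb, V = verb, T = the word "that".
# Prop = Beg S P C, where S and C may each carry a chain of "that P C".
PATTERN = re.compile(r"B N(?: T V [NA])* V [NA](?: T V [NA])*")


def category(word, tag):
    """Map a tagged word to a category code, or None if it is ignored."""
    lowered = word.lower()
    if lowered == "what":
        return "B"
    if lowered == "that":
        return "T"
    if tag.startswith("VB"):
        return "V"
    if tag.startswith("NN"):
        return "N"
    if tag.startswith(("JJ", "RB")):
        return "A"
    return None


def conforms(tagged):
    """Check a list of (word, tag) pairs against the basic grammar."""
    items = list(tagged)
    # "What does" and "What do" count as a single Beg symbol.
    if (len(items) > 1 and items[0][0].lower() == "what"
            and items[1][0].lower() in ("do", "does")):
        del items[1]
    codes = [category(word, tag) for word, tag in items]
    return PATTERN.fullmatch(" ".join(c for c in codes if c)) is not None


EXAMPLE = [
    ("What", "WP"), ("animal", "NN"), ("that", "WDT"), ("has", "VBZ"),
    ("the", "DT"), ("color", "NN"), ("that", "WDT"), ("is", "VBZ"),
    ("gray", "JJ"), ("eats", "VBZ"), ("leaves", "NNS"), ("that", "WDT"),
    ("belong", "VBP"), ("to", "TO"), ("the", "DT"), ("species", "NNS"),
    ("that", "WDT"), ("is", "VBZ"), ("Eucalyptus", "NNP"),
]
print(conforms(EXAMPLE))
```

The check only says whether a phrase is well formed; it does not build the
parse tree.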
The parse tree that we aim to generate is basically the RDF graph of this phrase
and is depicted in Figure 1.

                         animal
                        /      \
                     has        eats
                      |          |
                    color      leaves
                      |          |
                     is        belong
                      |          |
                    gray       species
                                 |
                                 is
                                 |
                             Eucalyptus

Fig. 1. RDF graph (parse tree) for the phrase: What animal that has the color that is
gray eats leaves that belong to the species that is Eucalyptus?

     We take the tagged phrase and remove all line breaks from the tags. We then
build an array of pairs of the form (word, tag). After that we check each tag
and, if it denotes a noun, adjective or adverb, we add the word to our first
array, which contains only nouns, adjectives and adverbs. In the same way we
obtain the array of verbs.
     To build the parents array we go through the elements one by one and check
whether they are root nodes. When we find the root we search for the predicate
and split the phrase into two subtrees. According to our grammar, the predicate
lies between the root and the other subtree. If the phrase does not have a
subject, we put the symbol * in the first position of the array. If the phrase
does not have an object, we put the symbol # in the last position. In each
subtree we check step by step whether a noun is followed by the word “that” and
a verb; if so, the child of this noun is the first noun after that verb. The
parent of the root is 0. When we form the verbs array we find the verb that
lies between each child and its parent and put it into the array. In the first
position we put 0, because it corresponds to the root.
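
The construction of the three arrays can be sketched as follows. This is a
Python illustration written for this text, not the PHP code of the application.
It assumes Penn-style tags, drops determiners and prepositions, attaches every
“that verb noun” group to the noun immediately before it, and uses * for a
missing subject. The # marker for a missing object is left out, and all names
are hypothetical.

```python
SKIPPED_TAGS = {"DT", "IN", "TO", "."}  # determiners, prepositions, "to", punctuation


def is_nominal(tag):
    return tag.startswith(("NN", "JJ", "RB"))


def is_verb(tag):
    return tag.startswith("VB")


def parse_phrase(tagged):
    """Return (nouns, parents, verbs) for a list of (word, tag) pairs."""
    tokens = list(tagged)
    # Drop the opening "What", "What does" or "What do".
    if tokens and tokens[0][0].lower() == "what":
        tokens = tokens[1:]
        if tokens and tokens[0][0].lower() in ("do", "does"):
            tokens = tokens[1:]
    # Keep "that" even when the tagger marks it as a preposition (IN).
    tokens = [(w, t) for w, t in tokens
              if w.lower() == "that" or t not in SKIPPED_TAGS]

    nouns, parents, verbs = [], [], []

    def add(word, parent, verb):
        nouns.append(word)
        parents.append(parent)
        verbs.append(verb)

    def read_attributes(i, owner):
        """Consume 'that VERB NOMINAL' groups, nesting each under the last one."""
        while (i + 2 < len(tokens) and tokens[i][0].lower() == "that"
               and is_verb(tokens[i + 1][1]) and is_nominal(tokens[i + 2][1])):
            add(tokens[i + 2][0], owner, tokens[i + 1][0])
            owner = tokens[i + 2][0]
            i += 3
        return i

    i = 0
    if tokens and is_nominal(tokens[0][1]):
        root, i = tokens[0][0], 1
    else:
        root = "*"  # the phrase has no subject
    add(root, 0, 0)
    i = read_attributes(i, root)

    if i >= len(tokens) or not is_verb(tokens[i][1]):
        raise ValueError("expected the main verb of the phrase")
    predicate = tokens[i][0]
    i += 1
    if i < len(tokens) and is_nominal(tokens[i][1]):
        add(tokens[i][0], root, predicate)
        read_attributes(i + 1, tokens[i][0])
    return nouns, parents, verbs


phrase = [
    ("What", "WP"), ("animal", "NN"), ("that", "WDT"), ("has", "VBZ"),
    ("the", "DT"), ("color", "NN"), ("that", "WDT"), ("is", "VBZ"),
    ("gray", "JJ"), ("eats", "VBZ"), ("leaves", "NNS"), ("that", "WDT"),
    ("belong", "VBP"), ("to", "TO"), ("the", "DT"), ("species", "NNS"),
    ("that", "WDT"), ("is", "VBZ"), ("Eucalyptus", "NNP"),
]
for name, array in zip(("nouns", "parents", "verbs"), parse_phrase(phrase)):
    print(f"{name:>8}:", array)
```

Each position in the three arrays describes one node of the tree: the word, the
word it hangs from, and the verb on the edge between them.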

2.2     Accepted Types of Phrases
In order to test our project we used three types of phrases that can be
translated into SPARQL queries:
 1. “What [property] has [subject]?”
      translated into:
      SELECT ?property WHERE {
        :[subject] dbpedia2:[property] ?property
      }
      For example, the phrase “What abstract has Guitar?” generates the following
      parse arrays:
      nouns-array:       abstract     guitar
          parents:       0            abstract
            verbs:       0            has
      and is translated into the SPARQL query:
      SELECT ?abstract WHERE {
        :Guitar dbpedia2:abstract ?abstract
      }
 2. “What has [property] that is [object]?”
      translated into:
      SELECT ?subject WHERE {
        ?subject dbpedia2:[property] "[object]"@en
      }
      For example, the phrase “What has name that is animal?” generates the
      following parse arrays:
      nouns-array:       *   name     animal
          parents:       0   *        name
            verbs:       0   has      is
      and is translated into the SPARQL query:
      SELECT ?subject WHERE {
        ?subject dbpedia2:name "Animal"@en
      }
 3. “What has [property]?”
      translated into:
      SELECT ?subject ?object WHERE {
        ?subject dbpedia2:[property] ?object
      }
      For example, the phrase “What has regnum?” generates the following parse
      arrays:
      nouns-array:       *   regnum
          parents:       0   *
            verbs:       0   has
      and is translated into the SPARQL query:
      SELECT ?subject ?object WHERE {
        ?subject dbpedia2:regnum ?object
      }
      In this case, where both the subject and the object are missing, it is
      advisable to put a limit on the number of results returned by DBPedia,
      using the LIMIT keyword, as in:

      SELECT ?subject ?object WHERE {
        ?subject dbpedia2:regnum ?object
      }
      LIMIT 20
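
The mapping from the nouns array to the three query templates can be sketched
in Python as below. The function name build_query, the capitalisation of
resource names and the default limit of 20 are assumptions made for this
illustration. Words are inserted into the query without any escaping, which is
acceptable only because the grammar restricts them to single tagged words.

```python
def build_query(nouns, limit=20):
    """Build the SPARQL query for one of the three accepted phrase types."""

    def resource(word):
        # DBPedia resource names start with a capital letter, e.g. "Guitar".
        return word[:1].upper() + word[1:]

    if nouns[0] != "*" and len(nouns) == 2:
        # Type 1: "What [property] has [subject]?"
        prop, subject = nouns
        return (f"SELECT ?{prop} WHERE {{\n"
                f"  :{resource(subject)} dbpedia2:{prop} ?{prop}\n"
                f"}}")
    if nouns[0] == "*" and len(nouns) == 3:
        # Type 2: subject missing, object given as a literal.
        _, prop, obj = nouns
        return (f'SELECT ?subject WHERE {{\n'
                f'  ?subject dbpedia2:{prop} "{resource(obj)}"@en\n'
                f'}}')
    if nouns[0] == "*" and len(nouns) == 2:
        # Type 3: subject and object both missing, so cap the result size.
        prop = nouns[1]
        return (f"SELECT ?subject ?object WHERE {{\n"
                f"  ?subject dbpedia2:{prop} ?object\n"
                f"}}\n"
                f"LIMIT {limit}")
    raise ValueError("phrase does not match any of the three accepted types")


for arrays in (["abstract", "guitar"], ["*", "name", "animal"], ["*", "regnum"]):
    print(build_query(arrays), end="\n\n")
```

Only the nouns array is needed here, because in all three phrase types the
connecting verbs are “has” and “is”.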


2.3     Statistics

In order to obtain statistics, we go through the list of all nouns in the given
phrase and for each noun X we query the number of languages in which its
corresponding abstract data is translated, using:

SELECT COUNT DISTINCT ?abstract
WHERE {
  :X dbpedia2:abstract ?abstract
}
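
Generating one such query per noun can be sketched in Python as follows. The
sketch skips the * and # placeholders and writes the aggregate in the SPARQL
1.1 form, (COUNT(DISTINCT ?abstract) AS ?languages), which differs from the
form shown above. The variable name ?languages and the function name are
assumptions.

```python
def statistics_queries(nouns):
    """Return a dict mapping each real noun to its abstract-count query."""
    queries = {}
    for noun in nouns:
        if noun in ("*", "#"):  # placeholders for a missing subject or object
            continue
        resource = noun[:1].upper() + noun[1:]
        queries[noun] = (
            "SELECT (COUNT(DISTINCT ?abstract) AS ?languages)\n"
            "WHERE {\n"
            f"  :{resource} dbpedia2:abstract ?abstract\n"
            "}"
        )
    return queries


for noun, query in statistics_queries(["*", "name", "animal"]).items():
    print(f"-- {noun}\n{query}\n")
```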


2.4     ARC Queries

The following example shows how SPARQL queries can be made from within
PHP using the ARC library, which we have also used in our application.

include_once('./arc/ARC2.php');

$ssp = ARC2::getSPARQLScriptProcessor();

// define the script
$scr = '
ENDPOINT <http://dbpedia.org/sparql>

PREFIX dbpedia2: <http://dbpedia.org/property/>
PREFIX dbpedia: <http://dbpedia.org/>
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>

$results = SELECT * WHERE {
  ?episode skos:subject
    <http://dbpedia.org/resource/Category:The_Simpsons_episodes%2C_season_12>.
  ?episode dbpedia2:blackboard ?chalkboard_gag.
}
';

// run the script
$ssp->processScript($scr);
// display the results
echo "\n\nQuery results:\n\n";
print_r($ssp->env['vars']['results']['value']);
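
For comparison, the same kind of request can be sent without ARC by using the
standard SPARQL protocol over HTTP. The Python sketch below uses only the
standard library; it assumes that the end-point accepts the query in a query
URL parameter and returns JSON results when asked for
application/sparql-results+json. The function names are hypothetical, and
run_query needs network access, so it is defined but not called here.

```python
import json
import urllib.parse
import urllib.request

ENDPOINT = "http://dbpedia.org/sparql"


def build_request(query, endpoint=ENDPOINT):
    """Prepare a GET request that carries the query as a URL parameter."""
    url = endpoint + "?" + urllib.parse.urlencode({"query": query})
    return urllib.request.Request(
        url, headers={"Accept": "application/sparql-results+json"}
    )


def run_query(query, endpoint=ENDPOINT, timeout=30):
    """Send a SELECT query and return the list of result bindings."""
    request = build_request(query, endpoint)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)["results"]["bindings"]


QUERY = """
PREFIX dbpedia2: <http://dbpedia.org/property/>
SELECT ?subject ?object WHERE {
  ?subject dbpedia2:regnum ?object
}
LIMIT 20
"""
print(build_request(QUERY).full_url)
```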


3    Conclusions and Future Developments

We presented a preliminary version of an application that allows users to query
DBPedia using basic natural language phrases. Several features can be improved
and new ones can be added. For instance, the basic grammar that we use to
create the parse tree can be made more complex. Also, the three types of
phrases that we allow as natural language queries can be made more complex and
closer to everyday speech – they sound rather artificial at the moment.
    Another feature that can be added is the ability to query several
end-points, not just DBPedia. The main problem is that each end-point may come
with its own set of vocabularies, apart from the well-known skos, foaf, rdfs,
etc. Thus, further knowledge of each end-point is necessary before implementing
natural language queries that can be run on it.
    As a last remark, a larger lexicon can be used in order to improve the
lexical analysis step. Also, the graphical interface can be made more user
friendly as the previously mentioned features are implemented.


References
1. The ARC open-source RDF system at http://arc.semsol.org.
2. Ian Barber’s part-of-speech lexical analyzer, freely available at
   http://phpir.com/part-of-speech-tagging.
3. The DBPedia Wiki at http://dbpedia.org/About.
4. The SPARQL online query interface on DBPedia, at http://dbpedia.org/snorql.

Más contenido relacionado

La actualidad más candente

Question Answering with Lydia
Question Answering with LydiaQuestion Answering with Lydia
Question Answering with LydiaJae Hong Kil
 
Semantic web assignment 2
Semantic web assignment 2Semantic web assignment 2
Semantic web assignment 2BarryK88
 
Semantic web final assignment
Semantic web final assignmentSemantic web final assignment
Semantic web final assignmentBarryK88
 
Pairtrees for object storage
Pairtrees for object storagePairtrees for object storage
Pairtrees for object storageJohn Kunze
 
Semantic web assignment 3
Semantic web assignment 3Semantic web assignment 3
Semantic web assignment 3BarryK88
 
A Semantic Multimedia Web (Part 2)
A Semantic Multimedia Web (Part 2)A Semantic Multimedia Web (Part 2)
A Semantic Multimedia Web (Part 2)Raphael Troncy
 
ProLog (Artificial Intelligence) Introduction
ProLog (Artificial Intelligence) IntroductionProLog (Artificial Intelligence) Introduction
ProLog (Artificial Intelligence) Introductionwahab khan
 
Traits: A New Language Feature for PHP?
Traits: A New Language Feature for PHP?Traits: A New Language Feature for PHP?
Traits: A New Language Feature for PHP?Stefan Marr
 
What’s in a structured value?
What’s in a structured value?What’s in a structured value?
What’s in a structured value?Andy Powell
 
Prolog (present)
Prolog (present) Prolog (present)
Prolog (present) Melody Joey
 
Ontology In A Nutshell (version 2)
Ontology In A Nutshell (version 2)Ontology In A Nutshell (version 2)
Ontology In A Nutshell (version 2)Fabien Gandon
 
Rdf data-model-and-storage
Rdf data-model-and-storageRdf data-model-and-storage
Rdf data-model-and-storage灿辉 葛
 
Prolog Programming Language
Prolog Programming  LanguageProlog Programming  Language
Prolog Programming LanguageReham AlBlehid
 

La actualidad más candente (20)

Question Answering with Lydia
Question Answering with LydiaQuestion Answering with Lydia
Question Answering with Lydia
 
OpenNLP demo
OpenNLP demoOpenNLP demo
OpenNLP demo
 
NLTK
NLTKNLTK
NLTK
 
Semantic web assignment 2
Semantic web assignment 2Semantic web assignment 2
Semantic web assignment 2
 
Semantic web final assignment
Semantic web final assignmentSemantic web final assignment
Semantic web final assignment
 
Pairtrees for object storage
Pairtrees for object storagePairtrees for object storage
Pairtrees for object storage
 
Semantic web assignment 3
Semantic web assignment 3Semantic web assignment 3
Semantic web assignment 3
 
A Semantic Multimedia Web (Part 2)
A Semantic Multimedia Web (Part 2)A Semantic Multimedia Web (Part 2)
A Semantic Multimedia Web (Part 2)
 
ProLog (Artificial Intelligence) Introduction
ProLog (Artificial Intelligence) IntroductionProLog (Artificial Intelligence) Introduction
ProLog (Artificial Intelligence) Introduction
 
Traits: A New Language Feature for PHP?
Traits: A New Language Feature for PHP?Traits: A New Language Feature for PHP?
Traits: A New Language Feature for PHP?
 
7-Java Language Basics Part1
7-Java Language Basics Part17-Java Language Basics Part1
7-Java Language Basics Part1
 
RDF and OWL
RDF and OWLRDF and OWL
RDF and OWL
 
RDA and the Semantic Web
RDA and the Semantic WebRDA and the Semantic Web
RDA and the Semantic Web
 
What’s in a structured value?
What’s in a structured value?What’s in a structured value?
What’s in a structured value?
 
FIRE2014_IIT-P
FIRE2014_IIT-PFIRE2014_IIT-P
FIRE2014_IIT-P
 
Prolog (present)
Prolog (present) Prolog (present)
Prolog (present)
 
Ontology In A Nutshell (version 2)
Ontology In A Nutshell (version 2)Ontology In A Nutshell (version 2)
Ontology In A Nutshell (version 2)
 
PROLOG: Introduction To Prolog
PROLOG: Introduction To PrologPROLOG: Introduction To Prolog
PROLOG: Introduction To Prolog
 
Rdf data-model-and-storage
Rdf data-model-and-storageRdf data-model-and-storage
Rdf data-model-and-storage
 
Prolog Programming Language
Prolog Programming  LanguageProlog Programming  Language
Prolog Programming Language
 

Similar a Qedia - Natural Language Queries on DBPedia

Semantic Web(Web 3.0) SPARQL
Semantic Web(Web 3.0) SPARQLSemantic Web(Web 3.0) SPARQL
Semantic Web(Web 3.0) SPARQLDaniel D.J. UM
 
Introduction To RDF and RDFS
Introduction To RDF and RDFSIntroduction To RDF and RDFS
Introduction To RDF and RDFSNilesh Wagmare
 
SPARQL introduction and training (130+ slides with exercices)
SPARQL introduction and training (130+ slides with exercices)SPARQL introduction and training (130+ slides with exercices)
SPARQL introduction and training (130+ slides with exercices)Thomas Francart
 
Linking the world with Python and Semantics
Linking the world with Python and SemanticsLinking the world with Python and Semantics
Linking the world with Python and SemanticsTatiana Al-Chueyr
 
Semantic web meetup – sparql tutorial
Semantic web meetup – sparql tutorialSemantic web meetup – sparql tutorial
Semantic web meetup – sparql tutorialAdonisDamian
 
2011 4IZ440 Semantic Web – RDF, SPARQL, and software APIs
2011 4IZ440 Semantic Web – RDF, SPARQL, and software APIs2011 4IZ440 Semantic Web – RDF, SPARQL, and software APIs
2011 4IZ440 Semantic Web – RDF, SPARQL, and software APIsJosef Petrák
 
Semantic web
Semantic webSemantic web
Semantic webtariq1352
 
Extracting Authoring Information Based on Keywords andSemant.docx
Extracting Authoring Information Based on Keywords andSemant.docxExtracting Authoring Information Based on Keywords andSemant.docx
Extracting Authoring Information Based on Keywords andSemant.docxmydrynan
 
A Comparison Between Python APIs For RDF Processing
A Comparison Between Python APIs For RDF ProcessingA Comparison Between Python APIs For RDF Processing
A Comparison Between Python APIs For RDF Processinglucianb
 
An introduction to Semantic Web and Linked Data
An introduction to Semantic Web and Linked DataAn introduction to Semantic Web and Linked Data
An introduction to Semantic Web and Linked DataGabriela Agustini
 
W3C Tutorial on Semantic Web and Linked Data at WWW 2013
W3C Tutorial on Semantic Web and Linked Data at WWW 2013W3C Tutorial on Semantic Web and Linked Data at WWW 2013
W3C Tutorial on Semantic Web and Linked Data at WWW 2013Fabien Gandon
 
An introduction to Semantic Web and Linked Data
An introduction to Semantic  Web and Linked DataAn introduction to Semantic  Web and Linked Data
An introduction to Semantic Web and Linked DataGabriela Agustini
 
Aidan's PhD Viva
Aidan's PhD VivaAidan's PhD Viva
Aidan's PhD VivaAidan Hogan
 

Similar a Qedia - Natural Language Queries on DBPedia (20)

Semantic Web(Web 3.0) SPARQL
Semantic Web(Web 3.0) SPARQLSemantic Web(Web 3.0) SPARQL
Semantic Web(Web 3.0) SPARQL
 
Introduction To RDF and RDFS
Introduction To RDF and RDFSIntroduction To RDF and RDFS
Introduction To RDF and RDFS
 
Sparql
SparqlSparql
Sparql
 
SPARQL introduction and training (130+ slides with exercices)
SPARQL introduction and training (130+ slides with exercices)SPARQL introduction and training (130+ slides with exercices)
SPARQL introduction and training (130+ slides with exercices)
 
Linking the world with Python and Semantics
Linking the world with Python and SemanticsLinking the world with Python and Semantics
Linking the world with Python and Semantics
 
Semantic web meetup – sparql tutorial
Semantic web meetup – sparql tutorialSemantic web meetup – sparql tutorial
Semantic web meetup – sparql tutorial
 
Parser
ParserParser
Parser
 
The Bund language
The Bund languageThe Bund language
The Bund language
 
2011 4IZ440 Semantic Web – RDF, SPARQL, and software APIs
2011 4IZ440 Semantic Web – RDF, SPARQL, and software APIs2011 4IZ440 Semantic Web – RDF, SPARQL, and software APIs
2011 4IZ440 Semantic Web – RDF, SPARQL, and software APIs
 
Semantic web
Semantic webSemantic web
Semantic web
 
RDF briefing
RDF briefingRDF briefing
RDF briefing
 
Extracting Authoring Information Based on Keywords andSemant.docx
Extracting Authoring Information Based on Keywords andSemant.docxExtracting Authoring Information Based on Keywords andSemant.docx
Extracting Authoring Information Based on Keywords andSemant.docx
 
SWT Lecture Session 2 - RDF
SWT Lecture Session 2 - RDFSWT Lecture Session 2 - RDF
SWT Lecture Session 2 - RDF
 
A Comparison Between Python APIs For RDF Processing
A Comparison Between Python APIs For RDF ProcessingA Comparison Between Python APIs For RDF Processing
A Comparison Between Python APIs For RDF Processing
 
NLP and LSA getting started
NLP and LSA getting startedNLP and LSA getting started
NLP and LSA getting started
 
Sparql
SparqlSparql
Sparql
 
An introduction to Semantic Web and Linked Data
An introduction to Semantic Web and Linked DataAn introduction to Semantic Web and Linked Data
An introduction to Semantic Web and Linked Data
 
W3C Tutorial on Semantic Web and Linked Data at WWW 2013
W3C Tutorial on Semantic Web and Linked Data at WWW 2013W3C Tutorial on Semantic Web and Linked Data at WWW 2013
W3C Tutorial on Semantic Web and Linked Data at WWW 2013
 
An introduction to Semantic Web and Linked Data
An introduction to Semantic  Web and Linked DataAn introduction to Semantic  Web and Linked Data
An introduction to Semantic Web and Linked Data
 
Aidan's PhD Viva
Aidan's PhD VivaAidan's PhD Viva
Aidan's PhD Viva
 

Último

Maximizing Impact_ Nonprofit Website Planning, Budgeting, and Design.pdf
Maximizing Impact_ Nonprofit Website Planning, Budgeting, and Design.pdfMaximizing Impact_ Nonprofit Website Planning, Budgeting, and Design.pdf
Maximizing Impact_ Nonprofit Website Planning, Budgeting, and Design.pdfTechSoup
 
DUST OF SNOW_BY ROBERT FROST_EDITED BY_ TANMOY MISHRA
DUST OF SNOW_BY ROBERT FROST_EDITED BY_ TANMOY MISHRADUST OF SNOW_BY ROBERT FROST_EDITED BY_ TANMOY MISHRA
DUST OF SNOW_BY ROBERT FROST_EDITED BY_ TANMOY MISHRATanmoy Mishra
 
5 charts on South Africa as a source country for international student recrui...
5 charts on South Africa as a source country for international student recrui...5 charts on South Africa as a source country for international student recrui...
5 charts on South Africa as a source country for international student recrui...CaraSkikne1
 
How to Add Existing Field in One2Many Tree View in Odoo 17
How to Add Existing Field in One2Many Tree View in Odoo 17How to Add Existing Field in One2Many Tree View in Odoo 17
How to Add Existing Field in One2Many Tree View in Odoo 17Celine George
 
Practical Research 1: Lesson 8 Writing the Thesis Statement.pptx
Practical Research 1: Lesson 8 Writing the Thesis Statement.pptxPractical Research 1: Lesson 8 Writing the Thesis Statement.pptx
Practical Research 1: Lesson 8 Writing the Thesis Statement.pptxKatherine Villaluna
 
How to Solve Singleton Error in the Odoo 17
How to Solve Singleton Error in the  Odoo 17How to Solve Singleton Error in the  Odoo 17
How to Solve Singleton Error in the Odoo 17Celine George
 
UKCGE Parental Leave Discussion March 2024
UKCGE Parental Leave Discussion March 2024UKCGE Parental Leave Discussion March 2024
UKCGE Parental Leave Discussion March 2024UKCGE
 
CapTechU Doctoral Presentation -March 2024 slides.pptx
CapTechU Doctoral Presentation -March 2024 slides.pptxCapTechU Doctoral Presentation -March 2024 slides.pptx
CapTechU Doctoral Presentation -March 2024 slides.pptxCapitolTechU
 
Education and training program in the hospital APR.pptx
Education and training program in the hospital APR.pptxEducation and training program in the hospital APR.pptx
Education and training program in the hospital APR.pptxraviapr7
 
Presentation on the Basics of Writing. Writing a Paragraph
Presentation on the Basics of Writing. Writing a ParagraphPresentation on the Basics of Writing. Writing a Paragraph
Presentation on the Basics of Writing. Writing a ParagraphNetziValdelomar1
 
Patterns of Written Texts Across Disciplines.pptx
Patterns of Written Texts Across Disciplines.pptxPatterns of Written Texts Across Disciplines.pptx
Patterns of Written Texts Across Disciplines.pptxMYDA ANGELICA SUAN
 
Philosophy of Education and Educational Philosophy
Philosophy of Education  and Educational PhilosophyPhilosophy of Education  and Educational Philosophy
Philosophy of Education and Educational PhilosophyShuvankar Madhu
 
The Singapore Teaching Practice document
The Singapore Teaching Practice documentThe Singapore Teaching Practice document
The Singapore Teaching Practice documentXsasf Sfdfasd
 
Patient Counselling. Definition of patient counseling; steps involved in pati...
Patient Counselling. Definition of patient counseling; steps involved in pati...Patient Counselling. Definition of patient counseling; steps involved in pati...
Patient Counselling. Definition of patient counseling; steps involved in pati...raviapr7
 
AUDIENCE THEORY -- FANDOM -- JENKINS.pptx
AUDIENCE THEORY -- FANDOM -- JENKINS.pptxAUDIENCE THEORY -- FANDOM -- JENKINS.pptx
AUDIENCE THEORY -- FANDOM -- JENKINS.pptxiammrhaywood
 
How to Add a New Field in Existing Kanban View in Odoo 17
How to Add a New Field in Existing Kanban View in Odoo 17How to Add a New Field in Existing Kanban View in Odoo 17
How to Add a New Field in Existing Kanban View in Odoo 17Celine George
 
HED Office Sohayok Exam Question Solution 2023.pdf
HED Office Sohayok Exam Question Solution 2023.pdfHED Office Sohayok Exam Question Solution 2023.pdf
HED Office Sohayok Exam Question Solution 2023.pdfMohonDas
 
Benefits & Challenges of Inclusive Education
Benefits & Challenges of Inclusive EducationBenefits & Challenges of Inclusive Education
Benefits & Challenges of Inclusive EducationMJDuyan
 
Diploma in Nursing Admission Test Question Solution 2023.pdf
Diploma in Nursing Admission Test Question Solution 2023.pdfDiploma in Nursing Admission Test Question Solution 2023.pdf
Diploma in Nursing Admission Test Question Solution 2023.pdfMohonDas
 
The basics of sentences session 10pptx.pptx
The basics of sentences session 10pptx.pptxThe basics of sentences session 10pptx.pptx
The basics of sentences session 10pptx.pptxheathfieldcps1
 

Último (20)

Maximizing Impact_ Nonprofit Website Planning, Budgeting, and Design.pdf
Maximizing Impact_ Nonprofit Website Planning, Budgeting, and Design.pdfMaximizing Impact_ Nonprofit Website Planning, Budgeting, and Design.pdf
Maximizing Impact_ Nonprofit Website Planning, Budgeting, and Design.pdf
 
DUST OF SNOW_BY ROBERT FROST_EDITED BY_ TANMOY MISHRA
DUST OF SNOW_BY ROBERT FROST_EDITED BY_ TANMOY MISHRADUST OF SNOW_BY ROBERT FROST_EDITED BY_ TANMOY MISHRA
DUST OF SNOW_BY ROBERT FROST_EDITED BY_ TANMOY MISHRA
 
5 charts on South Africa as a source country for international student recrui...
5 charts on South Africa as a source country for international student recrui...5 charts on South Africa as a source country for international student recrui...
5 charts on South Africa as a source country for international student recrui...
 
How to Add Existing Field in One2Many Tree View in Odoo 17
How to Add Existing Field in One2Many Tree View in Odoo 17How to Add Existing Field in One2Many Tree View in Odoo 17
How to Add Existing Field in One2Many Tree View in Odoo 17
 
Practical Research 1: Lesson 8 Writing the Thesis Statement.pptx
Practical Research 1: Lesson 8 Writing the Thesis Statement.pptxPractical Research 1: Lesson 8 Writing the Thesis Statement.pptx
Practical Research 1: Lesson 8 Writing the Thesis Statement.pptx
 
How to Solve Singleton Error in the Odoo 17
How to Solve Singleton Error in the  Odoo 17How to Solve Singleton Error in the  Odoo 17
How to Solve Singleton Error in the Odoo 17
 
UKCGE Parental Leave Discussion March 2024
UKCGE Parental Leave Discussion March 2024UKCGE Parental Leave Discussion March 2024
UKCGE Parental Leave Discussion March 2024
 
CapTechU Doctoral Presentation -March 2024 slides.pptx
CapTechU Doctoral Presentation -March 2024 slides.pptxCapTechU Doctoral Presentation -March 2024 slides.pptx
CapTechU Doctoral Presentation -March 2024 slides.pptx
 
Education and training program in the hospital APR.pptx
Education and training program in the hospital APR.pptxEducation and training program in the hospital APR.pptx
Education and training program in the hospital APR.pptx
 
Presentation on the Basics of Writing. Writing a Paragraph
Presentation on the Basics of Writing. Writing a ParagraphPresentation on the Basics of Writing. Writing a Paragraph
Presentation on the Basics of Writing. Writing a Paragraph
 
Patterns of Written Texts Across Disciplines.pptx
Patterns of Written Texts Across Disciplines.pptxPatterns of Written Texts Across Disciplines.pptx
Patterns of Written Texts Across Disciplines.pptx
 
Philosophy of Education and Educational Philosophy
Philosophy of Education  and Educational PhilosophyPhilosophy of Education  and Educational Philosophy
Philosophy of Education and Educational Philosophy
 
The Singapore Teaching Practice document
The Singapore Teaching Practice documentThe Singapore Teaching Practice document
The Singapore Teaching Practice document
 
Patient Counselling. Definition of patient counseling; steps involved in pati...
Patient Counselling. Definition of patient counseling; steps involved in pati...Patient Counselling. Definition of patient counseling; steps involved in pati...
Patient Counselling. Definition of patient counseling; steps involved in pati...
 
AUDIENCE THEORY -- FANDOM -- JENKINS.pptx
AUDIENCE THEORY -- FANDOM -- JENKINS.pptxAUDIENCE THEORY -- FANDOM -- JENKINS.pptx
AUDIENCE THEORY -- FANDOM -- JENKINS.pptx
 
How to Add a New Field in Existing Kanban View in Odoo 17
How to Add a New Field in Existing Kanban View in Odoo 17How to Add a New Field in Existing Kanban View in Odoo 17
How to Add a New Field in Existing Kanban View in Odoo 17
 
HED Office Sohayok Exam Question Solution 2023.pdf
HED Office Sohayok Exam Question Solution 2023.pdfHED Office Sohayok Exam Question Solution 2023.pdf
HED Office Sohayok Exam Question Solution 2023.pdf
 
Benefits & Challenges of Inclusive Education
Benefits & Challenges of Inclusive EducationBenefits & Challenges of Inclusive Education
Benefits & Challenges of Inclusive Education
 
Diploma in Nursing Admission Test Question Solution 2023.pdf
Diploma in Nursing Admission Test Question Solution 2023.pdfDiploma in Nursing Admission Test Question Solution 2023.pdf
Diploma in Nursing Admission Test Question Solution 2023.pdf
 
The basics of sentences session 10pptx.pptx
The basics of sentences session 10pptx.pptxThe basics of sentences session 10pptx.pptx
The basics of sentences session 10pptx.pptx
 

Qedia - Natural Language Queries on DBPedia

  • 1. Qedia – Natural Language Queries on DBPedia Andreea-Georgiana Zbranca, Diana Andreea Gorea, Lucian Bentea Faculty of Computer Science, “A.I. Cuza” University, Ia¸i, Romania s Abstract. In this paper we present an application that allows users to query DBPedia through natural language, which is more intuitive than plain SPARQL. 1 Introduction We present an application that is able to translate natural language phrases, which conform to a certain basic grammar, into SPARQL queries that are then run on the DBPedia knowledge base. For tagging the parts of speech of the phrase, we have used a lexical analyzer implemented by Ian Barber and avail- able at http://phpir.com/part-of-speech-tagging. The syntactical analysis is achieved with respect to a basic grammar that we describe in the following section. The resulting parse tree can also be interpreted as the RDF graph cor- responding to the given phrase. Furthermore, there are three types of phrases that we allow to be used as a natural language query, which we also describe below – one which is missing the subject, one which is missing the object and one which is missing both. Based on these categories of phrases, we are able to automatically generate the corresponding SPARQL queries which we run on the DBPedia end-point. In order to obtain further statistics, a SPARQL query has also been used. To increase flexibility, the graphical interface has been implemented in two versions – a Web page version using the Zend framework and requiring Apache or a similar local server to be running, and a Desktop version using the PHP- GTK 2 library. Also, in order to run the queries from within PHP, the ARC library has been used, which is freely available to download from http://arc. semsol.org/. The results returned by each query are displayed both in tabular and in text form, along with other statistics. We also mention that the main RDF vocabularies used by DBPedia are also automatically included with each SPARQL query. 
2 Parsing a Phrase 2.1 Algorithm The query will be in natural language. The sentence will be transformed into an RDF triplet Subject-Predicate-Object. Identifying the parts of sentence, the natural language query can be transformed into a SPARQL query. As input we
  • 2. get a phrase and we obtain three arrays: nouns (meaning also adjectives and adverbs), parents (corresponding to the tree grammar parsing) and verbs that connect the nouns. First step is to obtain the parts of speech of the phrase and after that to find out the part of sentence and build the three arrays. To build this parser we first used an algorithm already implemented by Ian Barber. This system use a corpus, with words hand tagged for part of speech. Some examples of taggers are: NN for noun, VB for verb, VBD for verb past tense, JJ for adjective. In his code I removed some words that are unnecessary in the following steps. For example I removed the word the that is determinant for noun. The output of this algorithm is the phrase with tagged with its parts of speech, e.g. Input: The quick brown fox jumped over the lazy dog. Output: The/DT quick/JJ brown/JJ fox/NN jumped/VBD over/IN the/DT lazy/JJ dog/NN. According to the algorithm, the tagger was trained by analysing a corpus and noting the frequencies of the different tags for a given word. More informations and also the algorithm that we used for this step, can be found at: http:// phpir.com/part-of-speech-tagging. In the next step we have as input the phrase tagged according to the Ian Barber algorithm and we print the three arrays from above. To parse the phrase we used a simple grammar and built the tree parse of the phrase. As a general structure all our valid phrases must conform to the following basic grammar: Prop = Beg S P C Beg = What | What does | What do S = noun | S P.atr C = noun | adjective | adverb | C P.atr P.Atr = that P C P = verb where the terminals are What, What does, What do, noun, adjective, adverb, verb and everything else is a non-terminal. An example of a phrase that conforms to this grammar is the following: What animal that has the color that is gray eats leaves that belong to the species that is Eucalyptus? 
The parse tree that we aim to generate is essentially the RDF graph of this phrase and is depicted in Figure 1. We take the tagged phrase and remove all line breaks from the tags. We then build an array of pairs of the form (word, tag). After that we check each tag and, if it denotes a noun, an adjective or an adverb, we add the word to our first array, which contains only nouns, adjectives and adverbs. In the same way we obtain the array of verbs. To build the parents array we go through the elements one by one and check whether they are root nodes. When we find the root we search for the
predicate and split the phrase into two subtrees. According to our grammar, the predicate lies between the root and the other subtree. If the phrase does not have a subject, we put the symbol * in the first position of the array. If the phrase does not have an object, we put the symbol # in the last position of the array. In each subtree we check step by step whether a noun is followed by the word that and a verb; if so, the child of this noun is the first noun that follows the verb preceded by that. The parent of the root is 0. When we form the verbs array, we determine which verb lies between each child and its parent and put it into the array. In the first position we put 0, because that position corresponds to the root.

    Fig. 1. RDF graph (parse tree) for the phrase: What animal that has the color that is gray eats leaves that belong to the species that is Eucalyptus?

2.2   Accepted Types of Phrases

In order to verify our project we used three types of phrases that can be translated into SPARQL queries:

1. “What [property] has [subject]?” translated into:

    SELECT ?property WHERE { :[subject] dbpedia2:[property] ?property }

   For example, the phrase “What abstract has Guitar?” generates the following parse arrays:
    nouns-array: abstract guitar
    parents:     0 abstract
    verbs:       0 has

   and is translated into the SPARQL query:

    SELECT ?abstract WHERE { :Guitar dbpedia2:abstract ?abstract }

2. “What has [property] [object]?” translated into:

    SELECT ?subject WHERE { ?subject dbpedia2:[property] "[object]"@en }

   For example, the phrase “What has name that is animal?” generates the following parse arrays:

    nouns-array: * name animal
    parents:     0 * name
    verbs:       0 has is

   and is translated into the SPARQL query:

    SELECT ?subject WHERE { ?subject dbpedia2:name "Animal"@en }

3. “What has [property]?” translated into:

    SELECT ?subject ?object WHERE { ?subject dbpedia2:[property] ?object }

   For example, the phrase “What has regnum?” generates the following parse arrays:

    nouns-array: * regnum
    parents:     0 *
    verbs:       0 has

   and is translated into the SPARQL query:

    SELECT ?subject ?object WHERE { ?subject dbpedia2:regnum ?object }
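The three translations above can be sketched as a single function over the nouns array. This is only an illustration under stated assumptions, not the code of the application: the function name buildQuery is hypothetical, the phrase type is inferred solely from the * marker and the length of the array, and capitalising resource and literal names with ucfirst is inferred from the examples (guitar becomes :Guitar, animal becomes "Animal"). Values are inserted without any escaping, so the sketch is not suitable for untrusted input.

```php
<?php
// Hypothetical query builder for the three accepted phrase types.
// $nouns is the nouns array produced by the parser, e.g. ['*', 'name', 'animal'].
function buildQuery(array $nouns): string
{
    $missingSubject = ($nouns[0] ?? null) === '*';

    // Type 1: "What [property] has [subject]?"
    if (!$missingSubject && count($nouns) === 2) {
        [$property, $subject] = $nouns;
        return sprintf(
            'SELECT ?%s WHERE { :%s dbpedia2:%s ?%s }',
            $property, ucfirst($subject), $property, $property
        );
    }

    // Type 2: "What has [property] [object]?"
    if ($missingSubject && count($nouns) === 3) {
        [, $property, $object] = $nouns;
        return sprintf(
            'SELECT ?subject WHERE { ?subject dbpedia2:%s "%s"@en }',
            $property, ucfirst($object)
        );
    }

    // Type 3: "What has [property]?"
    if ($missingSubject && count($nouns) === 2) {
        return sprintf(
            'SELECT ?subject ?object WHERE { ?subject dbpedia2:%s ?object }',
            $nouns[1]
        );
    }

    throw new InvalidArgumentException('Unsupported phrase type');
}
```

For instance, buildQuery(['abstract', 'guitar']) yields the query shown for the first phrase type, and buildQuery(['*', 'regnum']) yields the one shown for the third.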
In the third case, where both the subject and the object are missing, it is advisable to limit the number of results returned by DBPedia, using the LIMIT keyword, as in:

    SELECT ?subject ?object WHERE { ?subject dbpedia2:regnum ?object } LIMIT 20

2.3   Statistics

In order to obtain statistics, we go through the list of all nouns in the given phrase and, for each noun X, we query the number of languages into which its abstract has been translated, using:

    SELECT (COUNT(DISTINCT ?abstract) AS ?count) WHERE { :X dbpedia2:abstract ?abstract }

2.4   ARC Queries

The following example shows how SPARQL queries can be made from within PHP using the ARC library, which we have also used in our application.

    include_once('./arc/ARC2.php');
    $ssp = ARC2::getSPARQLScriptProcessor();

    // define the script
    $scr = '
      ENDPOINT <http://dbpedia.org/sparql>
      PREFIX dbpedia2: <http://dbpedia.org/property/>
      PREFIX dbpedia: <http://dbpedia.org/>
      PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
      $results = SELECT * WHERE {
        ?episode skos:subject <http://dbpedia.org/resource/Category:The_Simpsons_episodes%2C_season_12> .
        ?episode dbpedia2:blackboard ?chalkboard_gag .
      }
    ';

    // run the script
    $ssp->processScript($scr);
    // display the results
    echo "\n\nQuery results:\n\n";
    print_r($ssp->env['vars']['results']['value']);

3     Conclusions and Future Developments

We presented a preliminary version of an application that allows users to query DBPedia using basic natural language phrases. Several features can be improved and new ones can be added. For instance, the basic grammar that we use to create the parse tree can be made more complex. Also, the three types of phrases that we allow as natural language queries can be made more complex and closer to everyday speech, since they sound rather artificial at the moment.

Another feature that can be added is to allow users to query several end-points, not just DBPedia. The main problem is that each end-point may come with its own set of vocabularies, apart from the well-known skos, foaf, rdfs, etc. Thus, further knowledge of each end-point is necessary before implementing natural language queries that can be run on it.

As a last remark, a larger lexicon can be used in order to improve the lexical analysis step. Also, the graphical interface can be made more user-friendly as the previously mentioned features are implemented.

References

1. The ARC open-source RDF system, at http://arc.semsol.org.
2. Ian Barber's part of speech lexical analyzer, freely available at http://phpir.com/part-of-speech-tagging.
3. The DBPedia Wiki, at http://dbpedia.org/About.
4. The SPARQL online query interface on DBPedia, at http://dbpedia.org/snorql.