This document surveys techniques for question answering and relation extraction in natural language processing. It gives an overview of question answering systems and approaches, with examples such as START, Ask Jeeves and Siri, and covers common evaluation metrics like accuracy and mean reciprocal rank. It then turns to relation extraction from text, including hand-written patterns, supervised classification, bootstrapping, distant supervision, and unsupervised (Open IE) methods.
1. Semantic Analysis in Language Technology
http://stp.lingfil.uu.se/~santinim/sais/2016/sais_2016.htm
Relation Extraction
Marina Santini
santinim@stp.lingfil.uu.se
Department of Linguistics and Philology
Uppsala University, Uppsala, Sweden
Spring 2016
3. Question Answering systems
• Factoid questions:
  • Google
  • Wolfram
  • Ask Jeeves
  • START
  • …
• Approaches:
  • IR-based
  • Knowledge-based
  • Hybrid
4. Katz et al. (2006)
http://start.csail.mit.edu/publications/FLAIRS0601KatzB.pdf
• START answers natural language questions by presenting components of text and multi-media information drawn from a set of information resources that are hosted locally or accessed remotely through the Internet.
• START targets high precision in its question answering.
• The START system analyzes English text and produces a knowledge base which incorporates, in the form of nested ternary expressions (= triples), the information found in the text.
5. Is it true? http://uncyclopedia.wikia.com/wiki/Ask_Jeeves
• Ask Jeeves, more correctly known as Ask.com, is a search engine founded in 1996 in California.
• Initially it represented a stereotypical English butler who would "fetch" the answer to any question asked.
• Ask.com is now considered one of the great failures of the internet. The question-and-answer feature simply didn't work as well as hoped, after trying his hand at being both a traditional search engine and a terrible kind of "artificial AI" with a bald spot…
• These days Jeeves is ranked as the 4th most successful search engine on the web, and the 4th most successful overall. This seems impressive until you consider that Google holds the top spot with 95% of the market. It has even fallen behind Bing; enough said.
7. Siri http://en.wikipedia.org/wiki/Siri
• Siri /ˈsɪri/ is an intelligent personal assistant and knowledge navigator which works as an application for Apple Inc.'s iOS.
• The application uses a natural language user interface to answer questions, make recommendations, and perform actions by delegating requests to a set of Web services.
• The software, both in its original version and as an iOS application, adapts to the user's individual language usage and individual searches (preferences) with continuing use, and returns results that are individualized.
• The name Siri is Scandinavian, a short form of the Norse name Sigrid meaning "beauty" and "victory", and comes from the intended name for the original developer's first child.
8. Chatterbots
• Siri… a conversational "safety net".
• Conversational agents (chatter bots and personal assistants) → customer care, customer analytics (replacing/integrating FAQs and help desks)

Avatar: a picture of a person or animal that represents you on a computer screen, for example in some chat rooms or when you are playing games over the Internet.
10. General IR architecture for factoid questions
[Pipeline diagram] Question → Question Processing (Answer Type Detection, Query Formulation) → Query → Document Retrieval (over indexed documents) → Relevant Docs → Passage Retrieval → Answer passages → Answer Processing → Answer
11. Things to extract from the question
• Answer Type Detection
  • Decide the named entity type (person, place) of the answer
• Query Formulation
  • Choose query keywords for the IR system
• Question Type classification
  • Is this a definition question, a math question, a list question?
• Focus Detection
  • Find the question words that are replaced by the answer
• Relation Extraction
  • Find relations between entities in the question
(A minimal rule-based sketch of the first two steps follows.)
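To make answer type detection and query formulation concrete, here is a small, purely illustrative Python sketch. The wh-word rules and the stop-word list are assumptions made for this example, not the approach of any system described in these slides; real systems typically train classifiers for both steps.

```python
# Illustrative rules only: a toy answer-type detector and keyword-based query
# formulator for factoid questions.
import re

STOP_WORDS = {"what", "who", "where", "when", "which", "is", "are", "the",
              "a", "an", "of", "in", "was", "did", "do", "does", "to", "by"}

def detect_answer_type(question):
    """Map wh-words to a coarse expected answer type (hand-written rules)."""
    q = question.lower()
    if q.startswith("who"):
        return "PERSON"
    if q.startswith("where"):
        return "LOCATION"
    if q.startswith("when"):
        return "DATE"
    if re.match(r"how (many|much)", q):
        return "NUMBER"
    return "DEFINITION"          # fallback, e.g. "What is the Hajj?"

def formulate_query(question):
    """Keep content words as keywords for the IR engine."""
    tokens = re.findall(r"[a-z0-9']+", question.lower())
    return [t for t in tokens if t not in STOP_WORDS]

print(detect_answer_type("Where is Mark Twain buried?"))   # LOCATION
print(formulate_query("Where is Mark Twain buried?"))      # ['mark', 'twain', 'buried']
```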
12. Common Evaluation Metrics
1. Accuracy (does the answer match the gold-labeled answer?)
2. Mean Reciprocal Rank:
  • The reciprocal rank of a query response is the inverse of the rank of the first correct answer.
  • The mean reciprocal rank is the average of the reciprocal ranks of results for a sample of queries Q.

  MRR = (1/N) · Σ_{i=1..N} (1/rank_i), where N = |Q| and rank_i is the rank of the first correct answer for query i.
13. Common Evaluation Metrics: MRR
• The mean reciprocal rank is the average of the reciprocal ranks of results for a sample of queries Q.
• (Example adapted from Wikipedia:) the system returns a ranked list of answers for each of three queries, with the top-ranked answer being the one it thinks is most likely correct.
• If the first correct answer appears at rank 3, rank 2 and rank 1 respectively, the mean reciprocal rank is (1/3 + 1/2 + 1)/3 ≈ 0.61.
(A short computational sketch follows.)
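The MRR definition translates almost directly into code. In the sketch below, the function and variable names are mine, and each query is assumed to come with one ranked candidate list and a single gold answer; it reproduces the Wikipedia-style example from the slide.

```python
# A minimal sketch of Mean Reciprocal Rank as defined above.
def mean_reciprocal_rank(ranked_answers, gold):
    total = 0.0
    for candidates, correct in zip(ranked_answers, gold):
        for rank, answer in enumerate(candidates, start=1):
            if answer == correct:
                total += 1.0 / rank   # reciprocal rank of the first correct answer
                break
    return total / len(gold)

# Wikipedia-style example: the correct answers sit at ranks 3, 2 and 1.
ranked_answers = [["catten", "cati", "cats"],
                  ["torii", "tori", "toruses"],
                  ["viruses", "virii", "viri"]]
gold = ["cats", "tori", "viruses"]
print(round(mean_reciprocal_rank(ranked_answers, gold), 2))   # 0.61
```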
14. Complex questions: "What is the 'hajj'?"
• The (bottom-up) snippet method
  • Find a set of relevant documents
  • Extract informative sentences from the documents (using tf-idf, MMR)
  • Order and modify the sentences into an answer
• The (top-down) information extraction method
  • Build specific answerers for different question types:
    • definition questions,
    • biography questions,
    • certain medical questions
16. Architecture for complex question answering: definition questions
"What is the Hajj?" (Ndocs=20, Len=8)
[Pipeline diagram] Document Retrieval (11 Web documents, 1127 total sentences) → Predicate Identification → Data-Driven Analysis (383 Non-Specific Definitional sentences; sentence clusters, importance ordering) → Definition Creation (9 Genus-Species sentences)
Example Genus-Species sentences:
• The Hajj, or pilgrimage to Makkah (Mecca), is the central duty of Islam.
• The Hajj is a milestone event in a Muslim's life.
• The hajj is one of five pillars that make up the foundation of Islam.
• ...
Generated definition:
"The Hajj, or pilgrimage to Makkah [Mecca], is the central duty of Islam. More than two million Muslims are expected to take the Hajj this year. Muslims must perform the hajj at least once in their lifetime if physically and financially able. The Hajj is a milestone event in a Muslim's life. The annual hajj begins in the twelfth month of the Islamic year (which is lunar, not solar, so that hajj and Ramadan fall sometimes in summer, sometimes in winter). The Hajj is a week-long pilgrimage that begins in the 12th month of the Islamic lunar calendar. Another ceremony, which was not connected with the rites of the Ka'ba before the rise of Islam, is the Hajj, the annual pilgrimage to 'Arafat, about two miles east of Mecca, toward Mina…"
S. Blair-Goldensohn, K. McKeown and A. Schlaikjer. 2004. Answering Definition Questions: A Hybrid Approach.
17. State-of-the-art: examples
• Top down:
  • Ming Tan, Cicero dos Santos, Bing Xiang & Bowen Zhou. 2015. LSTM-Based Deep Learning Models for Non-Factoid Answer Selection.
  • Di Wang and Eric Nyberg. 2015. A Long Short-Term Memory Model for Answer Sentence Selection in Question Answering. In ACL 2015.
  • Minwei Feng, Bing Xiang, Michael R. Glass, Lidan Wang, Bowen Zhou. 2015. Applying deep learning to answer selection: A study and an open task.

Deep Learning is a new area of Machine Learning research, said to be very promising. It is about learning multiple levels of representation and abstraction that help to make sense of data such as images, sound, and text. It is based on neural networks.
18. Practical activity
• START seems to be limited, but it understands natural language.
• Google (presumably helped by Knowledge Graph) is more accurate, but skips natural language (uses keywords).
• Google is customized to the users' preferences (different results).
• Interesting outcomes:
  • Currency vs. coin
  • What's love?
  • Lyric/song vs. definition question
19. What's the meaning of life?
• Google
Presumably from Knowledge Graph…
22. Acknowledgements
Most slides borrowed or adapted from:
• Dan Jurafsky and Christopher Manning, Coursera
• Dan Jurafsky and James H. Martin (2015)
J&M (2015, draft): https://web.stanford.edu/~jurafsky/slp3/
24. Extracting relations from text
• Company report: "International Business Machines Corporation (IBM or the company) was incorporated in the State of New York on June 16, 1911, as the Computing-Tabulating-Recording Co. (C-T-R)…"
• Extracted Complex Relation: Company-Founding
  Company: IBM
  Location: New York
  Date: June 16, 1911
  Original-Name: Computing-Tabulating-Recording Co.
• But we will focus on the simpler task of extracting relation triples:
  Founding-year(IBM, 1911)
  Founding-location(IBM, New York)
25. Extracting Relation Triples from Text
"The Leland Stanford Junior University, commonly referred to as Stanford University or Stanford, is an American private research university located in Stanford, California … near Palo Alto, California… Leland Stanford… founded the university in 1891"

Stanford EQ Leland Stanford Junior University
Stanford LOC-IN California
Stanford IS-A research university
Stanford LOC-NEAR Palo Alto
Stanford FOUNDED-IN 1891
Stanford FOUNDER Leland Stanford
26. Why Relation Extraction?
• Create new structured knowledge bases, useful for any app
• Augment current knowledge bases
  • Adding words to the WordNet thesaurus, facts to Freebase or DBpedia
• Support question answering
  • The granddaughter of which actor starred in the movie "E.T."?
    (acted-in ?x "E.T.") (is-a ?y actor) (granddaughter-of ?x ?y)
• But which relations should we extract?
27. Automated Content Extraction (ACE)
Automatic Content Extraction (ACE) is a research program for developing advanced information extraction technologies. Given a text in natural language, the ACE challenge is to detect:
• entities
• relations between entities
• events

ACE relation types ("Relation Extraction Task"):
• PHYSICAL: Located, Near
• PART-WHOLE: Geographical, Subsidiary
• PERSON-SOCIAL: Business, Family, Lasting Personal
• ORG AFFILIATION: Employment, Membership, Ownership, Founder, Student-Alum, Investor, Sports-Affiliation
• GENERAL AFFILIATION: Citizen-Resident-Ethnicity-Religion, Org-Location-Origin
• ARTIFACT: User-Owner-Inventor-Manufacturer
28. Automated Content Extraction (ACE)
• Physical-Located         PER-GPE    He was in Tennessee
• Part-Whole-Subsidiary    ORG-ORG    XYZ, the parent company of ABC
• Person-Social-Family     PER-PER    John's wife Yoko
• Org-AFF-Founder          PER-ORG    Steve Jobs, co-founder of Apple…
30. Extracting UMLS relations from a sentence
"Doppler echocardiography can be used to diagnose left anterior descending artery stenosis in patients with type 2 diabetes"
↓
Echocardiography, Doppler DIAGNOSES Acquired stenosis
31. Databases of Wikipedia Relations
Relations extracted from the Wikipedia Infobox:
  Stanford state California
  Stanford motto "Die Luft der Freiheit weht"
  …
32. Relation databases that draw from Wikipedia
• Resource Description Framework (RDF) triples: subject predicate object
  Golden Gate Park location San Francisco
  dbpedia:Golden_Gate_Park dbpedia-owl:location dbpedia:San_Francisco
• The DBpedia project uses the Resource Description Framework (RDF) to represent the extracted information and consists of 3 billion RDF triples, 580 million extracted from the English edition of Wikipedia and 2.46 billion from other language editions (Wikipedia, March 2016).
• Frequent Freebase relations:
  people/person/nationality, location/location/contains
  people/person/profession, people/person/place-of-birth
  biology/organism_higher_classification
  film/film/genre

DBpedia is a project aiming to extract structured content from the information created as part of the Wikipedia project. Freebase was a large collaborative knowledge base consisting of data composed mainly by its community members (cf. Semantic Web) --> Knowledge Graph: https://en.wikipedia.org/wiki/Freebase
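As a small illustration of the subject-predicate-object format above, the library-free sketch below builds the Golden Gate Park triple and prints it as an N-Triples-like statement. The expanded namespace URIs are given only for illustration; a real application would use an RDF toolkit such as rdflib.

```python
# Library-free illustration of an RDF subject-predicate-object triple.
PREFIXES = {
    "dbpedia": "http://dbpedia.org/resource/",      # illustrative namespace URIs
    "dbpedia-owl": "http://dbpedia.org/ontology/",
}

triple = ("dbpedia:Golden_Gate_Park",   # subject
          "dbpedia-owl:location",       # predicate
          "dbpedia:San_Francisco")      # object

def expand(qname):
    """Expand a prefixed name into a full URI in angle brackets."""
    prefix, local = qname.split(":", 1)
    return "<" + PREFIXES[prefix] + local + ">"

# Print the triple as a single N-Triples-like statement.
print(" ".join(expand(part) for part in triple) + " .")
```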
33. How to build relation extractors
1. Hand-written patterns
2. Supervised machine learning
3. Semi-supervised and unsupervised
  • Bootstrapping (using seeds)
  • Distant supervision
  • Unsupervised learning from the web
35. Rules for extracting the IS-A relation
Early intuition from Hearst (1992)
• "Agar is a substance prepared from a mixture of red algae, such as Gelidium, for laboratory or industrial use"
• What does Gelidium mean?
• How do you know?
37. Hearst's Patterns for extracting IS-A relations
(Hearst, 1992): Automatic Acquisition of Hyponyms
  "Y such as X ((, X)* (, and|or) X)"
  "such Y as X"
  "X or other Y"
  "X and other Y"
  "Y including X"
  "Y, especially X"
38. Hearst's Patterns for extracting IS-A relations

Hearst pattern     Example occurrences
X and other Y      ...temples, treasuries, and other important civic buildings.
X or other Y       Bruises, wounds, broken bones or other injuries...
Y such as X        The bow lute, such as the Bambara ndang...
Such Y as X        ...such authors as Herrick, Goldsmith, and Shakespeare.
Y including X      ...common-law countries, including Canada and England...
Y, especially X    European countries, especially France, England, and Spain...

(A regular-expression sketch of two of these patterns follows.)
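Here is a hedged regular-expression sketch of Hearst-style matching. It covers only two of the patterns from the table, and the crude noun-phrase regex (one or two words) is an assumption made for the example; a real system would match over chunked or parsed text.

```python
# Two of the Hearst patterns, implemented with regular expressions over raw text.
import re

NP = r"[A-Za-z][A-Za-z-]*(?: [A-Za-z][A-Za-z-]*)?"   # very rough noun-phrase stand-in

HEARST_PATTERNS = [
    re.compile(rf"(?P<Y>{NP}),? such as (?P<X>{NP})"),    # "Y such as X"
    re.compile(rf"(?P<X>{NP}) and other (?P<Y>{NP})"),    # "X and other Y"
]

def extract_isa(text):
    """Yield (hyponym, 'IS-A', hypernym) triples found by the patterns."""
    for pattern in HEARST_PATTERNS:
        for m in pattern.finditer(text):
            yield (m.group("X"), "IS-A", m.group("Y"))

sentence = ("Agar is a substance prepared from a mixture of red algae, "
            "such as Gelidium, for laboratory or industrial use")
print(list(extract_isa(sentence)))   # e.g. [('Gelidium', 'IS-A', 'red algae')]
```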
39. Hand-built patterns for relations
• Plus:
  • Human patterns tend to be high-precision
  • Can be tailored to specific domains
• Minus:
  • Human patterns are often low-recall
  • A lot of work to think of all possible patterns!
  • Don't want to have to do this for every relation!
  • We'd like better accuracy
41. Supervised machine learning for relations
• Choose a set of relations we'd like to extract
• Choose a set of relevant named entities
• Find and label data
  • Choose a representative corpus
  • Label the named entities in the corpus
  • Hand-label the relations between these entities
  • Break into training, development, and test sets
• Train a classifier on the training set
42. How to do classification in supervised relation extraction
1. Find all pairs of named entities (usually in the same sentence)
2. Decide if the two entities are related
3. If yes, classify the relation
• Why the extra step?
  • Faster classification training by eliminating most pairs
  • Can use distinct feature sets appropriate for each task
43. Word Features for Relation Extraction

American Airlines, a unit of AMR, immediately matched the move, spokesman Tim Wagner said
(Mention 1 = American Airlines, Mention 2 = Tim Wagner)

• Headwords of M1 and M2, and their combination
  Airlines    Wagner    Airlines-Wagner
• Bag of words and bigrams in M1 and M2
  {American, Airlines, Tim, Wagner, American Airlines, Tim Wagner}
• Words or bigrams in particular positions left and right of M1/M2
  M2: −1 spokesman
  M2: +1 said
• Bag of words or bigrams between the two entities
  {a, AMR, of, immediately, matched, move, spokesman, the, unit}

(A feature-extraction sketch follows.)
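The word features above can be computed from a tokenised sentence plus the token spans of the two mentions. The sketch below is an assumed implementation (the names and the "headword = last token of the mention" shortcut are mine), not the feature extractor used in any particular system.

```python
# Assumed sketch of the word features listed above.
def word_features(tokens, m1_span, m2_span):
    """m1_span/m2_span are (start, end) token indices, end exclusive."""
    m1 = tokens[m1_span[0]:m1_span[1]]
    m2 = tokens[m2_span[0]:m2_span[1]]
    feats = {}

    # Headwords of M1 and M2, and their combination
    feats["head_m1"] = m1[-1]
    feats["head_m2"] = m2[-1]
    feats["head_m1_m2"] = m1[-1] + "-" + m2[-1]

    # Bag of words in the two mentions
    for w in m1 + m2:
        feats["bow_mention_" + w] = 1

    # Words in particular positions left and right of M2 (same idea for M1)
    if m2_span[0] > 0:
        feats["m2_-1_" + tokens[m2_span[0] - 1]] = 1
    if m2_span[1] < len(tokens):
        feats["m2_+1_" + tokens[m2_span[1]]] = 1

    # Bag of words between the two entities
    for w in tokens[m1_span[1]:m2_span[0]]:
        feats["between_" + w] = 1
    return feats

tokens = ("American Airlines , a unit of AMR , immediately matched the move , "
          "spokesman Tim Wagner said").split()
print(word_features(tokens, (0, 2), (14, 16)))   # M1 = American Airlines, M2 = Tim Wagner
```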
44. Named Entity Type and Mention Level Features for Relation Extraction

American Airlines, a unit of AMR, immediately matched the move, spokesman Tim Wagner said
(Mention 1 = American Airlines, Mention 2 = Tim Wagner)

• Named-entity types
  • M1: ORG
  • M2: PERSON
• Concatenation of the two named-entity types
  • ORG-PERSON
• Entity Level of M1 and M2 (NAME, NOMINAL, PRONOUN)
  • M1: NAME [it or he would be PRONOUN]
  • M2: NAME [the company would be NOMINAL]
45. Parse Features for Relation Extraction

American Airlines, a unit of AMR, immediately matched the move, spokesman Tim Wagner said
(Mention 1 = American Airlines, Mention 2 = Tim Wagner)

• Base syntactic chunk sequence from one mention to the other
  NP NP PP VP NP NP
• Constituent path through the tree from one to the other
  NP ↑ NP ↑ S ↑ S ↓ NP
• Dependency path
  Airlines matched Wagner said
46. American Airlines, a unit of AMR, immediately matched the move, spokesman Tim Wagner said.
47. Classifiers for supervised methods
• Now you can use any classifier you like
  • MaxEnt
  • Naïve Bayes
  • SVM
  • ...
• Train it on the training set, tune on the dev set, test on the test set
(A toolkit sketch follows.)
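To tie slides 41, 42 and 47 together, here is a hedged scikit-learn sketch; the toolkit choice is an assumption (the slides name no software). Each candidate entity pair is a feature dict such as word_features() above would produce; the toy training examples and relation labels are invented for illustration.

```python
from sklearn.feature_extraction import DictVectorizer
from sklearn.linear_model import LogisticRegression   # a MaxEnt-style classifier
from sklearn.pipeline import make_pipeline

# Toy training data: one feature dict and one relation label per candidate pair.
X_train = [
    {"head_m1": "Airlines", "head_m2": "Wagner", "between_spokesman": 1},
    {"head_m1": "Jobs", "head_m2": "Apple", "between_co-founder": 1},
    {"head_m1": "Einstein", "head_m2": "Ulm", "between_born": 1},
]
y_train = ["EMPLOYMENT", "FOUNDER", "BORN-IN"]

clf = make_pipeline(DictVectorizer(), LogisticRegression(max_iter=1000))
clf.fit(X_train, y_train)   # train on the training set; tune on dev, report on test

# Classify a new candidate pair.
print(clf.predict([{"head_m1": "Hubble", "head_m2": "Marshfield", "between_born": 1}]))
# e.g. ['BORN-IN'], driven by the shared between_born feature
```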
48. Evaluation of Supervised Relation Extraction
• Compute P/R/F1 for each relation

  P = (# of correctly extracted relations) / (Total # of extracted relations)
  R = (# of correctly extracted relations) / (Total # of gold relations)
  F1 = 2PR / (P + R)
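The formulas above, applied to sets of predicted and gold relation triples for one relation type. The gold triples reuse slide 24; the wrong "Armonk" triple is invented so the scores are not trivially 1.

```python
# P/R/F1 over sets of (subject, relation, object) triples.
def precision_recall_f1(predicted, gold):
    correct = len(predicted & gold)
    p = correct / len(predicted) if predicted else 0.0
    r = correct / len(gold) if gold else 0.0
    f1 = 2 * p * r / (p + r) if (p + r) else 0.0
    return p, r, f1

gold = {("IBM", "Founding-year", "1911"), ("IBM", "Founding-location", "New York")}
pred = {("IBM", "Founding-year", "1911"), ("IBM", "Founding-location", "Armonk")}
print(precision_recall_f1(pred, gold))   # (0.5, 0.5, 0.5)
```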
49. Summary: Supervised Relation Extraction
+ Can get high accuracies with enough hand-labeled training data, if the test set is similar enough to the training set
− Labeling a large training set is expensive
− Supervised models are brittle, don't generalize well to different genres
51. Seed-based or bootstrapping approaches to relation extraction
• No training set? Maybe you have:
  • A few seed tuples or
  • A few high-precision patterns
• Can you use those seeds to do something useful?
• Bootstrapping: use the seeds to directly learn to populate a relation

Roughly said: use seeds to initialize a process of annotation, then refine through iterations.
52. Relation Bootstrapping (Hearst 1992)
• Gather a set of seed pairs that have relation R
• Iterate:
  1. Find sentences with these pairs
  2. Look at the context between or around the pair and generalize the context to create patterns
  3. Use the patterns to grep for more pairs
53. Bootstrapping
• <Mark Twain, Elmira>   seed tuple
• Grep (Google) for the environments of the seed tuple:
  "Mark Twain is buried in Elmira, NY."         →  X is buried in Y
  "The grave of Mark Twain is in Elmira"        →  The grave of X is in Y
  "Elmira is Mark Twain's final resting place"  →  Y is X's final resting place
• Use those patterns to grep for new tuples
• Iterate
(A schematic sketch of this loop follows.)
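Below is a schematic sketch of the bootstrapping loop from slides 52-53. The tiny in-memory corpus, the string-replacement pattern generalisation and the regex-based pattern application are all simplifying assumptions; real systems query the web and score patterns for reliability before trusting them.

```python
# Seeds -> sentences -> patterns -> new pairs -> iterate (toy version).
import re

def make_pattern(sentence, x, y):
    """Generalise the context around a seed pair into a surface pattern."""
    if x in sentence and y in sentence:
        return sentence.replace(x, "X").replace(y, "Y")
    return None

def apply_pattern(pattern, sentence):
    """Turn e.g. 'X is buried in Y.' into a regex and extract a new (X, Y) pair."""
    regex = re.escape(pattern)
    regex = regex.replace("X", r"(?P<X>[A-Z][\w .']+?)")
    regex = regex.replace("Y", r"(?P<Y>[A-Z][\w .']+?)")
    m = re.search(regex, sentence)
    return (m.group("X"), m.group("Y")) if m else None

def bootstrap(seeds, corpus, iterations=2):
    pairs, patterns = set(seeds), set()
    for _ in range(iterations):
        for x, y in list(pairs):                 # 1. find sentences with known pairs
            for sent in corpus:
                p = make_pattern(sent, x, y)     # 2. generalise contexts into patterns
                if p:
                    patterns.add(p)
        for p in patterns:                       # 3. use the patterns to grep new pairs
            for sent in corpus:
                hit = apply_pattern(p, sent)
                if hit:
                    pairs.add(hit)
    return pairs, patterns

corpus = ["Mark Twain is buried in Elmira.",
          "Robert Frost is buried in Bennington."]
print(bootstrap({("Mark Twain", "Elmira")}, corpus))
# e.g. ({('Mark Twain', 'Elmira'), ('Robert Frost', 'Bennington')}, {'X is buried in Y.'})
```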
54. DIPRE: extract <author, book> pairs
• Start with 5 seeds:

  Author                  Book
  Isaac Asimov            The Robots of Dawn
  David Brin              Startide Rising
  James Gleick            Chaos: Making a New Science
  Charles Dickens         Great Expectations
  William Shakespeare     The Comedy of Errors

• Find instances:
  The Comedy of Errors, by William Shakespeare, was
  The Comedy of Errors, by William Shakespeare, is
  The Comedy of Errors, one of William Shakespeare's earliest attempts
  The Comedy of Errors, one of William Shakespeare's most
• Extract patterns (group by middle, take longest common prefix/suffix):
  ?x , by ?y ,        ?x , one of ?y 's
• Now iterate, finding new seeds that match the patterns (a sketch of the prefix/suffix step follows).

Brin, Sergey. 1998. Extracting Patterns and Relations from the World Wide Web.
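A hedged sketch of the DIPRE pattern step described above: group occurrences of a seed pair by the middle string between ?x (book) and ?y (author), then take the longest common prefix/suffix of the surrounding contexts. The occurrence tuples are hand-written stand-ins for what a crawler would return.

```python
import os

def common_prefix(strings):
    return os.path.commonprefix(list(strings))

def common_suffix(strings):
    return common_prefix([s[::-1] for s in strings])[::-1]

# (prefix, x, middle, y, suffix) contexts found around the seed pair.
occurrences = [
    ("", "The Comedy of Errors", ", by ", "William Shakespeare", ", was"),
    ("", "The Comedy of Errors", ", by ", "William Shakespeare", ", is"),
]

middles = {m for _, _, m, _, _ in occurrences}
assert len(middles) == 1                       # grouped by identical middle string

pattern = (common_suffix([occ[0] for occ in occurrences])     # shared end of prefixes
           + "?x" + middles.pop() + "?y"
           + common_prefix([occ[4] for occ in occurrences]))  # shared start of suffixes
print(repr(pattern))   # e.g. '?x, by ?y, '
```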
55. Distant Supervision
• Combine bootstrapping with supervised learning
• Instead of 5 seeds:
  • Use a large database to get a huge number of seed examples
  • Create lots of features from all these examples
  • Combine in a supervised classifier

Snow, Jurafsky, Ng. 2005. Learning syntactic patterns for automatic hypernym discovery. NIPS 17.
Fei Wu and Daniel S. Weld. 2007. Autonomously Semantifying Wikipedia. CIKM 2007.
Mintz, Bills, Snow, Jurafsky. 2009. Distant supervision for relation extraction without labeled data. ACL 2009.
56. Distant supervision paradigm
• Like supervised classification:
  • Uses a classifier with lots of features
  • Supervised by detailed hand-created knowledge
  • Doesn't require iteratively expanding patterns
• Like unsupervised classification:
  • Uses very large amounts of unlabeled data
  • Not sensitive to genre issues in the training corpus
57. Distantly supervised learning of relation extraction patterns
1. For each relation, e.g. Born-In
2. For each tuple in a big database: <Edwin Hubble, Marshfield>, <Albert Einstein, Ulm>
3. Find sentences in a large corpus with both entities: "Hubble was born in Marshfield", "Einstein, born (1879), Ulm", "Hubble's birthplace in Marshfield"
4. Extract frequent features (parse, words, etc.): PER was born in LOC; PER, born (XXXX), LOC; PER's birthplace in LOC
5. Train a supervised classifier using thousands of patterns: P(born-in | f1, f2, f3, …, f70000)
(A schematic sketch of these steps follows.)
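A toy walk-through of the five steps above. The knowledge base, corpus, feature function and scikit-learn classifier are assumptions (the slides name no implementation), and the single negative example is faked; real systems draw negatives from entity pairs not found in the knowledge base.

```python
from sklearn.feature_extraction import DictVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

kb = {"born-in": [("Hubble", "Marshfield"), ("Einstein", "Ulm")]}   # step 2: database tuples
corpus = ["Hubble was born in Marshfield",
          "Einstein, born (1879), Ulm",
          "Hubble's birthplace in Marshfield"]

def features(sentence, per, loc):
    """Step 4: a single crude lexical feature for the text between the entities."""
    between = sentence.split(per, 1)[1].split(loc, 1)[0].strip(" ,")
    return {"between_" + between: 1}

X, y = [], []
for relation, tuples in kb.items():              # step 1: for each relation
    for per, loc in tuples:
        for sent in corpus:                      # step 3: sentences containing both entities
            if per in sent and loc in sent:
                X.append(features(sent, per, loc))
                y.append(relation)

# Negative examples normally come from entity pairs not in the KB; fake one here.
X.append({"between_met": 1}); y.append("no-relation")

clf = make_pipeline(DictVectorizer(), LogisticRegression(max_iter=1000))
clf.fit(X, y)                                    # step 5: supervised classifier over the patterns
print(clf.predict([{"between_was born in": 1}])) # e.g. ['born-in']
```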
58. Unsupervised relation extraction
• Open Information Extraction:
  • extract relations from the web with no training data, no list of relations
1. Use parsed data to train a "trustworthy tuple" classifier
2. Single-pass: extract all relations between NPs, keep if trustworthy
3. Assessor ranks relations based on text redundancy
  (FCI, specializes in, software development)
  (Tesla, invented, coil transformer)

M. Banko, M. Cafarella, S. Soderland, M. Broadhead, and O. Etzioni. 2007. Open information extraction from the web. IJCAI.
59. Evaluation of Semi-supervised and Unsupervised Relation Extraction
• Since it extracts totally new relations from the web:
  • there is no gold set of correct instances of relations!
  • Can't compute precision (don't know which ones are correct)
  • Can't compute recall (don't know which ones were missed)
• Instead, we can approximate precision (only):
  • Draw a random sample of relations from the output and check precision manually

    P̂ = (# of correctly extracted relations in the sample) / (Total # of extracted relations in the sample)

  • Can also compute precision at different levels of recall:
    • precision for the top 1,000 new relations, top 10,000 new relations, top 100,000
    • in each case taking a random sample of that set
• But there is no way to evaluate recall
(A small sketch of the sampled-precision estimate follows.)
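Finally, a small sketch of the sampled-precision estimate P̂: draw a random sample from the system's output and count how many sampled relations a human judge accepts. The judge function is a stand-in for manual checking, and the toy triples are invented for this example.

```python
import random

def estimated_precision(extracted_relations, judge, sample_size=100, seed=0):
    """P-hat = (# judged correct in the sample) / (# relations in the sample)."""
    random.seed(seed)
    sample = random.sample(extracted_relations,
                           min(sample_size, len(extracted_relations)))
    correct = sum(1 for rel in sample if judge(rel))
    return correct / len(sample)

output = [("Tesla", "invented", "coil transformer"),
          ("FCI", "specializes in", "software development"),
          ("Paris", "invented", "France")]          # an obviously wrong triple

judge = lambda rel: rel != ("Paris", "invented", "France")   # stand-in for manual checking
print(round(estimated_precision(output, judge, sample_size=3), 2))   # 0.67
```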