An introduction to Cyc for the neural/statistical learning audience, followed by a description of Semantic Construction Grammar, a knowledge extraction technique that produces rich, inferentially productive representations of text. Included are six challenges to the NIPS audience from the point of view of logic-based AI.
2. WHAT’S THE POINT OF CYC
3-Stage Research Program for AI
1. Slowly hand-code a very large and very broad KB.
2. When enough knowledge is present, the system should actively help with the KA process. It should be faster to acquire more from texts, databases, [websites], interactive dialogues, etc.
3. To go beyond the frontier of human knowledge, the system will have to rely on learning by discovery, to expand its KB domain by domain.
(Doug Lenat)
3. CYC KNOWLEDGE BASE
[Figure: graph of Cyc knowledge base concepts (e.g. Thing, Event, Human, Animal, Cat, Tree, Planet, Vehicles, Emotions, Money, Time, Language) connected by isa, subclass, and located in links.]
5. EVENT TEMPORAL-THING PARTIALLY-TANGIBLE-THING
Upper Ontology
Core Theories
Domain-Specific Theories
Very specific information (some indirect, via SKSI)
∀(a, b): a ∈ EVENT ∧ b ∈ EVENT ⇒ (causes(a, b) ⇒ precedes(a, b))
∀(m, a): m ∈ MAMMAL ∧ a ∈ ANTHRAX ⇒ causes(exposed-to(m, a), infected-by(m, a))
(ist FtLaudHolyCrossERCase#403921
(caused CutaneousAnthrax
(SkinLesions Ahmed_al-Haznawit)))
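As a rough illustration of how the two general rules on this slide chain together (exposure of a mammal to anthrax causes infection; whatever causes an event precedes it), here is a minimal forward-chaining sketch in Python. The individuals `Rex` and `Sample1`, the tuple encoding of events, and the function name are assumptions made for the example; this is not a description of Cyc's inference engine.

```python
def forward_chain(mammals, anthrax_samples):
    """Apply the anthrax rule, then the causes-implies-precedes rule."""
    causes = set()
    for m in mammals:
        for a in anthrax_samples:
            # Domain rule: exposure of mammal m to anthrax a causes infection of m by a.
            causes.add((("exposed-to", m, a), ("infected-by", m, a)))
    # Upper-ontology rule: for events a and b, causes(a, b) implies precedes(a, b).
    precedes = set(causes)
    return causes, precedes


# Hypothetical individuals, not taken from the Cyc KB.
causes, precedes = forward_chain({"Rex"}, {"Sample1"})
```

The general rule never mentions anthrax, yet it yields a temporal ordering for the anthrax events once the domain rule has fired.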
First Order Predicate Calculus: unambiguous; enables mechanical reasoning
Every NZr has a Queen: ∃y.∀x. NZr(x) ⇒ ruler(x,y) & Queen(y)
Every NZr has a mother: ∀x.∃y. NZr(x) ⇒ mother(x,y)
Higher Order Logic: contexts, predicates as variables, nested modals, reflection, …
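The two English sentences look alike, but their logical readings differ in the order of the quantifiers: one Queen is shared by every NZr, while each NZr has their own mother. A small finite model makes the difference concrete. The sketch below is illustrative only; the individuals, the relation contents, and the helper names are all invented for the example.

```python
# Toy finite model (every name here is hypothetical, not Cyc content).
people = {"alice", "bob", "carol", "dave", "elizabeth"}
nzers = {"alice", "bob"}                                   # NZr(x)
queens = {"elizabeth"}                                     # Queen(y)
ruler = {("alice", "elizabeth"), ("bob", "elizabeth")}     # ruler(x, y)
mother = {("alice", "carol"), ("bob", "dave")}             # mother(x, y)


def exists_forall(domain, members, relation, restrict=None):
    """Exists y, for all x: members(x) implies relation(x, y) [and restrict(y)]."""
    return any(
        all((x, y) in relation and (restrict is None or y in restrict)
            for x in members)
        for y in domain
    )


def forall_exists(domain, members, relation):
    """For all x, exists y: members(x) implies relation(x, y)."""
    return all(any((x, y) in relation for y in domain) for x in members)


shared_queen = exists_forall(people, nzers, ruler, queens)   # one y for all x
own_mother = forall_exists(people, nzers, mother)            # a y per x
shared_mother = exists_forall(people, nzers, mother)         # wrong scope for "mother"
```

In this model the shared-Queen and own-mother readings both hold, while the shared-mother reading fails, which is the ambiguity the logical form removes.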
6. First Order
(isa AVPR2-Human-GIS GeneTypeBySpeciesAndProductFamily)
(gene-GISTypeCodesForType AVPR2-Human-GIS VasopressinV2Receptor)
With Context
In Mt: VertebratePhysiologyMt
(relationExistsAll outputsCreated IntramembranousBoneGrowthAndDevelopment FlatBone)
Each vertebrate flat bone was created by intramembranous bone development.
Rule
In Mt: MolecularBiologyMt
(implies
(isa ?MOLECULE-TYPE TranscriptionFactor)
(behaviorCapable ?MOLECULE-TYPE
(ChemicalBindingEventTypeWithTypesFn TranscriptionFactor DNAMolecule)
objectOfAttachment))
Transcription factors can bind with DNA.
Exceptions
(implies
  (and (isa ?MUT GeneticMutationEvent-Cellular)
       (locusOfCellularProcess-Cell ?MUT ?ANCESTOR)
       (isa ?ANCESTOR Cell)
       (subEvents ?REPRO ?MUT))
  (abnormal (TheList ?REPRO ?PROGENY ?ANCESTOR)
    (implies
      (and (isa ?REPRO AsexualReproductionEvent)
           (outputsCreated ?REPRO ?PROGENY)
           (isa ?PROGENY BiologicalLivingObject)
           (doneBy ?REPRO ?ANCESTOR))
      (geneticallyIdentical ?ANCESTOR ?PROGENY))))
NIPS TASK 1: DISTRIBUTED REP OF THIS
Normally, the progeny of asexual reproduction are genetically identical to the parent; however, if the parent is a cell in which a mutation has occurred, this rule doesn’t apply.
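The default-plus-exception pattern can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not Cyc's mechanism: the `ReproductionEvent` class, its fields, and the function names are hypothetical, and the sketch omits the check that the ancestor is a cell.

```python
from dataclasses import dataclass, field


@dataclass
class ReproductionEvent:
    ancestor: str
    progeny: str
    asexual: bool = True
    # Hypothetical field: mutation sub-events located in the ancestor cell.
    mutation_sub_events: list = field(default_factory=list)


def is_abnormal(event):
    """The exception: a cellular mutation occurred as part of the reproduction."""
    return bool(event.mutation_sub_events)


def genetically_identical(event):
    """Return True when the default rule fires, None when nothing is concluded."""
    if event.asexual and not is_abnormal(event):
        return True
    # The exception only blocks the default; it does not assert non-identity.
    return None


normal = ReproductionEvent("CellA", "CellB")
mutated = ReproductionEvent("CellA", "CellC", mutation_sub_events=["Mut1"])
```

Returning `None` rather than `False` for the mutated case reflects that the exception withdraws the conclusion instead of asserting its negation.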
9.
Does part of the inner object stick out of the container?
◦ None of it: #$in-ContCompletely
◦ Yes: #$in-ContPartially
If the container were turned around, could the contained object fall out?
◦ No: #$in-ContClosed
◦ Yes: #$in-ContOpen
NIPS TASK 2: LEARN CONCEPTS THIS FINELY DISTINGUISHED
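Reading the slide as two independent yes/no questions (an assumption, since the slide layout is ambiguous), the choice among the four containment predicates can be sketched as a small function. The function and parameter names are hypothetical; only the four `#$in-Cont...` constants come from the slide.

```python
def containment_predicates(part_sticks_out, could_fall_out):
    """Map the two yes/no answers to a pair of containment predicates."""
    # Question 1: does part of the inner object stick out of the container?
    extent = "#$in-ContPartially" if part_sticks_out else "#$in-ContCompletely"
    # Question 2: if the container were turned around, could the object fall out?
    closure = "#$in-ContOpen" if could_fall_out else "#$in-ContClosed"
    return extent, closure


# A spoon standing in an open mug: part sticks out, and it could fall out.
spoon_in_mug = containment_predicates(True, True)
# A letter in a sealed envelope: nothing sticks out, and it cannot fall out.
letter_in_envelope = containment_predicates(False, False)
```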
10. NIPS TASK 3: LEARN TO PRODUCE COHERENT NL FROM REPRESENTATIONS
incyc.cyc.com
12. a sad realisation and an opportunity
logical representations good for inference are
sometimes very far from natural language
… in unpredictable ways
… and they need to be right in ways NL does not
So, syntactic mapping is (pretty much) hopeless
BUT, storage is pretty much free,
… and inference is getting there
… and ILP works at least some of the time
(EBMT ∩ FrameNet ∩ Cyc) ► SCG
13. Renaissance Artists
Bronze Age Farmers
(SubcollectionOfWithRelationToFn
Artist activeDuringPeriod
TheRenaissance)
(SubcollectionOfWithRelationToFn
Farmer activeDuringPeriod
TheBronzeAge)
Kind of TimeInterval
Noun Form: not plural
Kind of Agent-Generic
Noun form
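The two phrases share one pattern: a modifier denoting a kind of TimeInterval followed by a noun denoting a kind of Agent-Generic. A minimal sketch of matching that pattern and filling the logical template might look as follows. The lexicon, its structure, and the `interpret` function are assumptions made for illustration and do not describe how SCG is implemented.

```python
# Hypothetical lexicon: phrase -> (Cyc term, semantic type).
LEXICON = {
    "renaissance": ("TheRenaissance", "TimeInterval"),
    "bronze age": ("TheBronzeAge", "TimeInterval"),
    "artists": ("Artist", "Agent-Generic"),
    "farmers": ("Farmer", "Agent-Generic"),
}


def interpret(phrase):
    """Try every modifier/head split; fill the template when the types fit."""
    words = phrase.lower().split()
    for i in range(1, len(words)):
        modifier = LEXICON.get(" ".join(words[:i]))
        head = LEXICON.get(" ".join(words[i:]))
        if (modifier and head
                and modifier[1] == "TimeInterval"
                and head[1] == "Agent-Generic"):
            return ("(SubcollectionOfWithRelationToFn "
                    f"{head[0]} activeDuringPeriod {modifier[0]})")
    return None  # the construction does not apply to this phrase


renaissance_artists = interpret("Renaissance Artists")
bronze_age_farmers = interpret("Bronze Age Farmers")
```

The semantic type checks do the work here: a phrase such as "Farmers Artists" has the same surface shape but yields no interpretation.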
30. 9 Shades of Fail
Term interpretation fails lenient CycL truth test (“WFF”)
Arg required to be a collection but is not
Arg not an instance of all argument type constraints (strict)
Arg provably disjoint with a constraint (collections)
Arg is provably not-isa or not-genl a constraint (everything)
Argument is probably bad since it conflicts with implicit argument constraints via #$relationNotExistsExists, #$relationAllExists, or #$someTypePlaysRoleInSituationType KB knowledge
Volume mismatch between types using #$typicallyMoreVoluminousThan KB knowledge
Esoteric
Unlikely
31. Argument is probably bad since it conflicts with implicit argument constraints via #$relationNotExistsExists, #$relationAllExists, or #$someTypePlaysRoleInSituationType KB knowledge
(#$SubcollectionOfWithRelationToTypeFn #$Fist
#$properPhysicalParts #$EthnicGroupOfRussians)
(relationAllExists #$Fist #$properPhysicalParts
#$AnimalBodyPart)
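One plausible way to operationalise "conflicts with implicit argument constraints" is to reject a candidate filler that is provably disjoint from the type named in a `relationAllExists` assertion. The sketch below takes that reading as an assumption. The superclass links, the disjointness fact, and the `Finger` and `Group` terms are invented for the example; only the `relationAllExists` assertion itself comes from the slide.

```python
# Hypothetical mini-KB; only the relationAllExists entry is from the slide.
GENLS = {  # direct superclass links (assumed)
    "EthnicGroupOfRussians": {"EthnicGroup"},
    "EthnicGroup": {"Group"},
    "Finger": {"AnimalBodyPart"},
}
DISJOINT = {frozenset({"Group", "AnimalBodyPart"})}  # assumed
RELATION_ALL_EXISTS = {("Fist", "properPhysicalParts"): "AnimalBodyPart"}


def all_genls(collection):
    """Return the collection together with all of its transitive superclasses."""
    seen, stack = set(), [collection]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(GENLS.get(current, ()))
    return seen


def provably_disjoint(a, b):
    """True if some superclass of a is asserted disjoint with some superclass of b."""
    return any(frozenset({x, y}) in DISJOINT
               for x in all_genls(a) for y in all_genls(b))


def conflicts_with_implicit_constraint(collection, predicate, filler):
    expected = RELATION_ALL_EXISTS.get((collection, predicate))
    return expected is not None and provably_disjoint(filler, expected)


russian_fist_bad = conflicts_with_implicit_constraint(
    "Fist", "properPhysicalParts", "EthnicGroupOfRussians")
```

Under these assumed facts the "fists that have Russians as parts" reading is flagged, while a filler such as `Finger` passes.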
32. Volume mismatch between types
Can horses jump?: by analogy with Tahoe bars.
(#$SubcollectionOfWithRelationToTypeFn
#$SawHorse #$objectFoundInLocation #$Can)
This interpretation is blocked if we can prove:
(#$typicallyMoreVoluminousThan #$SawHorse #$Can)
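The blocking test can be sketched as a reachability check over volume facts. All facts below are assumed for illustration (the slide only names the SawHorse/Can comparison), and treating the relation as transitive is likewise an assumption of this sketch.

```python
# Assumed volume facts: (bigger type, smaller type).
MORE_VOLUMINOUS = {
    ("SawHorse", "Breadbox"),
    ("Breadbox", "Can"),
    ("Can", "Bean"),
}


def typically_more_voluminous(big, small, facts=MORE_VOLUMINOUS):
    """True if big > small follows from the facts, assuming transitivity."""
    seen, frontier = set(), [big]
    while frontier:
        current = frontier.pop()
        if current in seen:
            continue
        seen.add(current)
        for a, b in facts:
            if a == current:
                if b == small:
                    return True
                frontier.append(b)
    return False


def found_in_location_allowed(object_type, location_type):
    """Block the 'object found in location' reading when the object is too big."""
    return not typically_more_voluminous(object_type, location_type)


sawhorse_in_can = found_in_location_allowed("SawHorse", "Can")
bean_in_can = found_in_location_allowed("Bean", "Can")
```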
33. Plausibility: Flat sheets of paper
Sheets that are part of an apartment unit that are part of some paper
(#$sentencePlausibilityScore
(#$Quote
(#$equalSymbols ?X
(#$SubcollectionOfWithRelationFromTypeFn
(#$SubcollectionOfWithRelationFromTypeFn #$BedSheet
#$physicalParts #$ApartmentUnit) #$physicalParts #$Paper)))
?SCORE) in #$PlausibilityQueryMt
→
?SCORE: (#$NumericLikelihoodFn 0.08)
36.
• Primacy of Semantics
• Importance of mapping not to “logical form”, but to logic
• The frightening complexity of human level knowledge
• Importance of doing inference during understanding
• The time is now/nigh for uniting the threads of AI
end
The representation language of Cyc, and the inference engine that draws conclusions from the KB content and data, are sufficiently powerful to express and reason about biological processes. Some examples of the representation of biological objects, relations, and processes are given here.
These pages were synthesised using natural language generation from an underlying logical representation.