Empower your Enterprise with language intelligence (Francisco Webber)
1. © cortical.io inc. 2015
Empower your enterprise with
language intelligence
free access at
api.cortical.io
contact: f.webber@cortical.io
2. © cortical.io inc. 2015
who we are
• cortical.io inc. is a science startup in Vienna, Austria
• result of the CEPT project (Cortical Engine for Processing Text)
• advances in brain theory guided us to a fundamentally
new approach for natural language processing
• we are investor-backed and in our second funding round
• we made semantic fingerprinting accessible, robust,
scalable, intuitive and easy to use
3. © cortical.io inc. 2015
big (text) data
• businesses, organizations and
governments are threatened by the
big data explosion.
• a substantial part of this data
consists of text.
• computers ‘understand’ numbers
but ignore the meaning of
language
4. © cortical.io inc. 2015
the downsides
existing semantic systems are…
…hard to build (sometimes impossible)
…inaccurate & fragile (in real-world use)
…expensive to buy (licenses & services)
…tricky to integrate (setup, tuning, training)
…laborious to run (metadata management)
…hard to maintain (dictionaries, ontologies)
5. © cortical.io inc. 2015
Semantic Fingerprinting
• semantic fingerprinting bridges
the gap between natural
language processing and
knowledge management
• language is represented using
the same data format as found in
the neocortex (mammalian brain)
• the cortical.io Retina behaves like
a sensorial organ for language
• meaning is embodied in
thousands of self-learned
semantic features
6. © cortical.io inc. 2015
Semantic Fingerprinting
[Figure: semantic fingerprint of "organ", shown with the related words piano, church, and liver]
• the cortical.io Retina converts
every word into its semantic
fingerprint
• the fingerprints allow direct
semantic comparison of the
meanings between words
• similar fingerprints have similar
meanings
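The comparison described above can be sketched in a few lines. This is a minimal illustration, not the Retina itself: the fingerprints are hypothetical toy sets of active positions, and the Jaccard overlap used here is an assumption, since the slides do not say which measure produces the similarity percentages.

```python
def overlap(fp_a: frozenset, fp_b: frozenset) -> float:
    """Share of active positions two fingerprints have in common (Jaccard overlap)."""
    union = fp_a | fp_b
    if not union:
        return 0.0
    return len(fp_a & fp_b) / len(union)


# Hypothetical toy fingerprints: each is a set of active positions.
organ = frozenset({1, 4, 7, 9, 12, 15})       # shares positions with both senses
piano = frozenset({1, 4, 7, 20, 21, 22})      # musical-instrument sense
liver = frozenset({9, 12, 15, 30, 31, 32})    # body-part sense
harbor = frozenset({40, 41, 42, 43, 44, 45})  # unrelated word

for name, fp in [("piano", piano), ("liver", liver), ("harbor", harbor)]:
    print(f"organ vs {name}: {overlap(organ, fp):.0%}")
```

In this toy data "organ" overlaps with both "piano" and "liver" but not with "harbor", which mirrors the idea that one fingerprint can carry several senses at once.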
7. © cortical.io inc. 2015
Semantic Similarity
[Figure: fingerprints of "cat" and "dog" and their overlap "cat+dog" (38%). Labeled regions: home & family aspects, cat-specific aspects, dog-specific aspects, biology aspects.]
8. © cortical.io inc. 2015
word sense disambiguation
[Figure: the senses of "apple" (sense 1, sense 2, … sense n; sense 2 is subdivided into sense 2a … sense 2m), each shown with its context terms:
computer: macintosh, microsoft, linux, software, hardware
fruit: seeds, flowers, pollinators, pests, insects, trees
food: vegetables, berries, ingredients, sugar, diet
rock: songwriter, vocals, spector, airplay, album]
10. © cortical.io inc. 2015
Text Fingerprinting
• word fingerprints can be
stacked together to form
fingerprints of any piece of text.
• all semantic fingerprint
properties remain: similar
fingerprints mean similar texts.
• representation is made through
more than 16K features.
[Figure: the word fingerprints of "teens like to hear music on their mobile phones" are combined by aggregation + sparsification into a single text fingerprint]
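The two steps named on this slide, aggregation and sparsification, could look roughly like the sketch below. The slide does not describe the actual procedure, so the rule used here (count how often each position is active, then keep the `max_active` most frequent positions) is an assumption, and the word fingerprints are hypothetical toy data.

```python
from collections import Counter


def text_fingerprint(word_fps, max_active):
    """Stack word fingerprints, then keep only the most frequently active positions."""
    counts = Counter()
    for fp in word_fps:
        counts.update(fp)  # aggregation: count how often each position is active
    # Sparsification: keep the max_active positions with the highest counts.
    # Ties are broken by position so the result is deterministic.
    ranked = sorted(counts, key=lambda pos: (-counts[pos], pos))
    return frozenset(ranked[:max_active])


# Hypothetical toy word fingerprints (sets of active positions).
WORDS = {
    "teens": {1, 2, 3, 10},
    "music": {2, 3, 4, 11},
    "phones": {3, 4, 5, 12},
}

fp = text_fingerprint(WORDS.values(), max_active=4)
print(sorted(fp))
```

Because the result is again a set of active positions of bounded size, it can be compared with word fingerprints or other text fingerprints in exactly the same way.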
11. © cortical.io inc. 2015
Text Similarity 1
"teens like playing good music with their mobile phones"
"you can also consume chart hits with your notebook"
similarity: 27%
12. © cortical.io inc. 2015
Text Similarity 2
"teens like playing good music with their mobile phones"
"the fishermen are sailing out of the harbor"
similarity: 9%
13. © cortical.io inc. 2015
NLP Functionality: Search
[Figure: an example document serves as the query; the similarity engine compares it with the document index and returns a result set, a ranking of the most similar documents ordered along the user's information need]
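A search of this kind can be sketched as ranking every indexed fingerprint by its similarity to the query fingerprint. Everything below is illustrative: the document ids, the toy fingerprints, the helper names `similarity` and `rank`, and the Jaccard measure are assumptions, not the similarity engine's real interface.

```python
def similarity(a: frozenset, b: frozenset) -> float:
    """Jaccard overlap of two fingerprints (assumed measure)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def rank(query_fp, index, top_n=3):
    """Return the top_n (doc_id, score) pairs, most similar first."""
    scored = [(similarity(query_fp, fp), doc_id) for doc_id, fp in index.items()]
    scored.sort(key=lambda item: (-item[0], item[1]))  # score descending, then id
    return [(doc_id, score) for score, doc_id in scored[:top_n]]


# Hypothetical document index: doc id -> text fingerprint.
INDEX = {
    "doc_music": frozenset({1, 2, 3, 4, 5}),
    "doc_phones": frozenset({3, 4, 5, 6, 7}),
    "doc_fishing": frozenset({20, 21, 22, 23, 24}),
}

query = frozenset({1, 2, 3, 4, 9})  # fingerprint of the example document
results = rank(query, INDEX)
for doc_id, score in results:
    print(f"{doc_id}: {score:.2f}")
```

The query is itself a document fingerprint, so no keywords need to be extracted before searching.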
14. © cortical.io inc. 2015
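NLP Functionality: classification
[Figure: the fingerprints of cow, elephant, dog, spider, and frog are compared with a "mammal" classifier; the most relevant matching area is highlighted. Literally: “mammal or mammals or mammalian”]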
16. © cortical.io inc. 2015
Evaluation
There are very few comparable algorithms: a couple
of academic ones that cannot be readily used for
production purposes and Google’s Word2Vec.
The MEN Test Collection: http://clic.cimec.unitn.it/~elia.bruni/MEN.html
The RG-65 Test Collection: http://www.aclweb.org/aclwiki/index.php?title=RG-65_Test_Collection_(State_of_the_art)
The WordSimilarity-353 Test Collection: http://www.cs.technion.ac.il/~gabr/resources/data/wordsim353/
Yu & Dredze 2014: http://arxiv.org/pdf/1411.4166.pdf
Distributed representations of words and phrases: http://papers.nips.cc/paper/5021-di
17. © cortical.io inc. 2015
disciplines of language intelligence
• locate documents
• find web content
• match people
• identify products
• monitor competitors
• file business information
• discover new knowledge
• track customer satisfaction
• avoid duplication of work
• advertise on the Internet
• mine for evidence
• improve security
18. © cortical.io inc. 2015
business applications
“Anything that can be expressed with text can be matched:
- products with LinkedIn profiles,
- tweets with Facebook timelines,
- job descriptions with CVs …”
19. © cortical.io inc. 2015
application: streaming text filter
• convert the Twitter firehose into a stream of semantic fingerprints
• filter: each fingerprint is either a MATCH or not matching
• generate a real-time content sub-stream from the matches
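The filtering step can be sketched as a generator that lets a message through only when its fingerprint overlaps enough with a filter fingerprint. The encoder `toy_fingerprint`, its word table, and the threshold value are stand-ins invented for this illustration; a real deployment would obtain fingerprints from the Retina instead.

```python
def semantic_filter(messages, fingerprint_of, filter_fp, threshold):
    """Yield only the messages whose fingerprint matches the filter fingerprint."""
    for text in messages:
        fp = fingerprint_of(text)
        if not fp:
            continue  # nothing recognisable in this message
        # Share of the message's active positions that fall inside the filter.
        if len(fp & filter_fp) / len(fp) >= threshold:
            yield text


# Hypothetical stand-in encoder: a tiny table of toy word fingerprints.
TOY_WORD_FPS = {
    "music": {1, 2, 3},
    "album": {2, 3, 4},
    "harbor": {20, 21},
    "fishermen": {21, 22},
}


def toy_fingerprint(text):
    """Union of the toy fingerprints of all known words in the text."""
    fp = set()
    for word in text.lower().split():
        fp |= TOY_WORD_FPS.get(word, set())
    return frozenset(fp)


stream = ["new album out", "fishermen leave the harbor", "music all night"]
music_filter = toy_fingerprint("music album")
matched = list(semantic_filter(stream, toy_fingerprint, music_filter, threshold=0.5))
print(matched)
```

Because the filter is a generator, it processes one message at a time and never needs to hold the whole stream in memory.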
20. © cortical.io inc. 2015
creating filter fingerprints
• words: simple words, keywords
• text: text or text documents of any size
• profile descriptions or message postings from social media
• the expression builder allows interactive design of boolean specifications like: jaguar - Porsche = tiger
• the fingerprint editor allows the “drawing” of fingerprints. The meaning of the resulting fingerprint can be monitored through the context terms
[Figure: each input yields a resulting filter fingerprint]
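The "jaguar - Porsche = tiger" expression can be illustrated with plain set operations on toy fingerprints. The positions below are invented so that "jaguar" carries both a car sense and a big-cat sense; how the expression builder evaluates such expressions internally is not stated on the slide, so treating subtraction as set difference is an assumption.

```python
def similarity(a: frozenset, b: frozenset) -> float:
    """Jaccard overlap of two fingerprints (assumed measure)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


# Hypothetical toy fingerprints: positions 1-6 stand for "car" features,
# positions 10-15 for "big cat" features.
jaguar = frozenset({1, 2, 3, 4, 10, 11, 12, 13})  # ambiguous: car and animal
porsche = frozenset({1, 2, 3, 4, 5, 6})
tiger = frozenset({10, 11, 12, 13, 14, 15})

remainder = jaguar - porsche  # subtraction: drop everything shared with "Porsche"

print(f"jaguar vs tiger:             {similarity(jaguar, tiger):.2f}")
print(f"(jaguar - Porsche) vs tiger: {similarity(remainder, tiger):.2f}")
print(f"(jaguar - Porsche) vs Porsche: {similarity(remainder, porsche):.2f}")
```

Removing the car-related positions leaves a fingerprint that is closer to "tiger" than the original "jaguar" fingerprint was. The same pattern extends to AND (`&`) and OR (`|`).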
21. © cortical.io inc. 2015
application: profile matching
• match people by their profiles
• no keyword or field based string matching limitations
• semantic similarity measure to compare professional profiles
• different profiles for professional, leisure, interests, sports etc…
[Figure: profile fingerprint and activity fingerprint]
22. © cortical.io inc. 2015
application: product recommendation
• create fingerprints from product descriptions
• find similar products by matching description fingerprints
• create customer fingerprints from purchased products
[Figure: a product description fingerprint is matched to find similar products and product recommendations]
23. © cortical.io inc. 2015
cortical.io advantages
simplicity
• no prior expertise in natural language processing or linguistics is needed.
• easy and intuitive definition of semantic filters or classifiers.
• all types of text (words, sentences, paragraphs, chapters, books, etc…) are processed in the same way using fingerprints.
• easy expansion to other languages by switching to any of the available language retinas.
• zero configuration and no parameter tweaking needed.
24. © cortical.io inc. 2015
cortical.io advantages
efficiency
• semantic fingerprints are small binary vectors of about 2 KB each.
• only binary operators are used - no floating point operations
needed.
• linear scalability as the engine takes advantage of a parallel
computing infrastructure (multicore, cluster, virtualization) to
match any performance needed.
• high throughput as complex NLP operations are executed in a
single step and are therefore much faster than with traditional
statistical systems.
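The size and binary-operator claims can be checked with a short sketch. It assumes a fingerprint of exactly 16,384 bits (the slides say "more than 16K features"), which packs into 2,048 bytes, and it stores each fingerprint as a Python integer used as a bit vector. The helper names are illustrative only.

```python
FEATURES = 16_384        # assumption: "more than 16K features" taken as 2**14 bits
BYTES = FEATURES // 8    # 16,384 bits / 8 bits per byte = 2,048 bytes


def to_bits(positions):
    """Pack a set of active positions into one integer used as a bit vector."""
    bits = 0
    for pos in positions:
        bits |= 1 << pos
    return bits


def popcount(bits):
    """Number of set bits."""
    return bin(bits).count("1")


def overlap_count(a, b):
    """Shared active positions: one bitwise AND plus a bit count, no floats."""
    return popcount(a & b)


a = to_bits({0, 5, 9, 16_383})
b = to_bits({5, 9, 100})

packed = a.to_bytes(BYTES, "little")  # fixed-size binary form of the fingerprint
print(len(packed), overlap_count(a, b))
```

Comparing two fingerprints therefore costs one AND and one bit count over 2 KB of data, which is the kind of operation that parallelises easily across cores or machines.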
25. © cortical.io inc. 2015
cortical.io advantages
quality
• higher precision on NLP operations due to the large number of semantic features used (>16K).
• automatic disambiguation of human language due to the novel approach.
• full language independence, equally high quality results in all languages due to complete avoidance of any statistical language models.
• no unintended bias as no human input is needed as gold standard.
• automatic update as new words and concepts can be added continuously.
26. © cortical.io inc. 2015
Web : www.cortical.io
Service : api.cortical.io
Videos : www.cortical.io/company_media.html
Contact : f.webber@cortical.io