Dan McCreary gave a talk on AI, knowledge representation, and graph databases. He discussed how knowledge representation is a key focus for AI and many feel over half their work is understanding knowledge structures. In 2012, Google published a paper that consolidated many forms of knowledge representation into a single structure called a labeled property graph. This talk described how labeled property graphs became popular and how they are being used to build knowledge graphs that can be used by AI systems and researchers. The talk also discussed challenges of transferring siloed project knowledge into reusable knowledge structures.
AI, Knowledge Representation and Graph Databases: Key Trends in Data Science
1. AI, Knowledge Representation
and Graph Databases -
Key Trends in Data Science
Dan McCreary
Social Data Science Meetup
March 2nd, 2019
2. Talk Description
Knowledge Representation is a key focus for most modern AI texts. Many AI
experts feel that over half of their work is understanding how to find the
right knowledge structures to build intelligent agents that can continuously
learn and respond to changing events in their world. In 2012, a paper
published by Google started a consolidation of the many diverse forms of
knowledge representation into a single general-purpose structure called a
labeled property graph.
This talk will describe the key events behind this movement and show how a
new generation of data scientists will be needed to build and maintain
corporate knowledge graphs that contain uniform, normalized and highly
connected data sets for use by researchers and intelligent agents. We will
also discuss the challenges of transferring siloed project knowledge to
reusable structures.
3. Hello, my name is
dan.mccreary@gmail.com
• Distinguished Engineer in AI and Graph Technologies at
Optum’s Advanced Technology Collaborative
• Co-founder of "NoSQL Now!" conference (now part of
Dataversity)
• Author of "Making Sense of NoSQL" (w. Ann Kelly)
• 15+ years of working with non-tabular knowledge
representations
• Background in solution architecture, metadata management,
NLP, semantics, text analytics and knowledge
representation for AI
• Disclaimer: All opinions are my own and may not reflect the
views of my employer
4. Graph: a "NoSQL" Data Architecture
Six database architecture patterns: Relational, Analytical (OLAP),
Key-Value, Column-Family, Document, and Graph.
See Chapter 1: https://www.manning.com/books/making-sense-of-nosql
6. Relational vs. Graph
Relational (row store):
1. The atomic unit of storage is a row of a table
2. Data is appended one row at a time
3. All columns within a table must have the same structure; no variations
within a table are allowed
4. Table structures are fixed after design
5. The query language is SQL
6. Joins are log(N) searches against other tables
Graph:
1. The atomic units of storage are nodes and edges
2. Each node and edge may have independent properties that are determined
at run time (schema agnostic)
3. Joins between nodes and edges are computed at load time and are stored
as memory pointers
4. Each core hops through 2M edges/second (1,000x faster than joins)
5. Query languages vary, although there are some standards,
e.g. SPARQL/Cypher/Gremlin
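The join-versus-hop difference above can be sketched in plain Python (a toy illustration, not a model of database internals; all names are hypothetical):

```python
import bisect

# Relational style: a sorted index of (key, row) pairs, searched at query time.
index = sorted((f"person{i}", i) for i in range(1_000_000))
keys = [k for k, _ in index]

def relational_join(key):
    pos = bisect.bisect_left(keys, key)   # log(N) binary search per lookup
    return index[pos][1]

# Graph style: the "join" was computed at load time and stored as a pointer.
class Node:
    def __init__(self, name):
        self.name = name
        self.neighbor = None              # direct memory reference

dan, ann = Node("Dan"), Node("Ann")
dan.neighbor = ann                        # pre-computed at load time

print(relational_join("person42"))        # 42
print(dan.neighbor.name)                  # Ann
```

The graph lookup is a single pointer dereference regardless of how many nodes exist, while the index search cost grows with the table size.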
7. Gartner on Graph Analytics
Key Analytics Trends for 2019
1. Augmented Analytics
2. Augmented Data Management
3. Continuous Intelligence
4. Explainable AI
5. Graph
6. Data Fabric
7. NLP/Conversational Analytics
8. Commercial AI/ML
9. Blockchain
10. Persistent Memory Servers
…Graph processing to continuously accelerate data
preparation and enable more complex and adaptive
data science…to efficiently model, explore and query
data with complex interrelationships across data
silos…the need to ask complex questions across data
silos which is not practical or even possible at scale
using SQL queries.
Graphs are also related to 4, 6 and 7
8. Which of the following organizations use graph databases?
• Every major airline uses a graph database to calculate fares in real-time
• Over half of retailers use graphs for product recommendations
Answer: They all do!
9. Amazon Product Graph Job Posting
As a leader in e-commerce, Amazon is building an authoritative knowledge base for every product in the world. With hundreds of
millions of customers and billions of products, Amazon will offer a challenging but fun journey to turn this big and rapidly changing data
into high-quality knowledge to impact customer experiences across Amazon from Alexa to Search to Shopping. As a member of the
Product Graph team, based in Seattle, you will play a key role in the establishment of a new platform, with opportunities to create
enormous benefit for our customers and Amazon.
10. How do we store
knowledge?
• What exactly is “knowledge”?
• How is knowledge different from raw data and
information?
• What is knowledge engineering?
• What is knowledge architecture?
• How does this relate to data science?
12. The Data Science Lifecycle
What do we mean by "Understanding"?
13. 2018: Graphs Join with Deep Learning
How did we get here?
https://arxiv.org/pdf/1806.01261.pdf
14. Why Metaphors Matter
"Metaphors drive design decisions"
We make decisions not by having a deep understanding of how technology
works, but through being exposed to the right metaphors.
Hillary Mason, Data Scientist
GraphConnect Keynote 2018
https://neo4j.com/graphconnect-2018/ around 51 minutes in
15. Four Graph Metaphors
Neighborhood Walk
(explains index-free adjacency, performance)
Knowledge Triangle
(explains data, information and knowledge)
The Open World Assumption
(explains graph integration, agility)
The Jenga Tower
(explains resilience of graphs to change)
16. How do you get to your neighbor’s house?
• You walk out your door and over to the house
• Your houses are “Adjacent” so getting there is a direct “hop”
• You don’t need to consult anyone else about how to get to a
neighbor’s house because you have the right pointer
17. Your Logical Graph Model
Dan -LIVES_NEXT_TO-> Ann
• If two physical items are related, they have a relationship arc between
them, and we model it as shown above
• We build a "logical" data model that has this link
• In a native graph, the vertices are loaded into memory and the physical
memory address of each link is also reflected in each of the nodes
18. Relational Databases Use Indexes
● You must walk to a central index system
● The index system does a "search" given the house's address
● The index system will tell you how to get to your neighbor's house
Central Index
Search: 123 Main St.
19. The Knowledge Triangle Metaphor
https://en.wikipedia.org/wiki/DIKW_pyramid
• A diagram for representing the relationships between data, information,
knowledge, and wisdom
• Too often we focus on "Big Data" and not enough on connected and
transferable knowledge
• Graphs are connected information concepts
• Wisdom is reusable across multiple contexts
• Can we capture knowledge in a form that can be reused across multiple
domains?
Pyramid layers: Data (binary, codes, data lakes); Information (concepts);
Knowledge (patterns, relationships); Wisdom (& AI)
20. From Raw Data to Wisdom with Continuous Enrichment
• Data Lake: raw data dumped from a database, log files or documents
• Information: tagged text, definitions, validity, searchable
• Knowledge Graph: connections; consistent, de-duplicated; semantics,
concepts
• Wisdom: knowledge that can be reused across multiple contexts and new
problems (transfer learning)
• Continuous enrichment: discoveries at higher levels feed back to the
lower levels
21. Structure
Definition: The arrangement of and relations between the parts or elements
of something complex
• The real world has lots of structure
• How is structure captured in a machine learning feature?
• If we take simple “features” out of the real world do we lose structure?
• Do our brains “extract” features? (answer: no)
22. The Adversarial Turtle
https://www.theverge.com/2017/11/2/16597276/google-ai-image-attacks-adversarial-turtle-rifle-3d-printed
• Use a 3D printer to print a turtle
• Place different "textures" on the shell
• Most image recognizers fail, labeling the turtle a "rifle"
• 99% of modern image recognition is just simple (but precise) texture
matching
• CNNs: "On a very fundamental level, our work highlights how far current
CNNs are from learning the 'true' structure of the world"
23. Our Brains are Graphs
100B Neurons
10K Connections per Neuron (degree)
24. Three Eras of Computing
1) Procedural Era: procedural code (rules) + data → answers and
explanations (why)
2) Machine Learning Era: data + answers → rules (10M weights)
3) Graph Era: data + knowledge + machine learning → answers and
explanations (why)
25. Graph Timeline
• 1736: Euler solves the Seven Bridges of Königsberg
• 2001: Sir Tim's vision, RDF
• 2005: On Intelligence
• 2008: W3C SPARQL
• 2010: Labeled Property Graph, Neo4j 1.0
• May 2012: Google's Knowledge Graph, "Things Not Strings"
• Sept 2012: AlexNet
• 2018: LPG in AI; graphs rise
26. The Birth of the Semantic Web
• May 2001
• Resource Description Framework (RDF)
• Keep it simple: triples all the way down (Subject -Property-> Object)
• Universal identifiers: URIs
• Ideal for data interchange
• The problem: reification. Adding a simple attribute to a relationship
can cause 10,000 SPARQL queries to become obsolete
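The reification problem can be sketched with plain Python tuples standing in for RDF triples (an illustration, not a real RDF/SPARQL stack; the names are hypothetical):

```python
# Facts are bare triples, so an edge cannot carry properties of its own.
triples = {("Dan", "LIVES_NEXT_TO", "Ann")}

def lives_next_to(store, person):
    # Every query is written against the simple triple pattern.
    return {o for (s, p, o) in store if s == person and p == "LIVES_NEXT_TO"}

print(lives_next_to(triples, "Dan"))  # {'Ann'}

# To attach "since: 2010" to that relationship, RDF reifies the statement:
# the one triple becomes four triples about a statement node...
reified = {
    ("stmt1", "rdf:subject", "Dan"),
    ("stmt1", "rdf:predicate", "LIVES_NEXT_TO"),
    ("stmt1", "rdf:object", "Ann"),
    ("stmt1", "since", "2010"),
}

# ...and the original query now finds nothing, so it must be rewritten.
print(lives_next_to(reified, "Dan"))  # set()
```

Every query written against the plain triple pattern breaks the moment the relationship is reified, which is exactly the maintenance problem described above.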
28. 2010: Neo4j 1.0 Released
• Neo4j used a new graph data model called a Labeled Property Graph (LPG)
• Each vertex and edge has its own set of properties (key-value pairs)
• Each edge must have a single type
• Vertices can have 0-N types (labels)
• Adding a new property to a relationship does not require you to rewrite
your queries! Neo4j solved the reification problem but kept the flexibility
of graphs!
• Developers LOVE Neo4j!
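A minimal sketch of the LPG model (hypothetical helper classes in Python, not Neo4j's actual API) shows why edge properties do not break existing queries:

```python
class Vertex:
    def __init__(self, labels, **props):
        self.labels = set(labels)   # 0..N labels
        self.props = props          # independent key-value properties
        self.edges = []             # direct references: index-free adjacency

class Edge:
    def __init__(self, etype, source, target, **props):
        self.type = etype           # exactly one type per edge
        self.source, self.target = source, target
        self.props = props
        source.edges.append(self)

dan = Vertex(["Person"], name="Dan")
ann = Vertex(["Person", "Author"], name="Ann")
e = Edge("LIVES_NEXT_TO", dan, ann)

def neighbors(v, etype):
    return [edge.target for edge in v.edges if edge.type == etype]

print([n.props["name"] for n in neighbors(dan, "LIVES_NEXT_TO")])  # ['Ann']

# Adding a property to the relationship does not break the query above.
e.props["since"] = 2010
print([n.props["name"] for n in neighbors(dan, "LIVES_NEXT_TO")])  # ['Ann']
```

Contrast this with the RDF reification sketch: here the edge is a first-class object, so decorating it leaves every existing traversal untouched.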
31. Google Knowledge Graph
Results: better search
• 110 billion concepts and an API to get vertices, but no edges
• Relationships are too proprietary!
33. Graphs and the "Open World"
Past: Closed World
• Everything is prohibited until it is permitted
• You can only add data that you model
Graph: Open World
• Everything is permitted until it is prohibited
• Anyone can easily add any data at any time without disruption of services
Also known as "schemaless" or "schema agnostic"
34. The Jenga Tower Metaphor
• What happens to your existing queries when you make a change to your data model?
• LPGs allow anyone to add properties to vertices or relationships without
disrupting other queries
With a fixed schema: add a property to your model and 1,000 queries need
to be rewritten
35. TigerGraph
• First commercial distributed native graph
product to fully support the LPG data
model
• Scales to 100B vertices on commodity hardware
• Support for subgraphs (lightweight
security)
• Supports distributed ACID transactions
using multi-version concurrency control
• Large library of graph algorithms
36. Sample Graph Algorithms
• Dependencies: failure chains; order of operations
• Clustering: finding related items (friends, fraud networks)
• Similarity: similar paths and patterns
• Matching/Categorizing: look for and tag specific patterns
• Flow/Cost: optimize costs based on routing; path optimization
• Centrality/Search: which nodes are the most connected or relevant?
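One family above, clustering, can be illustrated with a connected-components search over a toy fraud network (the account names and links are hypothetical):

```python
from collections import defaultdict

# Toy data: edges linking accounts that share an attribute.
edges = [("acct1", "acct2"), ("acct2", "acct3"), ("acct9", "acct8")]
adj = defaultdict(set)
for a, b in edges:
    adj[a].add(b)
    adj[b].add(a)

def components(adj):
    # Depth-first flood fill: each component is one cluster of related items.
    seen, result = set(), []
    for start in adj:
        if start in seen:
            continue
        stack, comp = [start], set()
        while stack:
            n = stack.pop()
            if n in comp:
                continue
            comp.add(n)
            stack.extend(adj[n] - comp)
        seen |= comp
        result.append(comp)
    return result

print(sorted(sorted(c) for c in components(adj)))
# [['acct1', 'acct2', 'acct3'], ['acct8', 'acct9']]
```

The same traversal skeleton underlies many of the families listed: dependency chains, similarity neighborhoods, and centrality all start from walking edges outward from a node.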
38. Graphs Store Concept Distance
• How similar are concepts?
• How can the distance in a graph help us find the underlying intent of a
question?
• How can this help us build automated chatbots?
Concept nodes: Baby, Infant, Child, Maternity
Chat Question: We are planning to have a new [baby, infant, child], what are my benefits?
Chatbot: Here is a link to your maternity benefits.
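The concept-distance idea can be sketched with a breadth-first search over a toy concept graph (the graph and its terms are illustrative, not the talk's actual data):

```python
from collections import deque

concept_graph = {
    "baby": ["infant", "child", "maternity"],
    "infant": ["baby", "maternity"],
    "child": ["baby"],
    "maternity": ["baby", "infant"],
}

def hops(graph, start, goal):
    # Breadth-first search returns the shortest hop count, or None.
    queue, seen = deque([(start, 0)]), {start}
    while queue:
        node, d = queue.popleft()
        if node == goal:
            return d
        for nxt in graph.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, d + 1))
    return None

print(hops(concept_graph, "child", "maternity"))   # 2 (child -> baby -> maternity)
print(hops(concept_graph, "infant", "maternity"))  # 1
```

A chatbot can rank candidate intents by hop count: whichever benefit concept sits fewest hops from the user's word is the most likely intent.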
39. Normalized Google Distance (NGD)
A semantic similarity measure derived from the number of hits returned by
the Google search engine for a given set of keywords.
1. "Shakespeare" returns 130M pages
2. "Macbeth" returns 26M pages
3. "Shakespeare Macbeth" returns 20.8M pages
N is the total number of web pages searched by Google multiplied by the
average number of singleton search terms occurring on pages; f(x) and f(y)
are the number of hits for search terms x and y, respectively; and f(x, y)
is the number of web pages on which both x and y occur.
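The standard NGD formula (as published by Cilibrasi and Vitanyi) is NGD(x, y) = (max{log f(x), log f(y)} - log f(x, y)) / (log N - min{log f(x), log f(y)}). Applied to the hit counts above; note that N here is an assumed illustrative value, not a figure from the talk:

```python
from math import log

def ngd(fx, fy, fxy, n):
    # Ratio of log differences, so the log base cancels out.
    lx, ly, lxy = log(fx), log(fy), log(fxy)
    return (max(lx, ly) - lxy) / (log(n) - min(lx, ly))

shakespeare, macbeth, both = 130e6, 26e6, 20.8e6
print(round(ngd(shakespeare, macbeth, both, n=50e9), 3))  # ≈ 0.242 with this assumed N
```

A small NGD means the terms co-occur almost as often as the rarer term appears at all, i.e. they are semantically close.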
40. Data Lakes Today
• Very little reuse of code to “understand” and link data
• Few people using rules engines and ML to link data
Data Lake
(Data Swamp?)
100s of Data Scientists
100s of R and Python Libraries
80% of effort is “Data Engineering”
20% is Data Science
Data Access Code
41. From Data Scientist to Knowledge Scientist
Data → Information → Knowledge Graph → Wisdom → Insights
• Data engineering (the slow path) takes 70% of your time
• The knowledge graph provides a fast path with feedback
42. Causality: The Apocryphal Cancer Gene Story
• The Smoking Gene Theory (1950): a hypothesized smoking gene drives both
the urge to smoke and lung cancer
• The Correct Causal Relationship (1969): Smoking → Tar → Lung Cancer
20 million preventable deaths. From "The Book of Why" by Judea Pearl
43. Explainable AI Models are Graphs
https://www.darpa.mil/program/explainable-artificial-intelligence
Today: Training Data → Machine Learning Process → Learned Function →
Decision or Recommendation
• Why did you do that?
• Why not something else?
• When do you succeed?
• When do you fail?
• When can I trust you?
• How do I correct an error?
Explainable AI: Training Data → New Machine Learning Process →
Explainable Model → Explanation Interface
• I understand why
• I understand why not
• I know when you succeed
• I know when you fail
• I know when to trust you
• I know why you erred
44. Model-based Machine Learning
• Model-based machine learning requires the user to create a model of the world in the form of a graph. This
model encodes the assumptions you make, such as how a change in one variable changes another variable
(causal relationships).
• Model-based machine learning allows fewer algorithms and universal inference
Traditional: many algorithms, flat data, hidden assumptions, specific
solutions
Model-based: graph data, assumptions in the graph structure, one universal
inference algorithm, general solutions
http://www.mbmlbook.com
45. 2005: On Intelligence
The key to Artificial Intelligence has
always been the representation.
Jeff Hawkins
Sensory Data
Abstract Concepts
Hierarchical Temporal Memory (HTM)
Edge Computing Converts
Sensors into Concepts
46. General CPU Hardware vs. Graph Hardware
• Most graph traversal algorithms only use simple pointer hopping
• How efficient are CPUs and GPUs at running graph algorithms?
• No need for floating point
• No need for matrix multiplication
CPU: 1,000+ instructions available (1,503 defined x86 instructions), but
only ~100 instructions used
47. GPUs are optimized for array processing
• GPUs were designed to support video games
• GPUs are optimized for highly parallel matrix multiplication
• Inefficient for graph “hop” calculations
NVIDIA "Pascal" GP100
48. What’s Coming: Graph Hardware!
Graphcore – graph in hardware - $350M in funding
49. AlexNet
• A convolutional neural network (CNN), designed by Alex Krizhevsky
• First image recognition program to use GPUs
• Beat other teams by a huge margin (10%) in 2012
50. AlexNet in Graphcore
• Each layer in the algorithm maps to a
region of the graph
• Initial layers are convolutional layers
• The colors represent connection density
51. Different Algorithms Look Different
ResNet50:
“AI Brain Scan”
https://www.wired.co.uk/gallery/machine-learning-graphcore-pictures-inside-ai
53. Dan's Predictions
• Graph technology will continue to gain relevance in analytics
• LPGs will be the dominant graph data model
• RDF/SPARQL might only be used in niche areas
• Once a standard query language is adopted, the ability to reuse
algorithms will grow quickly
• The number of graph databases and middle-tier algorithm products
will continue to grow
54. Recommendations
• Become a knowledge scientist!
• Learn a bit about graph modeling and graph algorithms
• Think about structure when doing feature design
• Understand the causal relationships between your data (Bayesian Graphs)
• Use graph databases when you have lots of relationships or want real-time
analytics (rules engines, recommendations)
• Build knowledge APIs, not just more libraries for data lakes
• Learn a few graph algorithms
• Similarity
• Clustering
• Recommendation
55. Resources
• Dan’s Medium Blog:
https://medium.com/@dmccreary
• Machine Learning with Python
• "A Comprehensive Guide to Graph Algorithms in Neo4j" by Mark Needham
& Amy E. Hodler
• Wikipedia Articles
• Graph Databases
• Similarity (Network Science)
• Google Normalized Distance
• Explainable AI
56. Thank You!
Please send e-mail to Dan.McCreary@gmail.com if you want a copy of
the slides.
Editor's notes
My background is as a solution architect. I have spent most of the last 20 years understanding how to objectively match business problems to the appropriate technologies. My focus has been on the fast-evolving area of NoSQL databases. I have also had a strong interest in AI, semantics, natural language processing and search.
OK, now let's take a step back and take a more structured look at where these tools fit into our business processes at Optum.
There are six main database architecture patterns we use when we think of a business problem.
Relational or row-stores
Analytical or OLAP
Key-Value stores – one of the simplest but most extensible data architectures
Column-family stores
Graph stores
and Document stores
Graph is just one of these six. Graph stores are often most closely related to document stores. Both Graph and document stores have the ability for new data to be added to structures without needing to remodel the data. We call these systems schema-free or schema agnostic. They are a key driver for highly agile systems.
Your systems may often draw on two or more of these systems. Databases that support multiple data models are called multi-model databases. They prevent us from having to store the same data in multiple systems for transactions, search and analytics. Multi-model systems that integrate graph technologies are also an emerging trend.
https://db-engines.com/en/ranking_categories
Now lets do a side-by-side comparison of both the traditional Relational row-store and compare some key facts. With a row store, the atomic unit of work is adding a single row at a time to a table. The key is that all the datatypes in each column must be the same. If there are dates in the third column and decimals in the fourth column all your data must conform to this standard. The table column structures and datatypes are fixed when you design your database. Once you have a million rows loaded into each table and 10,000 reports created it becomes challenging to modify the database.
Relational databases also use the SQL language, and they use join operations to dynamically calculate relationships each time the query is run. These calculations are based on binary search algorithms that run in log(N) time, where N is the number of rows in each table. As a table grows, the searches scale as the log of the number of rows.
Graph databases, on the other hand, allow you to add any number of nodes and relationships into your graph database. Each node and relationship has its own properties, but there are usually few overarching rules about what datatypes these structures can contain. Graph databases also use fixed memory pointers to store relationships between nodes. As a result, queries over relationships are fast, and there is no slow-down as the number of vertices gets bigger.
https://neo4j.com/graphconnect-2018/ around 51 minutes in
Sometimes the best way to understand how graph databases are different is using a metaphor. We call this metaphor the “Neighbor Walk” metaphor. It has proven very helpful for people that are trying to understand the performance differences between relational joins and a graph traversal.
Let's say you want to walk out your front door and over to your neighbor's house. You open the door, point your body toward the neighbor's house and walk over there. Pretty simple. Since your houses are adjacent, this is the logical way to do it.
Here is the graph “logical model” for this. You might have a vertex for your house, a relationships called LIVES_NEXT_TO as a pre-calculated relationship to your neighbor’s house.
Here are the steps that are reflective of how a relational databases does joins
Walk out the door
Walk downtown to the DMV where they have a service that tells you how to get to your neighbor’s house
When you get to the DMV you take a number
When your number comes up you get called by a search agent
You give your neighbors name to the DMV search agent
The search agent at the DMV has a list of all the people in your town sorted by their address (called the index). They search the list (using a binary search algorithm) and they finally give you the GPS coordinates of your neighbor’s house.
You take these GPS coordinates, enter them into your GPS tracker and follow the directions to Ann's house.
Now granted, the metaphor is not perfect. The speedup really depends on how many rows each table has. However, this metaphor helps you remember that directly addressing a memory location that was pre-calculated at load time can sometimes be three orders of magnitude faster than searching for something every time you need to access it. RDBMS systems try to minimize this search time by doing clever things like caching. However, the more data you have, the longer the searches take. Direct memory access will always be faster than doing a search!
Now let's also look at some architectures beyond a single-node graph.
Now let's take a look at some of the reasons that organizations are moving toward knowledge graphs as ways to connect and reuse data. The structure we use to describe this system is called the DIKW pyramid. It is a triangle with Data at the bottom, Information one level up, Knowledge (in the form of a graph) at the third level, and Wisdom at the top. Wisdom is the layer most strongly associated with AI. When we think of going to the top of the mountain and asking the gurus for advice, we are asking them to apply their knowledge to our specific problem. We are asking them to transfer knowledge to our context.
This picture follows this same DIKW pattern. However, we want to invoke the idea of raw binary data at the bottom with binary symbols that have little meaning without context.
The second layer is where we identify the nouns in our documents for data. We look for people, places and things in the byte stream. This is where we can understand isolated data – know the types, the definitions of the types and be able to validate if the 1s and 0s make sense within a narrow context.
The third layer is where we start to tie our Nouns together in a graph. It is where we build relationship links between things. It is where we might look for duplicated data (Master Data Management), where we check for consistency and we verify the patterns of connections are consistent.
The highest level is where we restructure our graph so that it comes in sub-graphs that are reusable across multiple applications. We can provide consistent APIs that pull data in consistent ways and these interfaces are reusable across many domains.
Central to this pyramid is the concept of continuous enrichment cycles. As we discover new things at a higher level, we sometimes provide feedback to lower levels.
If you go to Google and type "chest pain", you will note that a "Knowledge Summary" box appears on the right side. Note that the keyword is mapped into a preferred term called "Angina", which is the formal medical condition name. You will also note that the term "Ischemic chest pain" is shown as an alternate label. The Knowledge Summary has tabs for ABOUT, SYMPTOMS and TREATMENTS. The summary box also indicates that this is a common condition that impacts over 3 million people per year in the US.
This is an example of using a Graph to group common concepts about a topic. The keyword gets you to the right part of the graph, but the knowledge summary is a machine generated summary that has been carefully reviewed for quality by the Mayo Clinic.
Google's Knowledge Graph contains over 100 billion "facts" about things that people search for. They build this graph by harvesting information from many web pages and using both Natural Language Processing (NLP) and machine learning based on what users actually click on to find the most relevant summary information for any topic in the graph.
There is another way to describe the difference between fixed and schema-agnostic systems. It is related to the way that logic systems make assumptions about unknown data. Many relational databases use logic that implies that missing or unknown data is always false. Graph databases assume that unknown data is simply unknown. This is known as the Open World Assumption (OWA).
https://en.wikipedia.org/wiki/Open-world_assumption
http://www.mkbergman.com/852/the-open-world-assumption-elephant-in-the-room/
Dan
Explainable AI is a key trend.
https://www.darpa.mil/program/explainable-artificial-intelligence