1. Graph Gurus Episode 19
Deep Learning Implemented By GSQL
On A Native Parallel Graph Database
2. © 2019 TigerGraph. All Rights Reserved
Welcome
• Attendees are muted but you can talk to us via Chat in Zoom
• We will have 10 min for Q&A at the end
• Send questions at any time using the Q&A tab in the Zoom menu
• The webinar will be recorded
• A link to the presentation and reproducible steps will be emailed
Developer Edition Download -
https://www.tigergraph.com/developer/
TigerGraph Cloud -
https://www.tigergraph.com/cloud/
3. Speaking Today
● BS in Mechanical Engineering, Tsinghua
University, China
● MS & PhD in Mechanical Engineering, Stanford
University focused on numerical simulations of
nanophysics
● PhD minor in Philosophy focused on applications
of mathematical logic in artificial intelligence
Changran Liu,
Solution Architect
Victor Lee,
Director of Product
Management
● BS in Electrical Engineering and Computer
Science from UC Berkeley, MS in Electrical
Engineering from Stanford University
● PhD in Computer Science from Kent State
University focused on graph data mining
● 15+ years in tech industry
4. Outline
● Graphs and Machine Learning
● Deep Learning Basics
● GSQL Neural Network Demo:
recognizing handwritten digits
● GSQL Neural Network: closer look
• Prediction (forward propagation)
• Cost function
• Training (backpropagation)
• Recap
5. Two Hot Items: A Good Match?
Graph Database: powerful way to represent information and to query it
Machine Learning: established powerhouse for predictions and "smart" systems, but still tricky to use
6. Graph Is How WE THINK
Natural and organic approach for representing all kinds of things, ideas, and how they relate to one another.
Identify key data and process massive amounts of data; use the power of relationships and deep analytics to provide insights.
(Diagram: a graph connecting Profile, Events, Locations, Browsing History, Transactions, and Purchases to applications such as Recommendations, New Products, Predictions, Supply Chain, CyberSecurity, and Blockchain / Network Analysis.)
7. Some Uses of Machine Learning
● Virtual Personal Assistants
○ Voice-to-text
○ Semantic analysis
○ Formulate a response
● Real-time route optimization
○ Waze, Google Maps
● Image Recognition
○ Facial recognition
○ X-ray / CT scan reading
● Effective spam filters
8. Some Machine Learning Techniques
● Unsupervised Learning
No "correct" answer; detect patterns or summarize
○ Clustering, Partitioning, Outlier detection
○ Neural Networks
○ Dimensionality reduction
● Supervised Learning
Humans provide "training data" with known correct answers
○ Decision Trees
○ Nearest Neighbor
○ Hidden Markov Model, Naïve Bayes
○ Linear Regression
○ Support Vector Machines (SVM)
○ Neural Networks, Deep Learning
9. Graph Analytics - Explainable Results
● The data model itself is explanatory
● Graph Analytics = Connecting the Dots
10. Three Ways to Use Graph for Machine Learning
● Use Graph Algorithms for Unsupervised Learning
○ Frequency Patterns
○ Community Detection
○ Ranking
● Extract Graph Features to use for Supervised Learning
○ Graph provides unique, richer features
○ Can work with many different ML techniques
○ Manual Feature Selection or Graph Embedding, representing the entire graph as features of each vertex
● Use the Graph as a Neural Network
11. Neural Network
● Input layer: input variables
● Output layer: the result
● Hidden layer: internal intermediate variables
● Training process will set the weights of the connecting edges.
https://www.pnas.org/content/116/4/1074 "What are the limits of deep learning?"
Neural Networks ARE graphs!
Training can be computationally intense:
● many edges
● each edge weight may be recomputed for each iteration.
⇒ Need Native Parallel Graph for performance
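The point that a neural network is just a weighted graph can be made concrete with a minimal sketch (plain Python, illustrative values only): each vertex computes a weighted sum over its incoming edges, then applies a sigmoid activation. The edge weights here play the role of the theta values that training adjusts.

```python
import math

def sigmoid(z):
    # logistic activation squashing any real number into (0, 1)
    return 1.0 / (1.0 + math.exp(-z))

# One vertex's output: weighted sum of its incoming edge values, then sigmoid.
# The weights correspond to the values stored on the graph's edges.
inputs  = [0.5, -1.0, 2.0]   # activations arriving on incoming edges
weights = [0.8,  0.3, -0.5]  # one weight per edge, set during training
bias    = 0.1                # bias term (the extra "+1" vertex in the demo)

z = bias + sum(x * w for x, w in zip(inputs, weights))
activation = sigmoid(z)
```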
12. Neural Network &amp; Deep Learning
Deep Learning is simply using a Neural Network with multiple hidden layers.
● Layers may be pre-trained to guide their behavior
https://www.pnas.org/content/116/4/1074 "What are the limits of deep learning?"
13. Deep Learning Technologies
Facial recognition Real-time translation Strategy games: Go, Chess
14. Outline
● Graphs and Machine Learning
● Deep Learning Basics
● GSQL Neural Network Demo:
recognizing handwritten digits
● GSQL Neural Network: closer look
• Prediction (forward propagation)
• Cost function
• Training (backpropagation)
• Recap
15. Recognize Handwritten Digits
20 by 20 Grayscale Image → Neural Network
(Diagram: the network maps each image to one of the ten digit classes 1, 2, ..., 9, 0.)
16. Problem Introduction
Examples of Training Dataset
• Goal: recognition of handwritten digits
• Data: each input is a 20 by 20 grayscale image showing a handwritten digit (a 400-dimensional vector)
• 5000 images are split into two parts:
  • Training: 4000 images
  • Validation: 1000 images
Training data from: https://www.coursera.org/learn/machine-learning
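The 4000/1000 split above can be sketched in a few lines of plain Python; the placeholder records stand in for the real 400-pixel images and are not the actual dataset:

```python
import random

# Hypothetical stand-in for the 5000 labeled images: each sample is a
# (pixel_vector, label) pair; real inputs are 400-dimensional (20 x 20).
dataset = [([0.0] * 400, i % 10) for i in range(5000)]

random.seed(42)          # fixed seed so the split is reproducible
random.shuffle(dataset)

training   = dataset[:4000]   # used to fit the edge weights
validation = dataset[4000:]   # held out to measure accuracy
```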
17. Demo
● Graphs and Machine Learning
● Deep Learning Basics
● GSQL Neural Network Demo:
recognizing handwritten digits
● GSQL Neural Network: closer look
• Prediction (forward propagation)
• Cost function
• Training (backpropagation)
• Recap
18. Schema
• Training/validation data (x, y) are stored on the inputLayer and outputLayer vertices respectively.
• Randomly generated weights are initially stored on the Theta1 and Theta2 edges.
inputLayer: v_in_id STRING, x_training LIST&lt;DOUBLE&gt;, x_validation LIST&lt;DOUBLE&gt;
hiddenLayer: v_hid_id STRING
outputLayer: v_out_id STRING, y_training LIST&lt;DOUBLE&gt;, y_validation LIST&lt;DOUBLE&gt;
Theta1, Theta2 (edges): theta DOUBLE
20. Graph
Input Layer: 400 vertices for 20 x 20 pixels + 1 bias
Hidden Layer: 25 vertices + 1 bias
Output Layer: 10 vertices representing 0 ~ 9
Theta1: 401 x 25 edges
Theta2: 26 x 10 edges
21. Loading Data
(Slides 21 - 23 progressively draw the same structure as slide 20: the input layer with 400 pixel vertices plus 1 bias, the hidden layer with 25 vertices plus 1 bias, and the output layer with 10 vertices for 0 ~ 9, connected by the 401 x 25 Theta1 edges and 26 x 10 Theta2 edges.)
24. Loading Data
The 4000 training images are loaded as x_training on the input-layer vertices, with their labels as y_training on the output-layer vertices.
25. Loading Data
The 1000 validation images are loaded as x_validation on the input-layer vertices, with their labels as y_validation on the output-layer vertices.
26. Neuron
A (biological) neuron
• receives signals
• processes the signals
• sends signals to the neurons connected to it.
Learning involves
• growing topic-specific dendrites to connect specific neurons at specific synapses.
An artificial neuron
• same as above
Learning involves
• adjusting the weights of the network to improve the accuracy of the result.
• This is done by minimizing the observed errors.
https://en.wikipedia.org/wiki/Artificial_neural_network#/media/File:Neuron3.png
27. Training a Neural Network
input → forward propagation → diff. btw prediction and label → converged? (no: backpropagation → update weights → repeat; yes: finish)
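The deck does not spell out the cost behind "diff. btw prediction and label"; a common choice when the outputs come from sigmoids is the logistic (cross-entropy) cost, sketched below in plain Python. Treat the formula as an assumption, not the deck's stated method:

```python
import math

def logistic_cost(predictions, labels):
    # Cross-entropy cost commonly paired with sigmoid outputs; this is an
    # assumption, since the deck only says "diff. btw prediction and label".
    m = len(predictions)
    total = 0.0
    for h, y in zip(predictions, labels):
        total += -y * math.log(h) - (1 - y) * math.log(1 - h)
    return total / m

# three predictions in (0, 1) against binary labels
cost = logistic_cost([0.9, 0.2, 0.8], [1, 0, 1])
```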
29. Forward Propagation
CREATE QUERY training() FOR GRAPH NeuralNetwork {
/* variables declaration */
…
/* forward propagation*/
InputLayer = {inputLayer.*};
InputLayer = SELECT s FROM InputLayer:s -(Theta1:e)->hiddenLayer:t
ACCUM
IF s.v_in_id == "401" THEN
t.@a_training += product_List_const(@@ones_training, e.theta)
ELSE
t.@a_training += product_List_const(s.x_training, e.theta)
END
POST-ACCUM
t.@a_training = sigmoid_ArrayAccum(t.@a_training);
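The SELECT above distributes a matrix-vector product over the Theta1 edges: each hidden vertex accumulates x[s] * theta over its incoming edges, with the extra "+1" input vertex (v_in_id == "401") supplying the bias, and POST-ACCUM applies the sigmoid. A dense plain-Python equivalent, with dimensions shrunk from 400 x 25 to 4 x 3 and made-up values, may help map the steps to familiar linear algebra:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# Dense equivalent of the Theta1 edge traversal (illustrative sizes only).
x = [0.2, 0.7, 0.1, 0.9]                     # one input sample
theta1 = [[0.1, -0.2, 0.3],                  # theta1[s][t]: input s -> hidden t
          [0.4, 0.1, -0.1],
          [-0.3, 0.2, 0.2],
          [0.2, -0.4, 0.1]]
bias1 = [0.05, -0.05, 0.1]                   # weights on the bias edges

hidden = []
for t in range(3):
    # ACCUM: each hidden vertex sums x[s] * theta over incoming edges + bias
    z = bias1[t] + sum(x[s] * theta1[s][t] for s in range(4))
    hidden.append(sigmoid(z))                # POST-ACCUM: apply sigmoid
```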
32. Forward Propagation
HiddenLayer = {hiddenLayer.*};
HiddenLayer = SELECT s FROM HiddenLayer:s -(Theta2:e)->outputLayer:t
ACCUM
IF s.v_hid_id == "26" THEN
t.@a_training += product_List_const(@@ones_training, e.theta)
ELSE
t.@a_training += product_ArrayAccum_const(s.@a_training, e.theta)
END
POST-ACCUM
t.@a_training = sigmoid_ArrayAccum(t.@a_training);
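The second hop works the same way: output vertices accumulate hidden activations times the Theta2 edge weights, with the hidden bias vertex (v_hid_id == "26") contributing the bias term, then apply sigmoid. A shrunk plain-Python sketch (3 x 2 instead of 25 x 10, made-up values):

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# Dense equivalent of the Theta2 edge traversal (illustrative sizes only).
a_hidden = [0.62, 0.48, 0.55]        # activations from the first hop
theta2 = [[0.3, -0.1],               # theta2[s][t]: hidden s -> output t
          [-0.2, 0.4],
          [0.1, 0.2]]
bias2 = [0.0, 0.05]                  # weights on the bias edges

output = []
for t in range(2):
    z = bias2[t] + sum(a_hidden[s] * theta2[s][t] for s in range(3))
    output.append(sigmoid(z))        # each entry scores one digit class
```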
36. Backpropagation
OutputLayer = SELECT s FROM OutputLayer:s
ACCUM
s.@delta += diff_ArrayAccum_List(s.@a_training, s.y_training);
OutputLayer = SELECT s FROM OutputLayer:s -(Theta2:e)->hiddenLayer:t
ACCUM
t.@delta += product_ArrayAccum_const(s.@delta, e.theta)
POST-ACCUM
t.@delta += delta_ArrayAccum(t.@delta, t.@a);
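The two SELECT blocks above mirror standard backpropagation: the output delta is (activation - label), and each hidden delta pulls the output deltas back across the Theta2 edges and scales by the sigmoid derivative a * (1 - a). The derivative step is an assumption about what delta_ArrayAccum computes, since it is the standard choice for sigmoid units. A plain-Python sketch with made-up values:

```python
# Backpropagation of the error for one sample (illustrative values).
a_out = [0.7, 0.2]                 # output activations
y     = [1.0, 0.0]                 # one-hot label
a_hid = [0.62, 0.48, 0.55]         # hidden activations from forward prop
theta2 = [[0.3, -0.1],             # theta2[s][t]: hidden s -> output t
          [-0.2, 0.4],
          [0.1, 0.2]]

# output delta: difference between prediction and label
delta_out = [a - t for a, t in zip(a_out, y)]

# hidden delta: pull output deltas back across edges, scale by a * (1 - a)
delta_hid = []
for s in range(3):
    back = sum(delta_out[t] * theta2[s][t] for t in range(2))
    delta_hid.append(back * a_hid[s] * (1 - a_hid[s]))  # sigmoid derivative
```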
40. Update Theta1
InputLayer = SELECT s FROM InputLayer:s -(Theta1:e)->hiddenLayer:t
ACCUM
DOUBLE D = 0,
IF s.v_in_id == "401" THEN
D = dotProduct_ArrayAccum_List(t.@delta, @@ones_training),
D = D/4000
ELSE
D = dotProduct_ArrayAccum_List(t.@delta , s.x_training),
D = D/4000+lambda*e.theta/4000
END,
e.theta = e.theta - alpha * D;
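The ACCUM body above performs one gradient step per edge: D is the accumulated gradient averaged over the m = 4000 training samples, with L2 regularization lambda applied to non-bias edges, and the weight moves by alpha * D. A plain-Python sketch of the arithmetic for a single non-bias edge, with made-up numbers:

```python
# Gradient step for one Theta1 edge, following the ACCUM body above.
m, alpha, lam = 4000.0, 1.0, 0.01   # sample count, step size, regularization

# accumulated sum over the 4000 samples of delta[t] * x[s]
# (the dotProduct_ArrayAccum_List call in GSQL); value is made up
accumulated_gradient = 8.0
theta = 0.5                          # current weight on edge s -> t

D = accumulated_gradient / m + lam * theta / m   # non-bias edge branch
theta = theta - alpha * D                        # e.theta = e.theta - alpha * D
```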
42. Update Theta2
HiddenLayer = SELECT s FROM HiddenLayer:s -(Theta2:e)->outputLayer:t
ACCUM
DOUBLE D = 0,
IF s.v_hid_id == "26" THEN
D = dotProduct_ArrayAccum_List(t.@delta, @@ones_training),
D = D/4000
ELSE
D = dotProduct_ArrayAccum_ArrayAccum(t.@delta, s.@a),
D = D/4000+lambda*e.theta/4000
END,
e.theta = e.theta - alpha * D;
}
43. Update Theta2
HiddenLayer = SELECT s FROM HiddenLayer:s -(Theta2:e)->outputLayer:t
ACCUM
DOUBLE D = 0,
D = dotProduct_ArrayAccum_ArrayAccum(s.@a , t.@delta),
IF s.v_hid_id == "26" THEN
D = D/4000
ELSE
D = D/4000+lambda*e.theta/4000
END,
e.theta = e.theta - alpha * D;
}
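Putting the pieces together, here is an end-to-end illustrative re-implementation of the training loop (forward propagation, delta backpropagation, gradient step) in plain Python, shrunk to a 2-2-1 network learning logical AND. It sketches the algorithm the deck describes, not the GSQL query itself; the data, sizes, learning rate, and squared-error loss are all choices made for the sketch.

```python
import math, random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# Toy task: learn logical AND with a 2-2-1 sigmoid network.
X = [[0, 0], [0, 1], [1, 0], [1, 1]]
Y = [0, 0, 0, 1]

random.seed(0)                       # deterministic initialization
w1 = [[random.uniform(-1, 1) for _ in range(2)] for _ in range(2)]
b1 = [0.0, 0.0]
w2 = [random.uniform(-1, 1) for _ in range(2)]
b2 = 0.0
alpha = 0.5                          # learning rate

def loss():
    # mean squared error over the four samples
    total = 0.0
    for x, y in zip(X, Y):
        h = [sigmoid(b1[j] + sum(x[i] * w1[i][j] for i in range(2))) for j in range(2)]
        o = sigmoid(b2 + sum(h[j] * w2[j] for j in range(2)))
        total += (o - y) ** 2
    return total / len(X)

initial_loss = loss()
for _ in range(2000):
    for x, y in zip(X, Y):
        # forward propagation
        h = [sigmoid(b1[j] + sum(x[i] * w1[i][j] for i in range(2))) for j in range(2)]
        o = sigmoid(b2 + sum(h[j] * w2[j] for j in range(2)))
        # backpropagation of deltas (sigmoid derivative a * (1 - a))
        d_out = (o - y) * o * (1 - o)
        d_hid = [d_out * w2[j] * h[j] * (1 - h[j]) for j in range(2)]
        # weight updates (plain gradient descent, no regularization)
        for j in range(2):
            w2[j] -= alpha * d_out * h[j]
            for i in range(2):
                w1[i][j] -= alpha * d_hid[j] * x[i]
            b1[j] -= alpha * d_hid[j]
        b2 -= alpha * d_out
final_loss = loss()
```

The same loop structure, scaled to 401 x 25 and 26 x 10 weight matrices and run in parallel over the edges, is what the GSQL query implements.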
44. Summary
• Applications of deep learning
• Demo: recognize handwritten digit
• Implementation using GSQL
• Prediction (forward propagation)
• Training (backpropagation)
45. Q & A
Please send your questions via the Q&A menu in Zoom
47. Additional Resources
Test Drive Online Demo
https://www.tigergraph.com/demo
Download the Developer Edition
https://www.tigergraph.com/download/
Developer Portal
https://www.tigergraph.com/developers/
Guru Scripts
https://github.com/tigergraph/ecosys/tree/master/guru_scripts
Join our Developer Forum
https://groups.google.com/a/opengsql.org/forum/#!forum/gsql-users
linkedin.com/company/TigerGraph
facebook.com/TigerGraphDB
@TigerGraphDB
youtube.com/tigergraph