1. Graph Gurus Episode 19
Deep Learning Implemented By GSQL
On A Native Parallel Graph Database
2. © 2019 TigerGraph. All Rights Reserved
Welcome
• Attendees are muted but you can talk to us via Chat in Zoom
• We will have 10 min for Q&A at the end
• Send questions at any time using the Q&A tab in the Zoom menu
• The webinar will be recorded
• A link to the presentation and reproducible steps will be emailed
Developer Edition Download -
https://www.tigergraph.com/developer/
TigerGraph Cloud -
https://www.tigergraph.com/cloud/
3. Speaking Today
● BS in Mechanical Engineering, Tsinghua
University, China
● MS & PhD in Mechanical Engineering, Stanford
University focused on numerical simulations of
nanophysics
● PhD minor in Philosophy focused on applications
of mathematical logic in artificial intelligence
Changran Liu,
Solution Architect
Victor Lee,
Director of Product
Management
● BS in Electrical Engineering and Computer
Science from UC Berkeley, MS in Electrical
Engineering from Stanford University
● PhD in Computer Science from Kent State
University focused on graph data mining
● 15+ years in tech industry
4. Outline
● Graphs and Machine Learning
● Deep Learning Basics
● GSQL Neural Network Demo:
recognizing handwritten digits
● GSQL Neural Network: closer look
• Prediction (forward propagation)
• Cost function
• Training (backpropagation)
• Recap
5. Two Hot Items: A Good Match?
Graph Database: powerful way to represent information and to query it
Machine Learning: established powerhouse for predictions and "smart" systems, but still tricky to use
6. Graph Is How WE THINK
Natural and organic approach for representing all kinds of things, ideas, and how they relate to one another.
Identify key data and process massive amounts of data; use the power of relationships and deep analytics to provide insights.
(Diagram: a graph connecting Profile, Events, Locations, Browsing History, Transactions, and Purchases to applications such as Recommendations, New Products, Predictions, Supply Chain, CyberSecurity, and Blockchain / Network Analysis.)
7. Some Uses of Machine Learning
● Virtual Personal Assistants
○ Voice-to-text
○ Semantic analysis
○ Formulate a response
● Real-time route optimization
○ Waze, Google Maps
● Image Recognition
○ Facial recognition
○ X-ray / CT scan reading
● Effective spam filters
8. Some Machine Learning Techniques
● Unsupervised Learning
No "correct" answer; detect patterns or summarize
○ Clustering, Partitioning, Outlier detection
○ Neural Networks
○ Dimensionality reduction
● Supervised Learning
Humans provide "training data" with known correct answers
○ Decision Trees
○ Nearest Neighbor
○ Hidden Markov Model, Naïve Bayes
○ Linear Regression
○ Support Vector Machines (SVM)
○ Neural Networks, Deep Learning
9. Graph Analytics - Explainable Results
● The data model itself is explanatory
● Graph Analytics = Connecting the Dots
10. Three Ways to Use Graph for Machine Learning
● Use Graph Algorithms for Unsupervised Learning
○ Frequency Patterns
○ Community Detection
○ Ranking
● Extract Graph Features to use for Supervised Learning
○ Graph provides unique, richer features
○ Can work with many different ML techniques
○ Manual Feature Selection or Graph Embedding, representing the entire graph as features of each vertex
● Use the Graph as a Neural Network
11. Neural Network
● Input layer: input variables
● Output layer: the result
● Hidden layer: internal intermediate variables
● Training process will set the weights of the connecting edges.
https://www.pnas.org/content/116/4/1074 "What are the limits of deep learning?"
Neural Networks ARE graphs!
Training can be computationally intense:
● many edges
● each edge weight may be recomputed for each iteration.
⇒ Need Native Parallel Graph for performance
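The point that a neural network is just a weighted graph can be made concrete with a minimal sketch (plain Python, illustrative values only): each vertex computes a weighted sum over its incoming edges, then applies a sigmoid activation. The edge weights here play the role of the theta values that training adjusts.

```python
import math

def sigmoid(z):
    # logistic activation squashing any real number into (0, 1)
    return 1.0 / (1.0 + math.exp(-z))

# One vertex's output: weighted sum of its incoming edge values, then sigmoid.
# The weights correspond to the values stored on the graph's edges.
inputs  = [0.5, -1.0, 2.0]   # activations arriving on incoming edges
weights = [0.8,  0.3, -0.5]  # one weight per edge, set during training
bias    = 0.1                # bias term (the extra "+1" vertex in the demo)

z = bias + sum(x * w for x, w in zip(inputs, weights))
activation = sigmoid(z)
```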
12. Neural Network &amp; Deep Learning
Deep Learning is simply using a Neural Network with multiple hidden layers.
● Layers may be pre-trained to guide their behavior
https://www.pnas.org/content/116/4/1074 "What are the limits of deep learning?"
13. Deep Learning Technologies
Facial recognition Real-time translation Strategy games: Go, Chess
14. Outline
● Graphs and Machine Learning
● Deep Learning Basics
● GSQL Neural Network Demo:
recognizing handwritten digits
● GSQL Neural Network: closer look
• Prediction (forward propagation)
• Cost function
• Training (backpropagation)
• Recap
15. Recognize Handwritten Digits
20 by 20 Grayscale Image → Neural Network
(Diagram: the network maps each image to one of the ten digit classes 1, 2, ..., 9, 0.)
16. Problem Introduction
Examples of Training Dataset
• Goal: recognition of handwritten digits
• Data: each input is a 20 by 20 grayscale image showing a handwritten digit (a 400-dimensional vector)
• 5000 images are split into two parts:
  • Training: 4000 images
  • Validation: 1000 images
Training data from: https://www.coursera.org/learn/machine-learning
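The 4000/1000 split above can be sketched in a few lines of plain Python; the placeholder records stand in for the real 400-pixel images and are not the actual dataset:

```python
import random

# Hypothetical stand-in for the 5000 labeled images: each sample is a
# (pixel_vector, label) pair; real inputs are 400-dimensional (20 x 20).
dataset = [([0.0] * 400, i % 10) for i in range(5000)]

random.seed(42)          # fixed seed so the split is reproducible
random.shuffle(dataset)

training   = dataset[:4000]   # used to fit the edge weights
validation = dataset[4000:]   # held out to measure accuracy
```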
17. Demo
● Graphs and Machine Learning
● Deep Learning Basics
● GSQL Neural Network Demo:
recognizing handwritten digits
● GSQL Neural Network: closer look
• Prediction (forward propagation)
• Cost function
• Training (backpropagation)
• Recap
18. Schema
• Training/validation data (x, y) are stored on the inputLayer and outputLayer vertices respectively.
• Randomly generated weights are initially stored on the Theta1 and Theta2 edges.
inputLayer: v_in_id STRING, x_training LIST&lt;DOUBLE&gt;, x_validation LIST&lt;DOUBLE&gt;
hiddenLayer: v_hid_id STRING
outputLayer: v_out_id STRING, y_training LIST&lt;DOUBLE&gt;, y_validation LIST&lt;DOUBLE&gt;
Theta1, Theta2 (edges): theta DOUBLE
20. Graph
Input Layer: 400 vertices for 20 x 20 pixels + 1 bias
Hidden Layer: 25 vertices + 1 bias
Output Layer: 10 vertices representing 0 ~ 9
Theta1: 401 x 25 edges
Theta2: 26 x 10 edges
21. Loading Data
(Slides 21 - 23 progressively draw the same structure as slide 20: the input layer with 400 pixel vertices plus 1 bias, the hidden layer with 25 vertices plus 1 bias, and the output layer with 10 vertices for 0 ~ 9, connected by the 401 x 25 Theta1 edges and 26 x 10 Theta2 edges.)
24. Loading Data
The 4000 training images are loaded as x_training on the input-layer vertices, with their labels as y_training on the output-layer vertices.
25. Loading Data
The 1000 validation images are loaded as x_validation on the input-layer vertices, with their labels as y_validation on the output-layer vertices.
26. Neuron
A (biological) neuron
• receives signals
• processes the signals
• sends signals to the neurons connected to it.
Learning involves
• growing topic-specific dendrites to connect specific neurons at specific synapses.
An artificial neuron
• same as above
Learning involves
• adjusting the weights of the network to improve the accuracy of the result.
• This is done by minimizing the observed errors.
https://en.wikipedia.org/wiki/Artificial_neural_network#/media/File:Neuron3.png
27. Training a Neural Network
input → forward propagation → diff. btw prediction and label → converged? (no: backpropagation → update weights → repeat; yes: finish)
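The deck does not spell out the cost behind "diff. btw prediction and label"; a common choice when the outputs come from sigmoids is the logistic (cross-entropy) cost, sketched below in plain Python. Treat the formula as an assumption, not the deck's stated method:

```python
import math

def logistic_cost(predictions, labels):
    # Cross-entropy cost commonly paired with sigmoid outputs; this is an
    # assumption, since the deck only says "diff. btw prediction and label".
    m = len(predictions)
    total = 0.0
    for h, y in zip(predictions, labels):
        total += -y * math.log(h) - (1 - y) * math.log(1 - h)
    return total / m

# three predictions in (0, 1) against binary labels
cost = logistic_cost([0.9, 0.2, 0.8], [1, 0, 1])
```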
29. Forward Propagation
CREATE QUERY training() FOR GRAPH NeuralNetwork {
/* variables declaration */
…
/* forward propagation*/
InputLayer = {inputLayer.*};
InputLayer = SELECT s FROM InputLayer:s -(Theta1:e)->hiddenLayer:t
ACCUM
IF s.v_in_id == "401" THEN
t.@a_training += product_List_const(@@ones_training, e.theta)
ELSE
t.@a_training += product_List_const(s.x_training, e.theta)
END
POST-ACCUM
t.@a_training = sigmoid_ArrayAccum(t.@a_training);
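The SELECT above distributes a matrix-vector product over the Theta1 edges: each hidden vertex accumulates x[s] * theta over its incoming edges, with the extra "+1" input vertex (v_in_id == "401") supplying the bias, and POST-ACCUM applies the sigmoid. A dense plain-Python equivalent, with dimensions shrunk from 400 x 25 to 4 x 3 and made-up values, may help map the steps to familiar linear algebra:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# Dense equivalent of the Theta1 edge traversal (illustrative sizes only).
x = [0.2, 0.7, 0.1, 0.9]                     # one input sample
theta1 = [[0.1, -0.2, 0.3],                  # theta1[s][t]: input s -> hidden t
          [0.4, 0.1, -0.1],
          [-0.3, 0.2, 0.2],
          [0.2, -0.4, 0.1]]
bias1 = [0.05, -0.05, 0.1]                   # weights on the bias edges

hidden = []
for t in range(3):
    # ACCUM: each hidden vertex sums x[s] * theta over incoming edges + bias
    z = bias1[t] + sum(x[s] * theta1[s][t] for s in range(4))
    hidden.append(sigmoid(z))                # POST-ACCUM: apply sigmoid
```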
32. Forward Propagation
HiddenLayer = {hiddenLayer.*};
HiddenLayer = SELECT s FROM HiddenLayer:s -(Theta2:e)->outputLayer:t
ACCUM
IF s.v_hid_id == "26" THEN
t.@a_training += product_List_const(@@ones_training, e.theta)
ELSE
t.@a_training += product_ArrayAccum_const(s.@a_training, e.theta)
END
POST-ACCUM
t.@a_training = sigmoid_ArrayAccum(t.@a_training);
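The second hop works the same way: output vertices accumulate hidden activations times the Theta2 edge weights, with the hidden bias vertex (v_hid_id == "26") contributing the bias term, then apply sigmoid. A shrunk plain-Python sketch (3 x 2 instead of 25 x 10, made-up values):

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# Dense equivalent of the Theta2 edge traversal (illustrative sizes only).
a_hidden = [0.62, 0.48, 0.55]        # activations from the first hop
theta2 = [[0.3, -0.1],               # theta2[s][t]: hidden s -> output t
          [-0.2, 0.4],
          [0.1, 0.2]]
bias2 = [0.0, 0.05]                  # weights on the bias edges

output = []
for t in range(2):
    z = bias2[t] + sum(a_hidden[s] * theta2[s][t] for s in range(3))
    output.append(sigmoid(z))        # each entry scores one digit class
```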
36. Backpropagation
OutputLayer = SELECT s FROM OutputLayer:s
ACCUM
s.@delta += diff_ArrayAccum_List(s.@a_training, s.y_training);
OutputLayer = SELECT s FROM OutputLayer:s -(Theta2:e)->hiddenLayer:t
ACCUM
t.@delta += product_ArrayAccum_const(s.@delta, e.theta)
POST-ACCUM
t.@delta += delta_ArrayAccum(t.@delta, t.@a);
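The two SELECT blocks above mirror standard backpropagation: the output delta is (activation - label), and each hidden delta pulls the output deltas back across the Theta2 edges and scales by the sigmoid derivative a * (1 - a). The derivative step is an assumption about what delta_ArrayAccum computes, since it is the standard choice for sigmoid units. A plain-Python sketch with made-up values:

```python
# Backpropagation of the error for one sample (illustrative values).
a_out = [0.7, 0.2]                 # output activations
y     = [1.0, 0.0]                 # one-hot label
a_hid = [0.62, 0.48, 0.55]         # hidden activations from forward prop
theta2 = [[0.3, -0.1],             # theta2[s][t]: hidden s -> output t
          [-0.2, 0.4],
          [0.1, 0.2]]

# output delta: difference between prediction and label
delta_out = [a - t for a, t in zip(a_out, y)]

# hidden delta: pull output deltas back across edges, scale by a * (1 - a)
delta_hid = []
for s in range(3):
    back = sum(delta_out[t] * theta2[s][t] for t in range(2))
    delta_hid.append(back * a_hid[s] * (1 - a_hid[s]))  # sigmoid derivative
```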
40. Update Theta1
InputLayer = SELECT s FROM InputLayer:s -(Theta1:e)->hiddenLayer:t
ACCUM
DOUBLE D = 0,
IF s.v_in_id == "401" THEN
D = dotProduct_ArrayAccum_List(t.@delta, @@ones_training),
D = D/4000
ELSE
D = dotProduct_ArrayAccum_List(t.@delta , s.x_training),
D = D/4000+lambda*e.theta/4000
END,
e.theta = e.theta - alpha * D;
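The ACCUM body above performs one gradient step per edge: D is the accumulated gradient averaged over the m = 4000 training samples, with L2 regularization lambda applied to non-bias edges, and the weight moves by alpha * D. A plain-Python sketch of the arithmetic for a single non-bias edge, with made-up numbers:

```python
# Gradient step for one Theta1 edge, following the ACCUM body above.
m, alpha, lam = 4000.0, 1.0, 0.01   # sample count, step size, regularization

# accumulated sum over the 4000 samples of delta[t] * x[s]
# (the dotProduct_ArrayAccum_List call in GSQL); value is made up
accumulated_gradient = 8.0
theta = 0.5                          # current weight on edge s -> t

D = accumulated_gradient / m + lam * theta / m   # non-bias edge branch
theta = theta - alpha * D                        # e.theta = e.theta - alpha * D
```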
42. Update Theta2
HiddenLayer = SELECT s FROM HiddenLayer:s -(Theta2:e)->outputLayer:t
ACCUM
DOUBLE D = 0,
IF s.v_hid_id == "26" THEN
D = dotProduct_ArrayAccum_List(t.@delta, @@ones_training),
D = D/4000
ELSE
D = dotProduct_ArrayAccum_ArrayAccum(t.@delta, s.@a),
D = D/4000+lambda*e.theta/4000
END,
e.theta = e.theta - alpha * D;
}
43. Update Theta2
HiddenLayer = SELECT s FROM HiddenLayer:s -(Theta2:e)->outputLayer:t
ACCUM
DOUBLE D = 0,
D = dotProduct_ArrayAccum_ArrayAccum(s.@a , t.@delta),
IF s.v_hid_id == "26" THEN
D = D/4000
ELSE
D = D/4000+lambda*e.theta/4000
END,
e.theta = e.theta - alpha * D;
}
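Putting the pieces together, here is an end-to-end illustrative re-implementation of the training loop (forward propagation, delta backpropagation, gradient step) in plain Python, shrunk to a 2-2-1 network learning logical AND. It sketches the algorithm the deck describes, not the GSQL query itself; the data, sizes, learning rate, and squared-error loss are all choices made for the sketch.

```python
import math, random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# Toy task: learn logical AND with a 2-2-1 sigmoid network.
X = [[0, 0], [0, 1], [1, 0], [1, 1]]
Y = [0, 0, 0, 1]

random.seed(0)                       # deterministic initialization
w1 = [[random.uniform(-1, 1) for _ in range(2)] for _ in range(2)]
b1 = [0.0, 0.0]
w2 = [random.uniform(-1, 1) for _ in range(2)]
b2 = 0.0
alpha = 0.5                          # learning rate

def loss():
    # mean squared error over the four samples
    total = 0.0
    for x, y in zip(X, Y):
        h = [sigmoid(b1[j] + sum(x[i] * w1[i][j] for i in range(2))) for j in range(2)]
        o = sigmoid(b2 + sum(h[j] * w2[j] for j in range(2)))
        total += (o - y) ** 2
    return total / len(X)

initial_loss = loss()
for _ in range(2000):
    for x, y in zip(X, Y):
        # forward propagation
        h = [sigmoid(b1[j] + sum(x[i] * w1[i][j] for i in range(2))) for j in range(2)]
        o = sigmoid(b2 + sum(h[j] * w2[j] for j in range(2)))
        # backpropagation of deltas (sigmoid derivative a * (1 - a))
        d_out = (o - y) * o * (1 - o)
        d_hid = [d_out * w2[j] * h[j] * (1 - h[j]) for j in range(2)]
        # weight updates (plain gradient descent, no regularization)
        for j in range(2):
            w2[j] -= alpha * d_out * h[j]
            for i in range(2):
                w1[i][j] -= alpha * d_hid[j] * x[i]
            b1[j] -= alpha * d_hid[j]
        b2 -= alpha * d_out
final_loss = loss()
```

The same loop structure, scaled to 401 x 25 and 26 x 10 weight matrices and run in parallel over the edges, is what the GSQL query implements.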
44. Summary
• Applications of deep learning
• Demo: recognize handwritten digit
• Implementation using GSQL
• Prediction (forward propagation)
• Training (backpropagation)
45. Q & A
Please send your questions via the Q&A menu in Zoom
47. Additional Resources
Test Drive Online Demo
https://www.tigergraph.com/demo
Download the Developer Edition
https://www.tigergraph.com/download/
Developer Portal
https://www.tigergraph.com/developers/
Guru Scripts
https://github.com/tigergraph/ecosys/tree/master/guru_scripts
Join our Developer Forum
https://groups.google.com/a/opengsql.org/forum/#!forum/gsql-users
linkedin.com/company/TigerGraph
facebook.com/TigerGraphDB
@TigerGraphDB
youtube.com/tigergraph