GraphX is the blue ocean for Scala engineers @ Scala Matsuri 2014
1. Scala Matsuri 2014 LT
GraphX is the blue ocean
for Scala Engineers
@teppei_tosa
https://www.flickr.com/photos/exalthim/337922734
2. Who am I?
@teppei_tosa
Finance IT Engineer
Asakusa / Hadoop /
Scala / Play Framework /
Spark / GraphX
3. • One of Spark's components
• A graph-parallel computation system
• Unifies graph-parallel and data-parallel
computation in one system with a single
composable API
4. Example graph computation: PageRank
• Set every vertex to 1 divided by the number of
vertices (three vertices: 0.33, 0.33, 0.33)
• Divide each vertex's value by its degree (the number of
outgoing edges) and send the result to its neighbors
(values sent along the edges: 0.17, 0.17, 0.33, 0.33)
• Sum the values received from neighbors and set the
sum as the vertex's new value (new values: 0.17, 0.50, 0.33)
• Repeat the send and sum steps until the values
converge
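The steps on this slide can be sketched with plain Scala collections, without Spark. This is only an illustration under stated assumptions: the object `SimplePageRank`, its methods, and the three-vertex edge list are hypothetical names and data chosen so that the first iteration reproduces the slide's numbers; the slide does not show its actual edge layout. The sketch follows the simplified, undamped procedure on the slide, whereas GraphX's built-in `pageRank` additionally applies a reset probability.

```scala
object SimplePageRank {
  type VertexId = Int

  // One iteration: every vertex splits its value evenly over its outgoing
  // edges, then every vertex takes the sum of the values it received.
  def step(edges: Seq[(VertexId, VertexId)],
           ranks: Map[VertexId, Double]): Map[VertexId, Double] = {
    val outDegree: Map[VertexId, Int] =
      edges.groupBy(_._1).map { case (src, out) => src -> out.size }
    val messages: Seq[(VertexId, Double)] =
      edges.map { case (src, dst) => dst -> ranks(src) / outDegree(src) }
    val received: Map[VertexId, Double] =
      messages.groupBy(_._1).map { case (dst, ms) => dst -> ms.map(_._2).sum }
    // Vertices that received nothing get 0.0.
    ranks.map { case (v, _) => v -> received.getOrElse(v, 0.0) }
  }

  // Start from 1 / (number of vertices) and repeat `step` until the largest
  // per-vertex change is below `tolerance` (or `maxIterations` is reached).
  // Assumes a non-empty edge list.
  def run(edges: Seq[(VertexId, VertexId)],
          tolerance: Double = 1e-6,
          maxIterations: Int = 100): Map[VertexId, Double] = {
    val vertices = edges.flatMap { case (src, dst) => Seq(src, dst) }.distinct
    val initial = vertices.map(v => v -> 1.0 / vertices.size).toMap

    @annotation.tailrec
    def loop(ranks: Map[VertexId, Double], iteration: Int): Map[VertexId, Double] = {
      val next = step(edges, ranks)
      val delta = ranks.keys.map(v => math.abs(next(v) - ranks(v))).max
      if (delta < tolerance || iteration >= maxIterations) next
      else loop(next, iteration + 1)
    }
    loop(initial, 1)
  }

  def main(args: Array[String]): Unit = {
    // Assumed example graph: 1 -> 2, 1 -> 3, 2 -> 3, 3 -> 1
    val edges = Seq(1 -> 2, 1 -> 3, 2 -> 3, 3 -> 1)
    val uniform = Map(1 -> 1.0 / 3, 2 -> 1.0 / 3, 3 -> 1.0 / 3)
    println(step(edges, uniform)) // values after a single iteration
    println(run(edges))           // values after convergence
  }
}
```

With the assumed edges, a single `step` from the uniform start gives about 0.33, 0.17 and 0.50 for vertices 1, 2 and 3, the same three values as on the slide. Note that `step` is just `groupBy`, `map` and `sum` over an edge list, which is the collection-style thinking that carries over to GraphX.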
5. Difficulty of graph-parallel computation
Because vertices are connected to each other,
computing them in a distributed way requires
communication between nodes
( Apache Giraph coordinates its workers through ZooKeeper )
7. Graph data around you
Social Network / Train Network / Data Network
8. What you will be able to do with graph data
Evaluate Vertices / Clustering / Graph Shape /
Flow on Graph / Link Prediction
9. GraphX is
Still young
• Not enough
information on the web
• Far fewer functions
than other graph libraries
such as R's igraph
https://www.flickr.com/photos/katedot/8272997562
10. My work on GraphX
• Translated the GraphX documentation into Japanese
• https://gist.github.com/ironpeace/9306874
• Graph utility
• https://github.com/ironpeace/graph-web
11. Advantages for Scala Engineers
• Graph data is handled through an API that resembles
Scala's collections API
• Easy to implement recursive
computation
• Easy to implement functions that process
graph data iteratively
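To make the collection-like feel concrete, here is a minimal sketch against the GraphX API. It assumes `spark-core` and `spark-graphx` are on the classpath and that a local master is acceptable; the object name `GraphXCollectionStyle`, the method `inDegreesOfSample` and the four-edge sample graph are hypothetical and chosen only for illustration.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.graphx.{Edge, Graph}

object GraphXCollectionStyle {

  // Builds a small hard-coded graph and returns the in-degree of each vertex.
  def inDegreesOfSample(sc: SparkContext): Map[Long, Int] = {
    // Assumed sample edges: 1 -> 2, 1 -> 3, 2 -> 3, 3 -> 1 (edge attribute unused)
    val edges = sc.parallelize(Seq(
      Edge(1L, 2L, 1), Edge(1L, 3L, 1), Edge(2L, 3L, 1), Edge(3L, 1L, 1)))
    val graph = Graph.fromEdges(edges, defaultValue = 0)

    // Join each vertex with its in-degree; vertices with no incoming
    // edges are missing from `inDegrees`, hence the Option.
    val withInDegree = graph.outerJoinVertices(graph.inDegrees) {
      (id, attr, deg) => deg.getOrElse(0)
    }
    withInDegree.vertices.collect().toMap
  }

  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setMaster("local[*]").setAppName("graphx-sketch")
    val sc = new SparkContext(conf)
    try {
      val degrees = inDegreesOfSample(sc)
      // From here on it is ordinary Scala collection code.
      val popular = degrees.filter { case (_, d) => d >= 2 }.keys
      println(s"in-degrees: $degrees, vertices with 2+ incoming edges: $popular")
    } finally {
      sc.stop()
    }
  }
}
```

The graph operators (`outerJoinVertices`, and similarly `mapVertices` or `subgraph`) take ordinary Scala functions, so moving between a local `Map` and a distributed graph mostly means changing the receiver, not the style of the code.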
12. GraphX is the blue ocean for YOU!
• GraphX is a good solution for graph-parallel
computation
• Handling graph-structured data gives you the
power to work out things you have
never been able to before
• GraphX is still young
• Scala engineers have an advantage with graph data
13. Thank you!
Get the Graph Power!
@teppei_tosa