This deck covers graph processing with Titan and Scylla. It opens with an overview of graph computing and common graph data domains, then describes Apache TinkerPop and the property graph model. It surveys the graph landscape, contrasting graph databases (OLTP) with graph processors (OLAP), and introduces Titan, an open source graph database, along with its key features and architecture. It closes with Scylla as a drop-in replacement for Cassandra as Titan's storage backend, highlighting Scylla's performance benefits for OLTP and the potential for future integration.
3. Common graph data domains
• Social network analysis
• Configuration management database
• Master data management
• Recommendation engines
• Knowledge graphs
• Internet of things
5. Property graph and Gremlin
• Structure
§ Vertex
§ Edge
§ Properties
• Gremlin
§ Domain specific language (DSL) for graph
§ Functional, data flow approach
§ Full library of traversal steps
§ Support for non-JVM languages
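The structure above (vertices, edges, properties) and the data-flow style of a traversal can be sketched in a few lines of plain Python. This is only an illustration, not Gremlin or the TinkerPop API: the class names, the `out` and `values` helpers, and the sample data are all made up for this sketch. The chained call at the end is roughly analogous to the Gremlin traversal `g.V(1).out('knows').out('knows').values('name')`.

```python
from dataclasses import dataclass, field


@dataclass
class Vertex:
    id: int
    label: str
    properties: dict = field(default_factory=dict)


@dataclass
class Edge:
    out_v: int          # id of the vertex the edge leaves
    label: str
    in_v: int           # id of the vertex the edge enters
    properties: dict = field(default_factory=dict)


class PropertyGraph:
    """Toy in-memory property graph (hypothetical, for illustration only)."""

    def __init__(self):
        self.vertices = {}
        self.edges = []

    def add_vertex(self, id, label, **props):
        self.vertices[id] = Vertex(id, label, props)

    def add_edge(self, out_v, label, in_v, **props):
        self.edges.append(Edge(out_v, label, in_v, props))

    def out(self, vertex_ids, label):
        """One traversal step: follow outgoing edges with the given label."""
        ids = set(vertex_ids)
        return (e.in_v for e in self.edges if e.out_v in ids and e.label == label)

    def values(self, vertex_ids, key):
        """Terminal step: read one property from each vertex in the stream."""
        return [self.vertices[v].properties[key] for v in vertex_ids]


g = PropertyGraph()
g.add_vertex(1, "person", name="alice")
g.add_vertex(2, "person", name="bob")
g.add_vertex(3, "person", name="carol")
g.add_edge(1, "knows", 2, since=2014)
g.add_edge(2, "knows", 3, since=2015)

# Each step consumes the stream of vertices produced by the previous step.
friends_of_friends = g.values(g.out(g.out([1], "knows"), "knows"), "name")
```

Here `friends_of_friends` holds the names reached by following two `knows` edges from vertex 1.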
7. Graph Landscape
• Graph database vs Graph processor
§ OLTP vs OLAP
§ Neighborhood vs Whole graph
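The neighborhood vs whole graph distinction can be made concrete with a small sketch. The adjacency map and both function names are hypothetical; the point is only that the OLTP-style query touches one vertex's edges, while the OLAP-style computation has to scan every edge in the graph.

```python
from collections import Counter

# Toy directed graph: vertex -> list of vertices it points to.
adjacency = {
    "a": ["b", "c"],
    "b": ["c"],
    "c": ["a"],
    "d": ["c"],
}


def neighbors(vertex):
    """OLTP-style query: reads only the neighborhood of one vertex."""
    return adjacency.get(vertex, [])


def in_degrees():
    """OLAP-style computation: visits every edge to count incoming edges."""
    counts = Counter({v: 0 for v in adjacency})
    for targets in adjacency.values():
        counts.update(targets)
    return dict(counts)
```

The cost of `neighbors` depends on the degree of a single vertex, whereas the cost of `in_degrees` grows with the size of the whole graph, which is why the two workloads tend to be served by different systems.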
8. Apache Spark or Apache Giraph
• Pick a graph processor for OLAP…
§ Spark is the new hotness in analytics
§ Giraph is better suited for gigantic graphs
• By using Apache TinkerPop and Gremlin, we can use either one seamlessly
9. Titan (Aurelius)
• Pick a graph database for OLTP…
• Pluggable storage backend
• Pluggable indexing backend
• Gift from Matthias Broecheler and Dan LaRocque
• Apache licensed, but not an Apache Software Foundation (ASF) project?
http://titandb.io
10. DataStax Enterprise Graph?
• Apache TinkerPop compliant
• Not open source
• Titan inspired
• Gremlin tooling with DataStax Studio
12. Why Titan?
• Designed for big graphs (10B+ edges)
• Local graph traversals (OLTP)
• Batch graph processing (OLAP)
• Desire a free, open source distributed graph database
13. Titan Key Features
• Data management
• Vertex-centric indices
• Graph partitioning
• Edge compression
http://s3.thinkaurelius.com/docs/titan/1.0.0/getting-started.html
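The idea behind a vertex-centric index can be sketched as follows: the edges incident to one vertex are kept sorted by a chosen key, so a range query on a high-degree vertex becomes a binary search instead of a scan over all of its edges. This is a simplified illustration under that assumption, not Titan's implementation; the class, method names, and sample data are hypothetical.

```python
import bisect


class VertexCentricIndex:
    """Per-vertex, per-label edge lists kept sorted by a sort key."""

    def __init__(self):
        self._keys = {}     # (vertex, label) -> sorted list of sort keys
        self._targets = {}  # (vertex, label) -> targets aligned with _keys

    def add_edge(self, vertex, label, sort_key, target):
        keys = self._keys.setdefault((vertex, label), [])
        targets = self._targets.setdefault((vertex, label), [])
        pos = bisect.bisect_right(keys, sort_key)
        keys.insert(pos, sort_key)
        targets.insert(pos, target)

    def range(self, vertex, label, low, high):
        """Return targets of edges with low <= sort_key < high."""
        keys = self._keys.get((vertex, label), [])
        targets = self._targets.get((vertex, label), [])
        lo = bisect.bisect_left(keys, low)
        hi = bisect.bisect_left(keys, high)
        return targets[lo:hi]


index = VertexCentricIndex()
# Edges arrive out of order; the sort key here is a made-up timestamp.
index.add_edge("celebrity", "followedBy", 10, "u1")
index.add_edge("celebrity", "followedBy", 30, "u3")
index.add_edge("celebrity", "followedBy", 20, "u2")
index.add_edge("celebrity", "followedBy", 40, "u4")

recent = index.range("celebrity", "followedBy", 20, 40)
```

With this layout, `recent` is found by two binary searches over the vertex's own edge list, regardless of how many edges fall outside the requested range.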
15. Why Scylla?
• Drop-in replacement for Cassandra 2.1.8
• Thrift support (Duarte Nunes)
§ Partial support in 1.3
§ Full support in 1.4
• Titan is compatible with Scylla 1.3
§ OLTP with Scylla is crazy fast
§ OLAP via SparkGraphComputer
https://github.com/scylladb/scylla/issues/693
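Because Scylla is presented as a drop-in replacement with Thrift support, one plausible setup is to keep Titan's Cassandra Thrift backend and simply point it at a Scylla node. The sketch below renders such a properties file from Python. The option names (`storage.backend`, `storage.hostname`, `storage.port`, `storage.cassandra.keyspace`) come from the Titan 1.0 configuration reference; the helper function, the hostname, and the choice of the Thrift backend are assumptions for illustration, not a configuration taken from the talk.

```python
def titan_properties(hostname, port=9160, keyspace="titan"):
    """Render a minimal Titan properties file for a Thrift-speaking backend.

    hostname is a placeholder for a Scylla node; 9160 is the conventional
    Thrift port and "titan" is Titan's default keyspace name.
    """
    settings = {
        "storage.backend": "cassandrathrift",
        "storage.hostname": hostname,
        "storage.port": port,
        "storage.cassandra.keyspace": keyspace,
    }
    return "\n".join(f"{key}={value}" for key, value in settings.items())


# Hypothetical hostname, used only to show the rendered output.
config_text = titan_properties("scylla-node-1.example.com")
```

The resulting text could be saved as a `.properties` file and passed to Titan in the same way as a Cassandra configuration, which is what "drop-in" suggests in practice.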
16. Titan reawakened with Scylla
• Next steps
• Benchmarking OLTP and OLAP with Scylla
• Transition Titan to native CQL
§ Essentially a rewrite
§ Materialized views
• Native search in Scylla?
17. Open source leads the way
• Partner with open communities