OQGraph at MySQL Users Conference 2011
- 1. OQGRAPH
Graphs and Hierarchies in Plain SQL
Antony T Curtis <atcurtis@gmail.com>
graph@openquery.com
http://openquery.com/graph
- 2. Hierarchies / Trees
● Trees typically have a single "root" node.
● All child nodes have only one parent.
Other examples:
● Menu structures.
● Organisation charts.
● Filesystem directories.
OQGRAPH computation engine © 2009-2011 Open Query
- 3. Graphs / Networks
● Nodes connected by Edges.
● Edges may be directional.
● Edges may have a "weight" / "cost" attribute.
● Directed graphs may have bi-directional edges.
● Unconnected sets of nodes may exist on same graph.
● There need not be a "root" node.
Examples:
● "Social Graphs" / friend relationships.
● Decision / State graphs.
● Airline routes
- 4. Problem Solving
Trees:
● Does Dilbert report to the PHB?
● How many people report to manager X?
● How many people are between the CEO and employee Y?
Networks:
● What is the quickest air route to MLA from SJC?
● What is the shortest path of decisions to get to state #11 from state #5?
● Playing "Six Degrees of Kevin Bacon"
- 5. RDBMS with Hierarchies and Graphs
● Not always a particularly good fit.
● Various tree models exist; each with limitations:
○ Adjacency model
■ Either uses fixed max depth or recursive queries.
■ Oracle has CONNECT BY PRIOR
■ SQL99 has WITH RECURSIVE...UNION...
○ Nested set
■ complex
■ recursive queries to find path to root.
○ Materialised path
■ Ugly and not relational.
■ Can be quite effective when used correctly.
Further reading: http://dev.mysql.com/tech-resources/articles/hierarchical-data.html
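The recursive-query form of the adjacency model can be sketched with Python's built-in sqlite3 module, which supports WITH RECURSIVE. The staff table, its columns, and the reports_to helper are hypothetical names made up for this illustration; they answer the "Does Dilbert report to the PHB?" style of question by walking up the manager chain.

```python
import sqlite3

# Hypothetical adjacency-model table: each employee row points at its manager.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE staff (id INTEGER PRIMARY KEY, name TEXT, manager_id INTEGER);
    INSERT INTO staff VALUES
        (1, 'CEO', NULL),
        (2, 'PHB', 1),
        (3, 'Dilbert', 2),
        (4, 'Wally', 2);
""")

def reports_to(conn, employee, manager):
    """Return True if `manager` appears anywhere above `employee` in the tree."""
    row = conn.execute("""
        WITH RECURSIVE chain(id, name, manager_id) AS (
            -- Start at the employee, then repeatedly step up to the manager.
            SELECT id, name, manager_id FROM staff WHERE name = ?
            UNION ALL
            SELECT s.id, s.name, s.manager_id
            FROM staff s JOIN chain c ON s.id = c.manager_id
        )
        SELECT COUNT(*) FROM chain WHERE name = ? AND name <> ?
    """, (employee, manager, employee)).fetchone()
    return row[0] > 0

print(reports_to(conn, "Dilbert", "PHB"))   # True
print(reports_to(conn, "PHB", "Dilbert"))   # False
```

Every question about the tree needs its own recursive query of this kind, which is part of why the fit with a relational database is not always comfortable.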
- 6. What is OQGRAPH?
● Implemented as a storage engine.
○ Original concept by Arjen Lentz
○ for MySQL
○ for Drizzle
○ for MariaDB
● Mk. II implementation by
○ Antony Curtis
○ Arjen Lentz @openquery
● Mk. III dev. on LaunchPad
● Licensing
○ GPLv2+
- 7. OQGRAPH: A Computation Engine
● It is not a general purpose data engine.
○ unlike MyISAM, InnoDB, PBXT or MEMORY.
● Looks like an ordinary table.
● Has a very different internal architecture.
● It does not operate in terms of
○ storing data for later retrieval.
○ having indexes on data.
● May be regarded as a "magic view" or "table function".
- 8. Getting OQGRAPH
MariaDB - available as a plugin.
● Included in mainline MariaDB 5.2 builds.
○ INSTALL PLUGIN oqgraph SONAME 'oqgraph_engine';
● Or build it for yourself.
○ All MySQL/MariaDB storage engines should be built with
same debug/compile flags for correct behaviour.
● Check with SHOW PLUGINS and SHOW STORAGE ENGINES.
● 64-bit Windows build is currently unstable.
MySQL 5.0 does not support plugins, so OQGRAPH must be compiled in.
● Binaries available from ourdelta.org
● Included in '-sail' builds since 5.0.87-d10
○ SHOW GLOBAL VARIABLES LIKE 'have_oqgraph';
Drizzle
● Basic port has been done.
- 9. Anatomy of an OQGRAPH table
CREATE TABLE db.tblname (
latch SMALLINT UNSIGNED NULL,
origid BIGINT UNSIGNED NULL,
destid BIGINT UNSIGNED NULL,
weight DOUBLE NULL,
seq BIGINT UNSIGNED NULL,
linkid BIGINT UNSIGNED NULL,
KEY (latch, origid, destid) USING HASH,
KEY (latch, destid, origid) USING HASH
) ENGINE=OQGRAPH;
Note: Mk.3 has a few additional options, discussed later.
- 10. OQGRAPH Mk.II - Inserting data
● Only directed edges are inserted into its memory store.
● Edge weights are optional and default to 1.0.
● Undirected edges may be represented as two directed
edges, in opposite directions.
INSERT INTO foo (origid,destid) VALUES
(1,2), (2,3), (2,4),
(4,5), (3,6), (5,6);
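As a rough mental model only (not OQGRAPH's internal representation), the inserted rows can be pictured as an adjacency map. The Python sketch below uses hypothetical helpers build_graph and as_undirected to show the default weight of 1.0 and how an undirected edge becomes two directed ones.

```python
from collections import defaultdict

# The six directed edges from the INSERT above, as (origid, destid) pairs.
EDGES = [(1, 2), (2, 3), (2, 4), (4, 5), (3, 6), (5, 6)]

def build_graph(edges, default_weight=1.0):
    """Map each origin node to a {destination: weight} dict."""
    graph = defaultdict(dict)
    for orig, dest in edges:
        graph[orig][dest] = default_weight
    return graph

def as_undirected(edges):
    """Emit each edge twice, once per direction."""
    return [pair for a, b in edges for pair in ((a, b), (b, a))]

directed = build_graph(EDGES)
undirected = build_graph(as_undirected(EDGES))
print(dict(directed[2]))     # {3: 1.0, 4: 1.0}
print(dict(undirected[2]))   # {1: 1.0, 3: 1.0, 4: 1.0}
```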
- 11. Selecting Edges
SELECT * FROM foo;
+-------+--------+--------+--------+------+--------+
| latch | origid | destid | weight | seq | linkid |
+-------+--------+--------+--------+------+--------+
| NULL | 1 | 2 | 1 | 0 | NULL |
| NULL | 2 | 3 | 1 | 1 | NULL |
| NULL | 2 | 4 | 1 | 2 | NULL |
| NULL | 4 | 5 | 1 | 3 | NULL |
| NULL | 3 | 6 | 1 | 4 | NULL |
| NULL | 5 | 6 | 1 | 5 | NULL |
+-------+--------+--------+--------+------+--------+
- 12. Now, it's time for some magic.
(shortest path calculation)
● SELECT * FROM foo
WHERE latch=1 AND origid=1 AND destid=6;
+-------+--------+--------+--------+------+--------+
| latch | origid | destid | weight | seq | linkid |
+-------+--------+--------+--------+------+--------+
| 1 | 1 | 6 | NULL | 0 | 1 |
| 1 | 1 | 6 | 1 | 1 | 2 |
| 1 | 1 | 6 | 1 | 2 | 3 |
| 1 | 1 | 6 | 1 | 3 | 6 |
+-------+--------+--------+--------+------+--------+
● SELECT GROUP_CONCAT(linkid ORDER BY seq) AS path
FROM foo WHERE latch=1 AND origid=1 AND destid=6 \G
path: 1,2,3,6
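To see where the path 1,2,3,6 comes from, here is a minimal Python sketch of Dijkstra's shortest-path algorithm over the same six edges, each with the default weight of 1.0. The function name shortest_path is made up for this illustration, and the code shows the general technique rather than OQGRAPH's own implementation.

```python
import heapq

# The example edges as (origid, destid, weight), using the default weight 1.0.
EDGES = [(1, 2, 1.0), (2, 3, 1.0), (2, 4, 1.0),
         (4, 5, 1.0), (3, 6, 1.0), (5, 6, 1.0)]

def shortest_path(edges, origid, destid):
    """Return the lowest-cost node sequence from origid to destid, or []."""
    graph = {}
    for orig, dest, weight in edges:
        graph.setdefault(orig, []).append((dest, weight))
    best = {origid: 0.0}        # cheapest known cost to reach each node
    prev = {}                   # predecessor on the cheapest known path
    queue = [(0.0, origid)]
    while queue:
        cost, node = heapq.heappop(queue)
        if node == destid:
            break
        if cost > best.get(node, float("inf")):
            continue            # stale queue entry
        for nxt, weight in graph.get(node, []):
            new_cost = cost + weight
            if new_cost < best.get(nxt, float("inf")):
                best[nxt] = new_cost
                prev[nxt] = node
                heapq.heappush(queue, (new_cost, nxt))
    if destid not in best:
        return []
    path = [destid]
    while path[-1] != origid:
        path.append(prev[path[-1]])
    return path[::-1]

print(shortest_path(EDGES, 1, 6))  # [1, 2, 3, 6]
```

The alternative route 1,2,4,5,6 has total weight 4, so the three-edge route through node 3 wins.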
- 13. Other computations,
● Which paths lead to node 4?
SELECT GROUP_CONCAT(linkid) AS list
FROM foo WHERE latch=1 AND destid=4 \G
list: 1,2,4
● Where can I get to from node 4?
SELECT GROUP_CONCAT(linkid) AS list
FROM foo WHERE latch=1 AND origid=4 \G
list: 6,5,4
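Both queries are reachability questions: leaving out destid asks what can be reached from a node, and leaving out origid asks what can reach it. A small Python sketch of the same idea follows, with a hypothetical reachable helper that walks the edges forwards or backwards. It returns sorted node ids, so the ordering differs from the lists above.

```python
from collections import deque

# The six example edges as (origid, destid) pairs.
EDGES = [(1, 2), (2, 3), (2, 4), (4, 5), (3, 6), (5, 6)]

def reachable(edges, start, reverse=False):
    """Nodes reachable from `start`; with reverse=True, nodes that lead to it."""
    graph = {}
    for orig, dest in edges:
        if reverse:
            orig, dest = dest, orig   # follow edges backwards
        graph.setdefault(orig, []).append(dest)
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in graph.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return sorted(seen)

print(reachable(EDGES, 4))                # [4, 5, 6]
print(reachable(EDGES, 4, reverse=True))  # [1, 2, 4]
```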
- 14. Other computations, continued.
● See docs for latch 0 and latch NULL
● latch 1 : Dijkstra's shortest path.
○ O((V + E).log V)
● latch 2 : Breadth-first search.
○ O(V+E)
● Other algorithms possible
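The difference between the two algorithms only shows up once edge weights differ. The Python sketch below illustrates the general techniques on a hypothetical three-edge graph that is not taken from the slides; the function names are made up, and the code says nothing about how OQGRAPH formats its results for either latch value.

```python
import heapq
from collections import deque

# Hypothetical weighted edges (origid, destid, weight); not from the slides.
EDGES = [(1, 3, 10.0), (1, 2, 1.0), (2, 3, 1.0)]

def _adjacency(edges):
    graph = {}
    for orig, dest, weight in edges:
        graph.setdefault(orig, []).append((dest, weight))
    return graph

def _walk_back(prev, origid, destid):
    """Rebuild the path from the predecessor map, or [] if unreachable."""
    if destid != origid and destid not in prev:
        return []
    path = [destid]
    while path[-1] != origid:
        path.append(prev[path[-1]])
    return path[::-1]

def fewest_hops(edges, origid, destid):
    """Breadth-first search: ignores weights, O(V + E)."""
    graph, prev, seen = _adjacency(edges), {}, {origid}
    queue = deque([origid])
    while queue:
        node = queue.popleft()
        for nxt, _ in graph.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                prev[nxt] = node
                queue.append(nxt)
    return _walk_back(prev, origid, destid)

def lowest_cost(edges, origid, destid):
    """Dijkstra with a binary heap: O((V + E) log V)."""
    graph, prev, best = _adjacency(edges), {}, {origid: 0.0}
    queue = [(0.0, origid)]
    while queue:
        cost, node = heapq.heappop(queue)
        if cost > best[node]:
            continue                      # stale queue entry
        for nxt, weight in graph.get(node, []):
            if cost + weight < best.get(nxt, float("inf")):
                best[nxt] = cost + weight
                prev[nxt] = node
                heapq.heappush(queue, (cost + weight, nxt))
    return _walk_back(prev, origid, destid)

print(fewest_hops(EDGES, 1, 3))  # [1, 3]
print(lowest_cost(EDGES, 1, 3))  # [1, 2, 3]
```

With all weights equal, as in the foo example, both functions return a path with the same number of edges.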
- 15. Joins make it prettier,
● INSERT INTO people VALUES
(1,'pearce'), (2,'hunnicut'), (3,'potter'),
(4,'hoolihan'), (5,'winchester'), (6,'mulcahy');
● SELECT GROUP_CONCAT(name ORDER BY seq) path
FROM foo
JOIN people ON (foo.linkid = people.id)
WHERE latch=1 AND origid=1 AND destid=6 \G
path: pearce,hunnicut,potter,mulcahy
- 16. In brief: OQGRAPH Mk. II
● Behaviour similar to MEMORY engine:
○ Table-level locking for normal tables
○ No locking for temporary tables
○ No persistence
○ No transactions
● Insert performance O(N log N)
This means...
○ It’s usable for menus & more, up to say a (few) million edges.
○ Inserts get very slow when there are a lot of edges.
○ You can use the --init-file option to copy/load on startup.
- 17. First Look: OQGRAPH Mk. III
Features:
● Similar core graph implementation.
● Uses existing tables as a source for edge data.
● Does not impose any strict structure on the donor table.
● Efficient Judy sparse bitmaps for node traversal data.
Notes:
● Tables are read-only and only read from the backing table.
● Table must be in same schema as the backing table.
● Current implementation is not of release quality yet.
● But it works!
- 18. Tree of Life, with Mk.III
Load the tol.sql schema,
Create tol_link backing store table,
create table tol_link (
source int unsigned not null,
target int unsigned not null,
primary key (source, target),
key (target) ) engine=innodb;
Populate it with all the edges we need:
INSERT INTO tol_link (source,target)
SELECT parent,id FROM tol WHERE parent IS NOT NULL
UNION ALL
SELECT id,parent FROM tol WHERE parent IS NOT NULL;
Query OK, 178102 rows affected (14.66 sec)
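The UNION ALL stores every parent/child link twice, once in each direction, so the tree can be walked both up and down. A scaled-down sketch of the same statement, run against a hypothetical three-row stand-in for the tol table with Python's sqlite3 module (SQLite rather than InnoDB, so the column types are simplified):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical three-row stand-in for the tol table.
    CREATE TABLE tol (id INTEGER PRIMARY KEY, parent INTEGER, name TEXT);
    INSERT INTO tol VALUES (1, NULL, 'Life on Earth'),
                           (2, 1, 'Eukaryotes'),
                           (3, 2, 'Unikonts');
    CREATE TABLE tol_link (
        source INTEGER NOT NULL,
        target INTEGER NOT NULL,
        PRIMARY KEY (source, target)
    );
    -- Same statement as on the slide: one edge down, one edge back up.
    INSERT INTO tol_link (source, target)
        SELECT parent, id FROM tol WHERE parent IS NOT NULL
        UNION ALL
        SELECT id, parent FROM tol WHERE parent IS NOT NULL;
""")
links = conn.execute(
    "SELECT source, target FROM tol_link ORDER BY source, target").fetchall()
print(links)  # [(1, 2), (2, 1), (2, 3), (3, 2)]
```

Two parent/child relationships produce four directed edges; the root row contributes none because its parent is NULL.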
Direct download: http://bazaar.launchpad.net/~openquery-core/oqgraph/trunk/view/head:/examples/tree-of-life/tol.sql
- 19. Tree of Life, cont.
Creating the OQGRAPH MkIII table:
CREATE TABLE tol_tree (
latch SMALLINT UNSIGNED NULL,
origid BIGINT UNSIGNED NULL,
destid BIGINT UNSIGNED NULL,
weight DOUBLE NULL,
seq BIGINT UNSIGNED NULL,
linkid BIGINT UNSIGNED NULL,
KEY (latch, origid, destid) USING HASH,
KEY (latch, destid, origid) USING HASH
) ENGINE=OQGRAPH
data_table='tol_link' origid='source' destid='target';
select count(*) from tol_tree\G
count(*): 178102
- 20. Tree of Life - finding H.Sapiens
SELECT GROUP_CONCAT(name ORDER BY seq
SEPARATOR ' -> ') AS path
FROM tol_tree JOIN tol ON (linkid=id)
WHERE latch=1 AND origid=1 AND destid=16421 \G
path: Life on Earth -> Eukaryotes -> Unikonts ->
Opisthokonts -> Animals -> Bilateria -> Deuterostomia ->
Chordata -> Craniata -> Vertebrata -> Gnathostomata ->
Teleostomi -> Osteichthyes -> Sarcopterygii -> Terrestrial
Vertebrates -> Tetrapoda -> Reptiliomorpha -> Amniota ->
Synapsida -> Eupelycosauria -> Sphenacodontia ->
Sphenacodontoidea -> Therapsida -> Theriodontia ->
Cynodontia -> Mammalia -> Eutheria -> Primates ->
Catarrhini -> Hominidae -> Homo -> Homo sapiens
1 row in set (2.13 sec)
- 21. We want your feedback!!!1one!
● Very easy to use...
But do feel free to ask us for help/advice.
● OpenQuery created friendlist_graph for Drupal 6.
○ Addition to the existing friendlist module.
○ Enables easy social networking in Drupal.
○ Peter Lieverdink (@cafuego) did this in about 30 minutes
● We would like to know how you are using OQGRAPH!
○ You could be doing something really cool...
- 22. Links and support
● Binaries & Packages
○ http://mariadb.com (MariaDB 5.2 & above) < easiest to begin
○ http://ourdelta.org (MySQL 5.0)
● Source collaboration
○ http://launchpad.net/maria (in /storage/oqgraph)
○ http://launchpad.net/oqgraph
○ Development Mk3 source is currently at https://code.launchpad.net/~atcurtis/ourdelta/oqgraph-v3
● Info, Docs, Support, Licensing, Engineering
○ http://openquery.com/graph
○ This presentation: http://goo.gl/UrybZ
Thank you!
Antony Curtis & Arjen Lentz
graph@openquery.com