17th International Conference Information Visualization
Graph-based Relational Data Visualization
Daniel Mário de Lima <danielm@icmc.usp.br>
José Fernando Rodrigues Jr. <junio@icmc.usp.br>
Agma Juci Machado Traina <agma@icmc.usp.br>
Instituto de Ciências Matemáticas e de Computação
Universidade de São Paulo
15, 16, 17 and 18 July 2013
SOAS, University of London ● London ● UK
pdf at: www.icmc.usp.br/pessoas/junio
pdf at http://www.icmc.usp.br/~junio/PublishedPapers/Lima-et_al_IV-2013.pdf
Outline
1. Introduction 
2. Method 
3. Experiments 
4. Conclusions
1. Introduction
• Large datasets are common
  • unstructured: text
  • semi-structured: XML, RDF, sensor data
  • structured: relational (DBMS), network (graph-like)
• Analysis Process (iterative)
  • Data Representation / Transformation
  • Storage / Retrieval
  • Statistics
  • Visualization
  • Analysis
• How to spot interesting facts in the relationships of large relational databases?
• How are the entities in the database related to each other?
• How are the entities distributed over the relations of the database?
• How do the several attributes of the database influence the relationships of the entities?
• How do we quickly and intuitively browse the relational database, considering its complex structure?
Our approach
• Use graph representation 
• Graph-partitioning techniques 
• Graph-processing 
• Interactive Visualization 
Database → Graph → Partitioning → Visualization → Analysis
2. Method
DB m×m Graph Partitioning GraphTree Visualization Analysis
Relationships as Graphs
Author Publish Work
Author: (Alice, A), (Bob, B), (Charles, C), …
Publish: (A, 1), (B, 2), (C, 3), (A, 2), …
Work: (1, Optic Fiber), (2, Networks), (3, Cryptography), …
Graph nodes: 1, 2, 3 and A, B, C
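As a minimal sketch of how the Author, Publish and Work rows could become a graph, the Python below turns each Publish row into an edge between an author node and a work node. The function name `publish_to_graph`, the node tags `"author"` and `"work"`, and the reading of each table's columns are assumptions made for illustration; the slides do not show the authors' actual implementation.

```python
from collections import defaultdict

# Toy rows copied from the slide; the column roles are an assumption.
author = [("Alice", "A"), ("Bob", "B"), ("Charles", "C")]
publish = [("A", 1), ("B", 2), ("C", 3), ("A", 2)]
work = [(1, "Optic Fiber"), (2, "Networks"), (3, "Cryptography")]


def publish_to_graph(author_rows, work_rows, publish_rows):
    """Build an undirected bipartite adjacency map from the relationship table."""
    author_keys = {key for _, key in author_rows}
    work_keys = {key for key, _ in work_rows}
    adjacency = defaultdict(set)
    for author_key, work_key in publish_rows:
        # Skip rows that reference an entity missing from either table.
        if author_key in author_keys and work_key in work_keys:
            adjacency[("author", author_key)].add(("work", work_key))
            adjacency[("work", work_key)].add(("author", author_key))
    return dict(adjacency)


graph = publish_to_graph(author, work, publish)
```

Tagging each node with its entity type keeps the two sides of the bipartite graph apart even if an author key and a work key happen to collide.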
Graph Partitioning
Hierarchical Partitioning
• cut 0 splits the graph into subgraph 1 and subgraph 2
• cut 1 splits subgraph 1 into subgraph 1-1 and subgraph 1-2
• cut 2 splits subgraph 2 into subgraph 2-1 and subgraph 2-2
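The recursive cuts can be sketched in Python as below. This is an illustration under stated assumptions: `bisect` is a placeholder that simply halves a sorted node list, standing in for whatever graph-partitioning technique is actually used (the slides do not name it), and the names `partition_tree`, `label` and `depth` are hypothetical.

```python
def bisect(nodes):
    """Placeholder partitioner (assumption): split the sorted node list in half."""
    ordered = sorted(nodes)
    middle = len(ordered) // 2
    return ordered[:middle], ordered[middle:]


def partition_tree(nodes, edges, label="root", depth=2):
    """Recursively bisect `nodes`, recording the edges cut at each level."""
    if depth == 0 or len(nodes) < 2:
        return {"label": label, "nodes": sorted(nodes), "cut": [], "children": []}
    left, right = bisect(nodes)
    left_set, right_set = set(left), set(right)
    # Edges with one endpoint on each side form the cut at this level.
    cut = [(u, v) for u, v in edges
           if (u in left_set and v in right_set) or (u in right_set and v in left_set)]
    children = []
    for index, part in enumerate((left, right), start=1):
        part_set = set(part)
        inner = [(u, v) for u, v in edges if u in part_set and v in part_set]
        child_label = str(index) if label == "root" else f"{label}-{index}"
        children.append(partition_tree(part, inner, child_label, depth - 1))
    return {"label": label, "nodes": sorted(nodes), "cut": cut, "children": children}


# A path graph on eight nodes as toy input.
path_edges = [(i, i + 1) for i in range(1, 8)]
tree = partition_tree(range(1, 9), path_edges)
```

With two levels of recursion the leaves carry the labels 1-1, 1-2, 2-1 and 2-2, matching the subgraph names on the slide.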
SuperGraph
• subgraphs 1-1, 1-2, 2-1 and 2-2 become SuperNodes 1-1, 1-2, 2-1 and 2-2
• cut 1 becomes SuperEdge 1, between SuperNode 1-1 and SuperNode 1-2
• cut 2 becomes SuperEdge 2, between SuperNode 2-1 and SuperNode 2-2
• one level up, subgraph 1 and subgraph 2 become SuperNode 1 and SuperNode 2, and cut 0 becomes SuperEdge 0
• Further details in the paper
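One way to picture a SuperEdge is as the set of graph edges that cross between two SuperNodes. The Python sketch below counts those crossing edges; treating the count as the SuperEdge weight, the function name `super_edges` and the toy `membership` map are assumptions for illustration, and the paper may define SuperEdges differently.

```python
from collections import Counter


def super_edges(edges, membership):
    """Aggregate graph edges into weighted SuperEdges between distinct SuperNodes.

    `membership` maps each node to the label of the SuperNode containing it.
    """
    weights = Counter()
    for u, v in edges:
        a, b = membership[u], membership[v]
        if a != b:  # edges inside a single SuperNode stay hidden within it
            weights[tuple(sorted((a, b)))] += 1
    return weights


# Hypothetical leaf-level assignment of eight nodes to four SuperNodes.
membership = {1: "1-1", 2: "1-1", 3: "1-2", 4: "1-2",
              5: "2-1", 6: "2-1", 7: "2-2", 8: "2-2"}
edges = [(1, 2), (2, 3), (1, 4), (4, 5), (6, 7), (5, 8)]

leaf_level = super_edges(edges, membership)
# One level up, a node belongs to the SuperNode named by the label prefix.
top_membership = {node: label.split("-")[0] for node, label in membership.items()}
top_level = super_edges(edges, top_membership)
```

This flat aggregation also reports the edge between 1-2 and 2-1; in the hierarchy on the slides that edge belongs to cut 0, which is why it reappears as the single SuperEdge at the top level.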
Attribute-based Partitioning
Left relation: Paper = {idPaper, country, year, title}
Right relation: Author = {idAuthor, age, dept, authorName}
• Paper (P) is partitioned by local (country): US, BR
• then by year; range labels shown: ’00-’06, ’06-’11, ’95+, ’02+, ’06+, *
• Author (A) is partitioned by age: <40, >40
• then by dept; labels shown: IME, EESC, ICMC, FFLCH, *
• Connectivity SuperEdges between partitions
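Partitioning a relation by a sequence of attributes can be sketched as a nested grouping, one tree level per attribute. In the Python below only the attribute names (age, dept, authorName, idAuthor) come from the slide; the rows, the function name `partition_by_attributes` and the choice to place age 40 in the ">40" range are assumptions.

```python
from collections import defaultdict


def partition_by_attributes(rows, key_functions):
    """Split rows into a nested dict, one level per key function, in order."""
    if not key_functions:
        return rows
    first, rest = key_functions[0], key_functions[1:]
    groups = defaultdict(list)
    for row in rows:
        groups[first(row)].append(row)
    return {label: partition_by_attributes(group, rest)
            for label, group in groups.items()}


# Hypothetical Author rows for illustration only.
authors = [
    {"idAuthor": 1, "age": 35, "dept": "ICMC", "authorName": "Alice"},
    {"idAuthor": 2, "age": 52, "dept": "IME", "authorName": "Bob"},
    {"idAuthor": 3, "age": 61, "dept": "ICMC", "authorName": "Charles"},
]


def age_range(row):
    # Assumption: the boundary value 40 falls in the ">40" range.
    return "<40" if row["age"] < 40 else ">40"


def dept(row):
    return row["dept"]


tree = partition_by_attributes(authors, [age_range, dept])
```

Changing the order of the key functions changes the shape of the tree, which is the "In which order?" question raised among the initial parameters.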
R-Mine Prototype
• Based on the GMine System
• Test platform with minimalistic design
• SuperNode tree:
  • node-link, radial layout, partial focus
• SuperEdge graphs:
  • node-link, bipartite layout, edge filtering
• Leaf SuperNode graphs: typical node-link
3. Experiments
Tycho USP database
• Data from several USP systems 
• Personnel, Supervisions, Publications, Events… 
• Using 5 entities and 5 relationships 
• 350k events 
• 380k examinations 
• 691k publications 
• 50k people 
• 26k supervisions 
• 1.5 million nodes total 
• 1.8 million edges (relationships) 
Q1: active authors
• Which group of People (by age) has the
largest number of recent publications?
SQL: 
SELECT a.age, count(*) num 
FROM PersonPublication x 
JOIN Publication p ON p.id = x.publication 
AND p.year >= 2008 
JOIN Person a ON a.id = x.author 
GROUP BY a.age ORDER BY num DESC 
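To try the Q1 query without access to the Tycho USP database, the sketch below runs it against a tiny in-memory SQLite database from Python. SQLite stands in for the PostgreSQL server used in the experiments, the column types are guessed from the queries on these slides, and every row is invented for illustration.

```python
import sqlite3

connection = sqlite3.connect(":memory:")
# Hypothetical schema and rows; only the table and column names come from the slides.
connection.executescript("""
    CREATE TABLE Person (id INTEGER PRIMARY KEY, name TEXT, age INTEGER);
    CREATE TABLE Publication (id INTEGER PRIMARY KEY, title TEXT,
                              year INTEGER, country TEXT);
    CREATE TABLE PersonPublication (author INTEGER, publication INTEGER);
    INSERT INTO Person VALUES (1, 'Alice', 58), (2, 'Bob', 58), (3, 'Charles', 34);
    INSERT INTO Publication VALUES
        (10, 'Optic Fiber', 2009, 'BR'),
        (11, 'Networks', 2010, 'US'),
        (12, 'Cryptography', 2005, 'US');
    INSERT INTO PersonPublication VALUES (1, 10), (2, 11), (1, 11), (3, 12), (3, 10);
""")

# The Q1 query: publications since 2008, counted per author age.
rows = connection.execute("""
    SELECT a.age, count(*) AS num
    FROM PersonPublication x
    JOIN Publication p ON p.id = x.publication AND p.year >= 2008
    JOIN Person a ON a.id = x.author
    GROUP BY a.age ORDER BY num DESC
""").fetchall()
connection.close()
```

On this toy data the 2005 publication is filtered out, so age 58 counts three authorships and age 34 counts one.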
Q1.b: active authors
• Who are they?
• SQL: SELECT a.name, p.title 
FROM PersonPublication x 
JOIN Publication p ON p.id = x.publication 
AND p.year >= 2008 
JOIN Person a ON a.id = x.author 
WHERE a.age IN 
(SELECT age FROM 
(SELECT a.age age, count(*) num 
FROM PersonPublication x 
JOIN Publication p ON p.id = x.publication 
AND p.year >= 2008 
JOIN Person a ON a.id = x.author 
GROUP BY a.age ORDER BY num DESC) T) 
Q2: favorite countries
• Which country receives the largest number of recent 
publications from this group of people? 
• SQL: SELECT a.name, p.title 
FROM PersonPublication x 
JOIN Publication p ON p.id = x.publication 
AND p.year >= 2008 
JOIN Person a ON a.id = x.author 
AND a.age BETWEEN 56 AND 63 
WHERE p.country IN 
(SELECT country FROM 
(SELECT p.country country, count(*) num 
FROM PersonPublication x 
JOIN Publication p ON p.id = x.publication 
AND p.year >= 2008 
JOIN Person a ON a.id = x.author 
AND a.age BETWEEN 56 AND 63 
GROUP BY p.country ORDER BY num DESC) T) 
Q3: active authors per country
• Now in one specific country, which group of People is the
most active recently?
• SQL: SELECT a.name, p.title 
FROM PersonPublication x 
JOIN Publication p ON p.id = x.publication 
AND p.year >= 2008 
AND p.country = 'Estados Unidos' 
JOIN Person a ON a.id = x.author 
WHERE a.age IN 
(SELECT age FROM 
(SELECT a.age age, count(*) num 
FROM PersonPublication x 
JOIN Publication p ON p.id = x.publication 
AND p.year >= 2008 
AND p.country = 'Estados Unidos' 
JOIN Person a ON a.id = x.author 
GROUP BY a.age ORDER BY num DESC) T) 
Performance: individual queries
150 analytical questions: PostgreSQL × R-Mine 
Performance: accumulated time
150 analytical questions: PostgreSQL × R-Mine 
Performance: loading time
SuperNode          | Load (s) | Connectivity to all siblings (s) | SQL (s)
(initial loading)  | 6.032    | -                                | -
Person             | 0.057    | 5.847                            | 7.349
Event              | 0.271    | 5.276                            | 26.716
Publication        | 0.160    | 4.484                            | 27.677
Total              | 6.520    | 15.607                           | 61.742
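As a quick arithmetic check, the Python below re-adds each column of the loading-time figures and compares the sums with the reported Total row. The numbers are copied from the slide; the variable names are illustrative, and `Decimal` is used only to avoid floating-point rounding noise.

```python
from decimal import Decimal

# (SuperNode, load, connectivity to all siblings, SQL) in seconds; None marks "-".
rows = [
    ("(initial loading)", "6.032", None, None),
    ("Person", "0.057", "5.847", "7.349"),
    ("Event", "0.271", "5.276", "26.716"),
    ("Publication", "0.160", "4.484", "27.677"),
]
reported_total = ("6.520", "15.607", "61.742")


def column_total(index):
    """Sum one numeric column exactly, skipping missing cells."""
    values = (Decimal(row[index]) for row in rows if row[index] is not None)
    return sum(values, Decimal("0"))


totals = [column_total(index) for index in (1, 2, 3)]
matches = [total == Decimal(reported)
           for total, reported in zip(totals, reported_total)]
```

All three column sums agree with the Total row as printed.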
4. Conclusions
Our approach
• Can use the Relational information
  • To guide the partitioning
  • To give an initial context to the analyst
• Faster than running SQL queries
• Makes neighborhood exploration easy
• Interactive Visualization environment
Considerations
• Initial parameters 
• Which entities, relationships and attributes? 
• In which order? 
• How to define partitions? Ranges? 
• How many partitions? 
• Different interaction tasks 
• Ongoing usability evaluation 
Thanks

Más contenido relacionado

Destacado

SuperGraph visualization
SuperGraph visualizationSuperGraph visualization
SuperGraph visualization
Universidade de São Paulo
 
6 7-metodologia depesquisaemcienciadacomputacao-escritadeartigocientifico-plagio
6 7-metodologia depesquisaemcienciadacomputacao-escritadeartigocientifico-plagio6 7-metodologia depesquisaemcienciadacomputacao-escritadeartigocientifico-plagio
6 7-metodologia depesquisaemcienciadacomputacao-escritadeartigocientifico-plagio
Universidade de São Paulo
 
On the Support of a Similarity-Enabled Relational Database Management System ...
On the Support of a Similarity-Enabled Relational Database Management System ...On the Support of a Similarity-Enabled Relational Database Management System ...
On the Support of a Similarity-Enabled Relational Database Management System ...
Universidade de São Paulo
 
Techniques for effective and efficient fire detection from social media images
Techniques for effective and efficient fire detection from social media imagesTechniques for effective and efficient fire detection from social media images
Techniques for effective and efficient fire detection from social media images
Universidade de São Paulo
 
Multimodal graph-based analysis over the DBLP repository: critical discoverie...
Multimodal graph-based analysis over the DBLP repository: critical discoverie...Multimodal graph-based analysis over the DBLP repository: critical discoverie...
Multimodal graph-based analysis over the DBLP repository: critical discoverie...
Universidade de São Paulo
 
Fosdem 2011 - A Common Graph Database Access Layer for .Net and Mono
Fosdem 2011 - A Common Graph Database Access Layer for .Net and MonoFosdem 2011 - A Common Graph Database Access Layer for .Net and Mono
Fosdem 2011 - A Common Graph Database Access Layer for .Net and Mono
Achim Friedland
 
Vertex Centric Asynchronous Belief Propagation Algorithm for Large-Scale Graphs
Vertex Centric Asynchronous Belief Propagation Algorithm for Large-Scale GraphsVertex Centric Asynchronous Belief Propagation Algorithm for Large-Scale Graphs
Vertex Centric Asynchronous Belief Propagation Algorithm for Large-Scale Graphs
Universidade de São Paulo
 

Destacado (20)

SuperGraph visualization
SuperGraph visualizationSuperGraph visualization
SuperGraph visualization
 
6 7-metodologia depesquisaemcienciadacomputacao-escritadeartigocientifico-plagio
6 7-metodologia depesquisaemcienciadacomputacao-escritadeartigocientifico-plagio6 7-metodologia depesquisaemcienciadacomputacao-escritadeartigocientifico-plagio
6 7-metodologia depesquisaemcienciadacomputacao-escritadeartigocientifico-plagio
 
Reviewing Data Visualization: an Analytical Taxonomical Study
Reviewing Data Visualization: an Analytical Taxonomical StudyReviewing Data Visualization: an Analytical Taxonomical Study
Reviewing Data Visualization: an Analytical Taxonomical Study
 
Apresentacao vldb
Apresentacao vldbApresentacao vldb
Apresentacao vldb
 
On the Support of a Similarity-Enabled Relational Database Management System ...
On the Support of a Similarity-Enabled Relational Database Management System ...On the Support of a Similarity-Enabled Relational Database Management System ...
On the Support of a Similarity-Enabled Relational Database Management System ...
 
StructMatrix: large-scale visualization of graphs by means of structure detec...
StructMatrix: large-scale visualization of graphs by means of structure detec...StructMatrix: large-scale visualization of graphs by means of structure detec...
StructMatrix: large-scale visualization of graphs by means of structure detec...
 
Supervised-Learning Link Recommendation in the DBLP co-authoring network
Supervised-Learning Link Recommendation in the DBLP co-authoring networkSupervised-Learning Link Recommendation in the DBLP co-authoring network
Supervised-Learning Link Recommendation in the DBLP co-authoring network
 
Techniques for effective and efficient fire detection from social media images
Techniques for effective and efficient fire detection from social media imagesTechniques for effective and efficient fire detection from social media images
Techniques for effective and efficient fire detection from social media images
 
Multimodal graph-based analysis over the DBLP repository: critical discoverie...
Multimodal graph-based analysis over the DBLP repository: critical discoverie...Multimodal graph-based analysis over the DBLP repository: critical discoverie...
Multimodal graph-based analysis over the DBLP repository: critical discoverie...
 
Fire Detection on Unconstrained Videos Using Color-Aware Spatial Modeling and...
Fire Detection on Unconstrained Videos Using Color-Aware Spatial Modeling and...Fire Detection on Unconstrained Videos Using Color-Aware Spatial Modeling and...
Fire Detection on Unconstrained Videos Using Color-Aware Spatial Modeling and...
 
Tutorial for RDF Graphs
Tutorial for RDF GraphsTutorial for RDF Graphs
Tutorial for RDF Graphs
 
Saveface - Save your Facebook content as RDF data
Saveface - Save your Facebook content as RDF dataSaveface - Save your Facebook content as RDF data
Saveface - Save your Facebook content as RDF data
 
Fosdem 2011 - A Common Graph Database Access Layer for .Net and Mono
Fosdem 2011 - A Common Graph Database Access Layer for .Net and MonoFosdem 2011 - A Common Graph Database Access Layer for .Net and Mono
Fosdem 2011 - A Common Graph Database Access Layer for .Net and Mono
 
Vertex Centric Asynchronous Belief Propagation Algorithm for Large-Scale Graphs
Vertex Centric Asynchronous Belief Propagation Algorithm for Large-Scale GraphsVertex Centric Asynchronous Belief Propagation Algorithm for Large-Scale Graphs
Vertex Centric Asynchronous Belief Propagation Algorithm for Large-Scale Graphs
 
Fast Billion-scale Graph Computation Using a Bimodal Block Processing Model
Fast Billion-scale Graph Computation Using a Bimodal Block Processing ModelFast Billion-scale Graph Computation Using a Bimodal Block Processing Model
Fast Billion-scale Graph Computation Using a Bimodal Block Processing Model
 
Two graph data models : RDF and Property Graphs
Two graph data models : RDF and Property GraphsTwo graph data models : RDF and Property Graphs
Two graph data models : RDF and Property Graphs
 
RDF2Vec: RDF Graph Embeddings for Data Mining
RDF2Vec: RDF Graph Embeddings for Data MiningRDF2Vec: RDF Graph Embeddings for Data Mining
RDF2Vec: RDF Graph Embeddings for Data Mining
 
Dawarehouse e OLAP
Dawarehouse e OLAPDawarehouse e OLAP
Dawarehouse e OLAP
 
Complexidade de Algoritmos, Notação assintótica, Algoritmos polinomiais e in...
Complexidade de Algoritmos, Notação assintótica, Algoritmos polinomiais e in...Complexidade de Algoritmos, Notação assintótica, Algoritmos polinomiais e in...
Complexidade de Algoritmos, Notação assintótica, Algoritmos polinomiais e in...
 
SPARQL Cheat Sheet
SPARQL Cheat SheetSPARQL Cheat Sheet
SPARQL Cheat Sheet
 

Más de Universidade de São Paulo

Metric s plat - a platform for quick development testing and visualization of...
Metric s plat - a platform for quick development testing and visualization of...Metric s plat - a platform for quick development testing and visualization of...
Metric s plat - a platform for quick development testing and visualization of...
Universidade de São Paulo
 
Hierarchical visual filtering pragmatic and epistemic actions for database vi...
Hierarchical visual filtering pragmatic and epistemic actions for database vi...Hierarchical visual filtering pragmatic and epistemic actions for database vi...
Hierarchical visual filtering pragmatic and epistemic actions for database vi...
Universidade de São Paulo
 

Más de Universidade de São Paulo (11)

A gentle introduction to Deep Learning
A gentle introduction to Deep LearningA gentle introduction to Deep Learning
A gentle introduction to Deep Learning
 
Computação: carreira e mercado de trabalho
Computação: carreira e mercado de trabalhoComputação: carreira e mercado de trabalho
Computação: carreira e mercado de trabalho
 
Introdução às ferramentas de Business Intelligence do ecossistema Hadoop
Introdução às ferramentas de Business Intelligence do ecossistema HadoopIntrodução às ferramentas de Business Intelligence do ecossistema Hadoop
Introdução às ferramentas de Business Intelligence do ecossistema Hadoop
 
Metric s plat - a platform for quick development testing and visualization of...
Metric s plat - a platform for quick development testing and visualization of...Metric s plat - a platform for quick development testing and visualization of...
Metric s plat - a platform for quick development testing and visualization of...
 
Hierarchical visual filtering pragmatic and epistemic actions for database vi...
Hierarchical visual filtering pragmatic and epistemic actions for database vi...Hierarchical visual filtering pragmatic and epistemic actions for database vi...
Hierarchical visual filtering pragmatic and epistemic actions for database vi...
 
Java generics-basics
Java generics-basicsJava generics-basics
Java generics-basics
 
Java collections-basic
Java collections-basicJava collections-basic
Java collections-basic
 
Java network-sockets-etc
Java network-sockets-etcJava network-sockets-etc
Java network-sockets-etc
 
Java streams
Java streamsJava streams
Java streams
 
Infovis tutorial
Infovis tutorialInfovis tutorial
Infovis tutorial
 
Java platform
Java platformJava platform
Java platform
 

Graph-based Relational Data Visualization

  • 1. 17th International Conference Information Visualization GGrraapphh--bbaasseedd RReellaattiioonnaall DDaattaa VViissuuaalliizzaattiioonn DDaanniieell MMáárriioo pdf at: www.icmc.usp.br/pessoas/junio ddee LLiimmaa JJoosséé FFeerrnnaannddoo RRooddrriigguueess JJrr.. AAggmmaa JJuuccii MMaacchhaaddoo TTrraaiinnaa <<ddaanniieellmm@@iiccmmcc.. uusspp..bbrr>> <<jjuunniioo@@iiccmmcc..uusspp..bbrr>> <<aaggmmaa@@iiccmmcc..uusspp..bbrr>> Instituto de Ciências Matemáticas e de Computação Universidade de São Paulo 15, 16, 17 and 18 July 2013 SOAS, University of London ● London ● UK pdf at http://www.icmc.usp.br/~junio/PublishedPapers/Lima-et_al_IV-2013.pdf
  • 2. pdf at: www.icmc.usp.br/pessoas/junio OOuuttlliinnee 1. Introduction 2. Method 3. Experiments 4. Conclusions
  • 3. 11.. IInnttrroodduuccttiioonn pdf at: www.icmc.usp.br/pessoas/junio
  • 4. IInnttrroodduuccttiioonn • Large datasets are common • unstructured: text • semi-structured: XML, RDF, sensor data • structured: relational (DBMS), network (graph-like) • Analysis Process • Data Representation / Transformation • Storage / Retrieval • Statistics • Visualization • Analysis pdf at: www.icmc.usp.br/pessoas/junio Iterate
  • 5. IInnttrroodduuccttiioonn • How to spot interesting facts in the relationships of large relational databases? • How are the entities on the database related to each other? • How are the entities distributed over the relations of the database? • How do the several attributes of the database influence the relationships of the entities? • How do we quickly and intuitively browse the relational database, considering its complex structure? pdf at: www.icmc.usp.br/pessoas/junio
  • 6. OOuurr aapppprrooaacchh • Use graph representation • Graph-partitioning techniques • Graph-processing • Interactive Visualization Database  Graph  Partitioning  Visualization  Analysis pdf at: www.icmc.usp.br/pessoas/junio
  • 7. 22.. MMeetthhoodd pdf at: www.icmc.usp.br/pessoas/junio
  • 8. DDBB m m××mm GGrraapphh PPaarrttititioionniningg GGrraapphhTTrreeee VVisisuuaalilzizaattioionn AAnnaalylyssisis RReellaattiioonnsshhiippss aass GGrraapphhss Author Publish Work pdf at: www.icmc.usp.br/pessoas/junio
  • 9. DDBB m m××mm GGrraapphh PPaarrttititioionniningg GGrraapphhTTrreeee VVisisuuaalilzizaattioionn AAnnaalylyssisis RReellaattiioonnsshhiippss aass GGrraapphhss Author Publish Work pdf at: www.icmc.usp.br/pessoas/junio Alice A Bob B Charles C … A 1 B 2 C 3 A 2 … 1 Optic Fiber 2 Networks 3 Cryptography … 11 22 33 AA BB CC
  • 10. DDBB m m××mm GGrraapphh PPaarrttititioionniningg GGrraapphhTTrreeee VVisisuuaalilzizaattioionn AAnnaalylyssisis RReellaattiioonnsshhiippss aass GGrraapphhss Author Publish Work pdf at: www.icmc.usp.br/pessoas/junio Alice A Bob B Charles C … A 1 B 2 C 3 A 2 … 1 Optic Fiber 2 Networks 3 Cryptography … 11 22 33 AA BB CC
  • 11. DDBB m m××mm GGrraapphh PPaarrttititioionniningg GGrraapphhTTrreeee VVisisuuaalilzizaattioionn AAnnaalylyssisis RReellaattiioonnsshhiippss aass GGrraapphhss Author Publish Work pdf at: www.icmc.usp.br/pessoas/junio Alice A Bob B Charles C … A 1 B 2 C 3 A 2 … 1 Optic Fiber 2 Networks 3 Cryptography … 11 22 33 AA BB CC
  • 12. DDBB m m××mm GGrraapphh PPaarrttititioionniningg GGrraapphhTTrreeee VVisisuuaalilzizaattioionn AAnnaalylyssisis GGrraapphh PPaarrttiittiioonniinngg pdf at: www.icmc.usp.br/pessoas/junio
  • 13. DDBB m m××mm GGrraapphh PPaarrttititioionniningg GGrraapphhTTrreeee VVisisuuaalilzizaattioionn AAnnaalylyssisis GGrraapphh PPaarrttiittiioonniinngg pdf at: www.icmc.usp.br/pessoas/junio
  • 14. DDBB m m××mm GGrraapphh PPaarrttititioionniningg GGrraapphhTTrreeee VVisisuuaalilzizaattioionn AAnnaalylyssisis GGrraapphh PPaarrttiittiioonniinngg pdf at: www.icmc.usp.br/pessoas/junio
  • 15. Hierarchical Partitioning
  • 16. Hierarchical Partitioning: subgraph 1, subgraph 2, cut 0
  • 17. Hierarchical Partitioning: subgraph 1, subgraph 2, cut 0
  • 18. Hierarchical Partitioning: subgraph 1, subgraph 2, cut 0
  • 19. Hierarchical Partitioning: cut 0
  • 20. Hierarchical Partitioning: cut 1, cut 0, cut 2
  • 21. Hierarchical Partitioning: cut 1, cut 0, cut 2; subgraph 1-1, subgraph 1-2, subgraph 2-1, subgraph 2-2
  • 22. SuperGraph: SuperNode 1-1, SuperNode 1-2, subgraph 2-1, subgraph 2-2; cut 1, cut 0, cut 2
  • 23. SuperGraph: SuperNode 1-1, SuperNode 1-2, SuperEdge 1; subgraph 2-1, subgraph 2-2; cut 0, cut 2
  • 24. SuperGraph: SuperNode 2-1, SuperNode 2-2, SuperEdge 2; cut 0
  • 25. SuperGraph: subgraph 1, subgraph 2, cut 0
  • 26. SuperGraph: SuperNode 1, SuperNode 2, cut 0
  • 27. SuperGraph: SuperNode 1, SuperNode 2, SuperEdge 0
  • 28. SuperGraph
  • 29. SuperGraph • Further details in the paper
  • 30. Attribute-based Partitioning. Relations: Paper, Author. Left relation: Paper = {idPaper, country, year, title}. Right relation: Author = {idAuthor, age, dept, authorName}
  • 31. Attribute-based Partitioning. Relations: Paper, Author. Left relation: Paper = {idPaper, country, year, title}. Right relation: Author = {idAuthor, age, dept, authorName}
  • 32. Attribute-based Partitioning. SuperNodes: Paper (P), Author (A). Left relation: Paper = {idPaper, country, year, title}. Right relation: Author = {idAuthor, age, dept, authorName}
  • 33. Attribute-based Partitioning. SuperNodes: Paper (P), Author (A). Partitioning attribute: local. Left relation: Paper = {idPaper, country, year, title}. Right relation: Author = {idAuthor, age, dept, authorName}
  • 34. Attribute-based Partitioning. SuperNodes: Paper (P), Author (A). Partitioning attribute: local (US, BR). Left relation: Paper = {idPaper, country, year, title}. Right relation: Author = {idAuthor, age, dept, authorName}
  • 35. Attribute-based Partitioning. SuperNodes: Paper (P), Author (A). Partitioning attributes: local (US, BR); year. Left relation: Paper = {idPaper, country, year, title}. Right relation: Author = {idAuthor, age, dept, authorName}
  • 36. Attribute-based Partitioning. SuperNodes: Paper (P), Author (A). Partitioning attributes: local (US, BR); year ('00-'06, '06-'11). Left relation: Paper = {idPaper, country, year, title}. Right relation: Author = {idAuthor, age, dept, authorName}
  • 37. Attribute-based Partitioning. SuperNodes: Paper (P), Author (A). Partitioning attributes: local (US, BR); year ('00-'06, '06-'11, '95+, '02+, '06+, *). Left relation: Paper = {idPaper, country, year, title}. Right relation: Author = {idAuthor, age, dept, authorName}
  • 38. Attribute-based Partitioning. SuperNodes: Paper (P), Author (A). Partitioning attributes: local (US, BR); year ('00-'06, '06-'11, '95+, '02+, '06+, *); age. Left relation: Paper = {idPaper, country, year, title}. Right relation: Author = {idAuthor, age, dept, authorName}
  • 39. Attribute-based Partitioning. SuperNodes: Paper (P), Author (A). Partitioning attributes: local (US, BR); year ('00-'06, '06-'11, '95+, '02+, '06+, *); age (<40, >40). Left relation: Paper = {idPaper, country, year, title}. Right relation: Author = {idAuthor, age, dept, authorName}
  • 40. Attribute-based Partitioning. SuperNodes: Paper (P), Author (A). Partitioning attributes: local (US, BR); year ('00-'06, '06-'11, '95+, '02+, '06+, *); age (<40, >40); dept. Left relation: Paper = {idPaper, country, year, title}. Right relation: Author = {idAuthor, age, dept, authorName}
  • 41. Attribute-based Partitioning. SuperNodes: Paper (P), Author (A). Partitioning attributes: local (US, BR); year ('00-'06, '06-'11, '95+, '02+, '06+, *); age (<40, >40); dept (IME, EESC, ICMC, FFLCH, *). Left relation: Paper = {idPaper, country, year, title}. Right relation: Author = {idAuthor, age, dept, authorName}
  • 42. Attribute-based Partitioning. Connectivity SuperEdges. SuperNodes: Paper (P), Author (A). Partitioning attributes: local (US, BR); year ('00-'06, '06-'11, '95+, '02+, '06+, *); age (<40, >40); dept (IME, FFLCH, *). Left relation: Paper = {idPaper, country, year, title}. Right relation: Author = {idAuthor, age, dept, authorName}
  • 43. Attribute-based Partitioning. Connectivity SuperEdges. SuperNodes: Paper (P), Author (A). Partitioning attributes: local (US, BR); year ('95+, '02+, '06+, *); age (<40, >40); dept (IME, FFLCH, *). Left relation: Paper = {idPaper, country, year, title}. Right relation: Author = {idAuthor, age, dept, authorName}
  • 44. R-Mine Prototype • Based on the GMine System • Test platform with minimalistic design • SuperNode tree: node-link, radial layout, partial focus • SuperEdge graphs: node-link, bipartite layout, edge filtering • Leaf SuperNode graphs: typical node-link
  • 45. R-Mine Prototype
  • 46. 3. Experiments
  • 47. Tycho USP database • Data from several USP systems • Personnel, Supervisions, Publications, Events…
  • 48. Tycho USP database • Using 5 entities and 5 relationships • 350k events • 380k examinations • 691k publications • 50k people • 26k supervisions • 1.5 million nodes total • 1.8 million edges (relationships)
  • 49. Q1: active authors • Which group of People (by age) has the largest number of recent publications? • SQL: SELECT a.age, count(*) num FROM PersonPublication x JOIN Publication p ON p.id = x.publication AND p.year >= 2008 JOIN Person a ON a.id = x.author GROUP BY a.age ORDER BY num DESC
  • 50. Q1: active authors
  • 51. Q1.b: active authors • Who are they? • SQL: SELECT a.name, p.title FROM PersonPublication x JOIN Publication p ON p.id = x.publication AND p.year >= 2008 JOIN Person a ON a.id = x.author WHERE a.age IN (SELECT age FROM (SELECT a.age age, count(*) num FROM PersonPublication x JOIN Publication p ON p.id = x.publication AND p.year >= 2008 JOIN Person a ON a.id = x.author GROUP BY a.age ORDER BY num DESC) T)
  • 52. Q1.b: active authors
  • 53. Q2: favorite countries • Which country receives the largest number of recent publications from this group of people? • SQL: SELECT a.name, p.title FROM PersonPublication x JOIN Publication p ON p.id = x.publication AND p.year >= 2008 JOIN Person a ON a.id = x.author AND a.age BETWEEN 56 AND 63 WHERE p.country IN (SELECT country FROM (SELECT p.country country, count(*) num FROM PersonPublication x JOIN Publication p ON p.id = x.publication AND p.year >= 2008 JOIN Person a ON a.id = x.author AND a.age BETWEEN 56 AND 63 GROUP BY p.country ORDER BY num DESC) T)
  • 54. Q2: favorite countries
  • 55. Q2: favorite countries
  • 56. Q3: active authors per country • Now, in one specific country, which group of People is the most active recently? • SQL: SELECT a.name, p.title FROM PersonPublication x JOIN Publication p ON p.id = x.publication AND p.year >= 2008 AND p.country = 'Estados Unidos' JOIN Person a ON a.id = x.author WHERE a.age IN (SELECT age FROM (SELECT a.age age, count(*) num FROM PersonPublication x JOIN Publication p ON p.id = x.publication AND p.year >= 2008 AND p.country = 'Estados Unidos' JOIN Person a ON a.id = x.author GROUP BY a.age ORDER BY num DESC) T)
  • 57. Q3: active authors per country
  • 58. Q3: active authors per country
  • 59. Performance: individual queries. 150 analytical questions: PostgreSQL × R-Mine
  • 60. Performance: accumulated time. 150 analytical questions: PostgreSQL × R-Mine
  • 61. Performance: loading time
        SuperNode           Load (seconds)   Connectivity to all siblings (seconds)   SQL (seconds)
        (initial loading)   6.032            -                                        -
        Person              0.057            5.847                                    7.349
        Event               0.271            5.276                                    26.716
        Publication         0.160            4.484                                    27.677
        Total               6.520            15.607                                   61.742
  • 62. 4. Conclusions
  • 63. Our approach • Can use the Relational information • To guide the partitioning • To give an initial context to the analyst • Faster than running SQL queries • Make neighborhood exploration easy • Interactive Visualization environment
  • 64. Considerations • Initial parameters • Which entities, relationships and attributes? • In which order? • How to define partitions? Ranges? • How many partitions? • Different interaction tasks • Ongoing usability evaluation
  • 65. Thanks
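
The "Relationships as Graphs" slides (9 to 11) can be illustrated with a minimal Python sketch. It uses only the toy rows shown on those slides; the function name `relations_to_graph` and the `("Author", key)` / `("Work", key)` node tags are assumptions made for this illustration and are not taken from the paper or from R-Mine.

```python
# Toy rows copied from the "Relationships as Graphs" slides.
authors = {"A": "Alice", "B": "Bob", "C": "Charles"}
works = {1: "Optic Fiber", 2: "Networks", 3: "Cryptography"}
publish = [("A", 1), ("B", 2), ("C", 3), ("A", 2)]  # relationship table


def relations_to_graph(left, right, relationship):
    """Turn two entity relations and a many-to-many table into a graph.

    Hypothetical helper: every entity row becomes a node, tagged with its
    entity name so keys from different relations cannot collide, and every
    relationship row becomes one edge between the two related nodes.
    """
    nodes = {("Author", key): name for key, name in left.items()}
    nodes.update({("Work", key): title for key, title in right.items()})
    edges = [(("Author", a), ("Work", w)) for a, w in relationship]
    return nodes, edges


nodes, edges = relations_to_graph(authors, works, publish)
for src, dst in edges:
    print(nodes[src], "->", nodes[dst])
```

A larger schema could presumably be handled the same way, by adding one node set per entity and one edge list per relationship table.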

Editor's notes

  1. Greetings
  2. This presentation is divided into four parts: an introduction to the problem, our proposed method, experimental demonstrations, and conclusions.
  3. With the current state of Information and Communication Technology, a growing number of large datasets is available, such as (dataset examples). To obtain knowledge and information from these data, the analytical process goes through some steps (in the slide) over several iterations, until the analyst deems the result appropriate.
  4. Given the widespread use of relational databases, and their increasing size and complexity, the challenge is to discover useful things about them (relationships, distribution, influence) in a quick and uncomplicated way.
  5. So, our method attacks this challenge by using graph representations along with graph techniques and interactive visualization, following the pipeline below.
  6. Starting with a many-to-many relationship between two entities,
  7. Each entity is a relation with several rows, each made up of several attributes, and a relationship table (a third relation) holds the primary keys of the related objects.
  8. Naturally, the rows of the entities can be represented as the nodes of a graph.
  9. And the relationship rows hold the edges linking the nodes: A, Alice to 1, Optic Fiber; B, Bob to 2, Networks; A, Alice to 2, Networks (the same work as Bob's); and so on… Larger schemata with more relations can be transformed into just one big graph.
  10. At the end, the database will be transformed into a graph, and graph partitioning techniques can be used.
  11. Partitioning is an operation that divides the graph into disjoint sets of nodes using the minimum-cut technique, which chooses the split with the smallest number of edges between the subsets.
  12. These are the edges of the minimum cut.
  13. And after separating them from the graph…
  14. …the result is the separate subgraphs (1 and 2), and the cut-set of edges (cut 0). The cut-set contains the edges between the nodes of these subgraphs.
  15. And the whole process can be repeated recursively…
  16. … by selecting edges,
  17. … splitting the subgraphs,
  18. … and resulting in more subgraphs and cut-sets, in a hierarchy. This hierarchy is stored in the GraphTree structure, formally defined by a SuperGraph.
  19. In the SuperGraph, each subgraph resulting from the partitioning corresponds to a SuperNode,
  20. and each cut-set forms a SuperEdge between them,
  21. … so this process is applied to all subgraphs,
  22. recursively, because it is possible to obtain the initial unpartitioned parent subgraph by reuniting the smaller subgraphs and cut-sets.
  23. Here, each group linked by SuperEdges will form a new SuperNode.
  24. And the cut-set 0 will form the SuperEdge between them.
  25. Resulting in the SuperGraph structure with all the information from the original unpartitioned graph.
  26. So, after we organize the partitioned graph in a SuperGraph, we can store it in a disk-based data structure called the GraphTree. The definition and further details are in the paper.
  27. But how do we define a partitioning? We start from a schema of interest, here the base case: a many-to-many relationship.
  28. We start by representing the database as a graph,
  29. And make an initial partitioning by the entities involved. On the right we can see the hierarchical structure of the corresponding SuperGraph.
  30. For the next partitioning, we select an attribute from one entity: the place of publication (labeled "local" on the slide).
  31. And split the subgraph according to the categories of this attribute (like a GROUP BY query). In this case, the Papers are split into papers published in Brazil and papers published in the United States.
  32. And we go on, selecting another attribute, the year of publication.
  33. In this case, dividing the US papers into two categories, one from 2000 to 2006 and the other from 2006 to 2011.
  34. And the Brazilian node into three categories. The asterisk category includes the “other nodes”, nodes that do not belong to any of the categories of interest, but that are still relevant.
  35. Now we select an attribute from the other entity, the age of the authors.
  36. Which splits them into two groups: more than 40 years old and less than 40 years old.
  37. And now by author department
  38. Which splits the previous authors’ partitions, showing the largest categories and giving a hint about how their sizes compare. In this example, we see that in the United States there were more papers published from 2006 to 2011 than from 2000 to 2006
  39. And we can also compute connectivity SuperEdges. They retrieve the edges between any two SuperNodes in the SuperGraph, and therefore allow natural joins between the subgroups on demand. In the example, we can analyze the papers published in Brazil since 2006 by authors less than 40 years old.
  40. Or how many papers were published in the United States by authors of the FFLCH department who are more than 40 years old.
  41. To evaluate this method, we implemented the R-Mine prototype based on the GMine System, a previous tool for general graph visualization with a minimalistic design. The SuperNodes are viewed as a node-link radial layout, with per-entity focusable subgraphs. The SuperEdge and Leaf SuperNode visualizations are typical graph layouts, such as influence, force-directed and bipartite node-link layouts.
  42. R-Mine implements the attribute-based partitioning idea, with hierarchical SuperNodes inside bubbles, and SuperEdges linking them. The SuperEdges summarize the weight of the edges in between. Additionally, each time the user selects another category of the partitioning, the SuperNode is focused and occupies the area of the entity SuperNode, therefore allowing better screen usage. Here: follow images (a) through (d), pinpointing the tree and then the visualization, which have a correspondence.
  43. The Tycho USP database is a database with the academic data from several USP systems. We use a subset of the database with some entities we are interested in.
  44. And some numbers for the relations used in the experiment.
  45. Now we answer some questions with an SQL query and compare it with the interaction in R-Mine. (question) That’s answered by joining the Person-Publication relationship with the Publications published from 2008 onward, and counting the rows grouped by author age.
  46. In the R-Mine system, we first choose the entities and attributes of the database to prepare the SuperGraph. With the SuperGraph ready, we open the Publication and Person SuperNodes, select the SuperNode Publications from 2008 to 2012; and visualize the weight of the SuperEdges to the people SuperNodes (which were partitioned by age). And the group of authors between 56 and 63 years old had the biggest number of publications.
  47. But how do we fetch those authors? By selecting each one contained in the previous GROUP BY result.
  48. In R-Mine, we open the SuperEdge, and the graph of the active authors and their respective recent publications is loaded and visualized.
  49. Now, where did these authors publish their works? What are their favorite countries? The SQL query is similar, but we select a specific group of people, and group by publication country.
  50. And in R-Mine, we open the SuperNode of Publications from 2008 to 2012, revealing the next partitioning by country; and select the person group with ages from 56 to 63. We can see that Brazil, United States, Argentina and England are the preferred countries for those authors.
  51. And again the nodes and edges inside the SuperEdge.
  52. Now reversing the question: within one specific country, who are the most active authors? The SQL query is the same as the first, but with an additional predicate defining the country.
  53. In R-Mine we just select the United States SuperNodes, and by inspecting the SuperEdges to people groups, we can observe that the most active authors in the United States are a different group, now from 42 to 49 years old.
  54. In our experiments, 150 queries like the previous questions were computed in the R-Mine system and in PostgreSQL with the corresponding indexes. The R-Mine System showed better performance in all the cases.
  55. By summing the time of these queries, we observe that a session with an increasing number of questions would be 10-fold more efficient within a SuperGraph-based tool, such as R-Mine.
  56. This table shows the results of another experiment, where we compute connectivity SuperEdges to all the siblings of a given SuperNode, in both R-Mine and PostgreSQL.
  57. So, we saw that our approach can use the Relational information to guide the partitioning, thus giving an initial context to the analyst; it was faster than running SQL queries; and it makes neighborhood exploration easy, because of the visual environment.
  58. In the current prototype, the SuperGraph is built before the visualization, so we need to decide beforehand which relations and attributes are included. Attribute ordering will define the sequence of exploration. Different attribute distributions can result in very unbalanced partitions, but for numerical attributes we can easily group by equally sized ranges. Too many partitions are difficult to see, but too few can be useless. As a general recommendation we follow Miller’s Law, that is, 7 ± 2 elements per task. We would like to include other analytical tasks and visualizations alongside the existing ones. Finally, there is a usability evaluation in progress to improve the visual layout and to compare with other visualization approaches.
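
Notes 11 to 25 describe splitting a graph into two subgraphs plus a cut-set, and reuniting them to recover the parent. The sketch below illustrates only that bookkeeping, under stated assumptions: the node-to-part assignment `part_of` is given by hand rather than computed by a minimum-cut partitioner, and the names `split_by_cut`, `merge` and the `super_edge` dictionary are hypothetical, not the GraphTree structure defined in the paper.

```python
def split_by_cut(edges, part_of):
    """Split an edge list into two subgraphs and the cut-set between them.

    `part_of` maps each node to part 1 or 2. How that assignment is found
    (for example by a minimum-cut partitioner) is outside this sketch.
    """
    subgraphs = {1: [], 2: []}
    cut_set = []
    for u, v in edges:
        if part_of[u] == part_of[v]:
            subgraphs[part_of[u]].append((u, v))  # edge stays inside a part
        else:
            cut_set.append((u, v))  # edge crosses the cut
    return subgraphs, cut_set


def merge(subgraphs, cut_set):
    """Rebuild the parent edge list from both subgraphs and the cut-set."""
    return sorted(subgraphs[1] + subgraphs[2] + cut_set)


# Made-up toy graph: two triangles joined by a single edge.
edges = [("a", "b"), ("b", "c"), ("a", "c"),
         ("c", "d"),
         ("d", "e"), ("e", "f"), ("d", "f")]
part_of = {"a": 1, "b": 1, "c": 1, "d": 2, "e": 2, "f": 2}

subgraphs, cut_set = split_by_cut(edges, part_of)
# Each subgraph plays the role of a SuperNode, the cut-set of a SuperEdge.
super_edge = {"between": (1, 2), "edges": cut_set, "weight": len(cut_set)}
print(super_edge)
```

Applying `split_by_cut` again to each subgraph would give the next level of the hierarchy, in the spirit of the recursion described in the notes.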
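
Notes 31 to 40 describe attribute-based partitioning (splitting a SuperNode by the categories of an attribute, like a GROUP BY) and connectivity SuperEdges (the edges between any two SuperNodes). The following is a rough Python sketch of both ideas. All rows are invented for illustration, and `partition` and `connectivity` are hypothetical helpers rather than R-Mine functions.

```python
from collections import defaultdict

# Invented toy rows that follow the Paper and Author schemas on the slides.
papers = {
    "p1": {"country": "BR", "year": 2007},
    "p2": {"country": "US", "year": 2004},
    "p3": {"country": "BR", "year": 2010},
}
authors = {
    "a1": {"age": 35, "dept": "ICMC"},
    "a2": {"age": 52, "dept": "FFLCH"},
}
wrote = [("a1", "p1"), ("a1", "p3"), ("a2", "p2"), ("a2", "p3")]


def partition(rows, category_of):
    """Group row keys by category, similar to a GROUP BY on one attribute."""
    groups = defaultdict(set)
    for key, row in rows.items():
        groups[category_of(row)].add(key)
    return dict(groups)


def connectivity(edges, left_nodes, right_nodes):
    """Return the edges that link one group of nodes to another."""
    return [(a, p) for a, p in edges if a in left_nodes and p in right_nodes]


papers_by_country = partition(papers, lambda row: row["country"])
# In this sketch an author aged exactly 40 falls into the ">40" group.
authors_by_age = partition(
    authors, lambda row: "<40" if row["age"] < 40 else ">40"
)

# Papers published in Brazil by authors less than 40 years old.
super_edge = connectivity(wrote, authors_by_age["<40"], papers_by_country["BR"])
print(len(super_edge), super_edge)
```

The length of `super_edge` corresponds to the SuperEdge weight mentioned in note 42. Computing it on demand for a chosen pair of groups is what makes it behave like a join between the subgroups.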