This document describes ExSchema, a tool that discovers schemas from the source code of polyglot persistence applications. It analyzes project structure, update operations, repositories, and annotations to recover the implicit schemas spread across different databases. The tool represents schemas in a uniform metamodel and outputs the results as PDF files and Spring Roo scripts. It is demonstrated on applications using Neo4j, Spring Data Neo4j, MongoDB, HBase, CouchDB, relational databases, and combinations of these. Limitations and future work are discussed. The implementation uses the Eclipse JDT and its AST to analyze Java code.
2. Objective
Discover schemas from the source code of polyglot persistence applications
[Figure: source code connected to a relational DB, a graph DB, a document DB, a column-family DB, and a key-value DB]
3. Why?
Polyglot persistence applications are becoming widespread
But for their development and maintenance, software engineers have to deal with…
- Schema-less datastores
- Non-standard APIs
- Implicit schemas described in the source code
4. How? ExSchema
[Figure: ExSchema architecture]
MetaLayer representation
Analyzers:
- Declarations analyzer
- Updates analyzer
- Repositories analyzer
- Annotations analyzer
Analyze project structure and update operations
Application source code:
- Neo4j API
- MongoDB API
- HBase API
- JPA API
- Spring Data
- CouchDB API
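As a rough illustration of the bottom layer of this architecture, the sketch below guesses which persistence APIs a source file uses by looking at its import statements. It is not ExSchema's implementation: the class name `ApiDetectionSketch`, the method `detectApis`, and the short list of package prefixes are assumptions made for this example, and the prefix list is far from complete (CouchDB is left out, and static imports are ignored).

```java
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

public class ApiDetectionSketch {

    // Hypothetical mapping from package prefix to the API label used on the slide.
    private static final Map<String, String> API_PREFIXES = new LinkedHashMap<>();
    static {
        API_PREFIXES.put("org.springframework.data.", "Spring Data");
        API_PREFIXES.put("org.neo4j.", "Neo4j API");
        API_PREFIXES.put("com.mongodb.", "MongoDB API");
        API_PREFIXES.put("org.apache.hadoop.hbase.", "HBase API");
        API_PREFIXES.put("javax.persistence.", "JPA API");
    }

    // Returns the labels of all known APIs imported by the given source text.
    static Set<String> detectApis(String source) {
        Set<String> found = new LinkedHashSet<>();
        for (String line : source.split("\n")) {
            String trimmed = line.trim();
            if (!trimmed.startsWith("import ")) {
                continue; // only plain import lines are inspected
            }
            String imported = trimmed.substring("import ".length()).trim();
            for (Map.Entry<String, String> entry : API_PREFIXES.entrySet()) {
                if (imported.startsWith(entry.getKey())) {
                    found.add(entry.getValue());
                }
            }
        }
        return found;
    }

    public static void main(String[] args) {
        String source =
              "import org.neo4j.graphdb.Node;\n"
            + "import org.neo4j.graphdb.Relationship;\n"
            + "import java.util.List;\n";
        System.out.println(detectApis(source));
    }
}
```

A result like this could be used to decide which analyzers are worth running on a file, although the slides do not say whether ExSchema works that way.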
5. MetaLayer
[Figure: metamodel with four entities (Set, Struct, Attribute, Relationship) linked by associations with * multiplicities]
Based on: P. Atzeni, F. Bugiotti, and L. Rossi. Uniform access to non-relational database systems: the SOS platform. In CAiSE’12, volume 7328 of LNCS, pages 160–174. Springer, 2012.
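The four MetaLayer entities could be modeled as plain Java classes, as in the minimal sketch below. This is not ExSchema's code: the class layout, the field names, and the containment directions (a Set holding Structs and Relationships, a Struct holding Attributes) are assumptions made for illustration, and the actual metamodel allows more combinations than are shown here. The sample data reuses names from the Neo4j imdb example.

```java
import java.util.ArrayList;
import java.util.List;

public class MetaLayerSketch {

    // A named field, such as a node property or a document key.
    static class Attribute {
        final String name;
        Attribute(String name) { this.name = name; }
    }

    // A group of attributes (assumed here to stand for one kind of entity).
    static class Struct {
        final String name;
        final List<Attribute> attributes = new ArrayList<>();
        Struct(String name) { this.name = name; }
    }

    // A typed link between two structs.
    static class Relationship {
        final String type;
        final Struct start;
        final Struct end;
        Relationship(String type, Struct start, Struct end) {
            this.type = type;
            this.start = start;
            this.end = end;
        }
    }

    // A container for structs and relationships (assumed containment).
    static class Set {
        final String name;
        final List<Struct> structs = new ArrayList<>();
        final List<Relationship> relationships = new ArrayList<>();
        Set(String name) { this.name = name; }
    }

    // Builds a tiny schema using names from the imdb example.
    static Set imdbExample() {
        Set graph = new Set("imdb");
        Struct actor = new Struct("Actor");
        actor.attributes.add(new Attribute("name"));
        Struct movie = new Struct("Movie");
        graph.structs.add(actor);
        graph.structs.add(movie);
        graph.relationships.add(new Relationship("ACTS_IN", actor, movie));
        return graph;
    }

    // Renders the schema as indented text, one entity per line.
    static String describe(Set set) {
        StringBuilder out = new StringBuilder("Set " + set.name + "\n");
        for (Struct s : set.structs) {
            out.append("  Struct ").append(s.name).append("\n");
            for (Attribute a : s.attributes) {
                out.append("    Attribute ").append(a.name).append("\n");
            }
        }
        for (Relationship r : set.relationships) {
            out.append("  Relationship ").append(r.type).append(": ")
               .append(r.start.name).append(" -> ").append(r.end.name).append("\n");
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.print(describe(imdbExample()));
    }
}
```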
6. Results
- PDF file
- Spring Roo scripts (JPA, MongoDB, Neo4j)
7. Demonstration
Neo4j (source: https://github.com/neo4j-examples/imdb):

    import org.neo4j.graphdb.Node;
    import org.neo4j.graphdb.Relationship;

    class ActorImpl implements Actor {
        private static final String NAME_PROPERTY = "name";
        private final Node underlyingNode; // Declaration

        public void setName( final String name ) {
            underlyingNode.setProperty( NAME_PROPERTY, name ); // Update
        }

        public Role createRole( final Actor actor, final Movie movie, final String roleName ) {
            final Node actorNode = ((ActorImpl) actor).getUnderlyingNode();
            final Node movieNode = ((MovieImpl) movie).getUnderlyingNode();
            final Relationship rel = actorNode.createRelationshipTo( movieNode, RelTypes.ACTS_IN );
            …
        }
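To make the Declaration and Update labels concrete, the sketch below extracts attribute names from source text like the snippet above. It is a simplified stand-in: it runs regular expressions over a string, whereas ExSchema analyzes the Java AST. The class name `UpdateScanSketch` and the method `findAttributes` are hypothetical, and only string literals and `static final String` constants are resolved.

```java
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class UpdateScanSketch {

    // Matches declarations such as: static final String NAME_PROPERTY = "name"
    private static final Pattern CONSTANT = Pattern.compile(
        "static\\s+final\\s+String\\s+(\\w+)\\s*=\\s*\"([^\"]*)\"");

    // Matches update calls such as: node.setProperty( KEY, ...
    // Group 1 is the key, either a string literal or an identifier.
    private static final Pattern SET_PROPERTY = Pattern.compile(
        "\\w+\\.setProperty\\(\\s*(\"[^\"]*\"|\\w+)\\s*,");

    // Returns the attribute names written through setProperty calls.
    static Set<String> findAttributes(String source) {
        // First pass: collect string constants so keys can be resolved.
        Map<String, String> constants = new LinkedHashMap<>();
        Matcher c = CONSTANT.matcher(source);
        while (c.find()) {
            constants.put(c.group(1), c.group(2));
        }

        // Second pass: resolve the key of every setProperty call.
        Set<String> attributes = new LinkedHashSet<>();
        Matcher u = SET_PROPERTY.matcher(source);
        while (u.find()) {
            String key = u.group(1);
            if (key.startsWith("\"")) {
                attributes.add(key.substring(1, key.length() - 1));
            } else if (constants.containsKey(key)) {
                attributes.add(constants.get(key));
            }
            // Keys that cannot be resolved are skipped.
        }
        return attributes;
    }

    public static void main(String[] args) {
        String source =
              "private static final String NAME_PROPERTY = \"name\";\n"
            + "underlyingNode.setProperty( NAME_PROPERTY, name );\n";
        System.out.println(findAttributes(source)); // prints [name]
    }
}
```

A key that is passed in as a method parameter or computed at run time is not resolved by this sketch; handling such cases is one reason a real analysis works on the AST rather than on raw text.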
24. Test applications
Neo4j:
- https://github.com/neo4j-examples/cineasts.git
- https://github.com/neo4j-examples/imdb.git
- https://github.com/neo4j-examples/java-astar-routing.git
- https://github.com/neo4j-examples/java-dijkstra.git
- https://github.com/neo4j-examples/java-tree-traverse.git
- MyNetContacts (http://vargas-solar.imag.fr/academika/cloud-data-management/)
MongoDB:
- https://github.com/mongolab/mongodb-driver-examples.git
HBase:
- https://github.com/larsgeorge/hbase-book.git (ch03)
- https://github.com/SpringSource/spring-hadoop-samples.git (original-samples/hbase-crud)
CouchDB:
- https://github.com/mbreese/couchdb4j.git
Relational:
- Indvalid-core (http://www.indvalid.com/)
Relational + MongoDB:
- https://github.com/SpringSource/cloudfoundry-samples.git (cross-store)
- MyNet (http://vargas-solar.imag.fr/academika/cloud-data-management/)
- Indvalid-dao (http://www.indvalid.com/) (industrial application)
Neo4j + MongoDB + Relational:
- twitter-spring
- twitter-polyglot
(Based on: P. Atzeni, F. Bugiotti, and L. Rossi. Uniform access to non-relational database systems: the SOS platform. In CAiSE’12, volume 7328 of LNCS, pages 160–174. Springer, 2012)
25. Limitations
Based on project structure and update operations
(Queries and get operations not considered)
Based on programming styles of test applications
(Heavily relies on local variables)
Limited associations between entities
(Besides Neo4j’s relationships and MongoDB cross-store)
26. Future work
Analysis of queries and get operations
Support additional languages besides Java
Increase support for different programming styles