This document describes Amada, a system for storing, indexing, and querying RDF and XML datasets using Amazon Web Services. Amada uses EC2 for virtual computing, DynamoDB and SimpleDB as NoSQL data stores, S3 for raw file storage, and SQS for communication between components. It was developed by researchers at Inria and comprises over 30,000 lines of Java code written against the Java API for Amazon Web Services. The code is available online, and Amada includes modules for file storage, indexing, and query processing of XML and RDF data.
2. Amada
Storing, indexing and querying RDF and XML datasets using Amazon Web Services:
- Elastic Compute Cloud (EC2)
  - Virtual computing environment
- DynamoDB
  - NoSQL datastore
- SimpleDB
  - NoSQL datastore
- Simple Storage Service (S3)
  - Storage web service for raw data
- Amazon Simple Queue Service (SQS)
  - Asynchronous communication between distributed components
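
The role of the queue service can be illustrated with a minimal sketch. It assumes nothing about Amada's actual message format: an in-memory `BlockingQueue` stands in for SQS, and the class name `QueueSketch`, the `STOP` marker, and the file keys are hypothetical. A sender enqueues requests without waiting for them to be handled, and a separate worker consumes them.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// In-memory stand-in for a queue service such as SQS; no AWS calls are made.
class QueueSketch {
    // Hypothetical marker message that tells the worker to stop.
    private static final String STOP = "__STOP__";

    // Enqueues one request per file key and returns the keys the worker handled.
    static List<String> run(List<String> fileKeys) {
        BlockingQueue<String> queue = new LinkedBlockingQueue<>();
        List<String> processed = new ArrayList<>();

        // Worker: plays the part of a component that consumes requests.
        Thread worker = new Thread(() -> {
            try {
                while (true) {
                    String message = queue.take(); // blocks until a message arrives
                    if (STOP.equals(message)) {
                        break;
                    }
                    processed.add(message); // a real worker would index the file here
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        worker.start();

        // Sender: enqueues requests and does not wait for each one to be processed.
        for (String key : fileKeys) {
            queue.add(key);
        }
        queue.add(STOP);

        try {
            worker.join(); // wait so that the worker's results are visible to the caller
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for the worker", e);
        }
        return processed;
    }

    public static void main(String[] args) {
        System.out.println(run(List.of("books.xml", "people.rdf")));
    }
}
```

The sender and the worker share only the queue, which is the property that lets distributed components run at different speeds.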
F. Bugiotti – Amada September, 2014 – 2
3. Amada
Contributors to the project:
- Andrés Aranda Andújar
- Zoi Kaoudi
- Francesca Bugiotti
- Jesus Camacho Rodriguez
- Dario Colazzo
- François Goasdoué
- Ioana Manolescu
- Stamatis Zampetakis
4. Amada
Contributors to the code:
- Andrés Aranda Andújar
- Francesca Bugiotti
- Jesus Camacho Rodriguez
- Zoi Kaoudi
- Ioana Manolescu
- Stamatis Zampetakis
5. Amada
Where to find the code:
- https://scm.gforge.inria.fr/svn/amada/trunk
Language:
- Java
- Java API for Amazon Web Services
Code size:
- 30,600 lines of code
- 255 classes
History:
- ViP2P
- Tree patterns
- Operators
6. Amada
Now:
- APP Deposit
- http://cloak.saclay.inria.fr/research/amada/
7. Amada Architecture
[Figure: Amada architecture. Inputs: XML, RDF. Components: front-end, indexing module, query processor. Services: file storage service, virtual machines, indexing service, queue service. Circled numbers ① to ⑧ label the steps between components.]
8. Amada Modules
File Storage Service
- Input: an XML/RDF file
- Parses the file and stores it in the Simple Storage Service
- Output: the file, stored in the system
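
A minimal sketch of this module's contract, under stated assumptions: an in-memory map stands in for the Simple Storage Service, the class `FileStoreSketch` and the `xml/` and `rdf/` key prefixes are hypothetical, and the parsing step is omitted. It only illustrates the input (a named file) and the output (the file retrievable under a key).

```java
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// In-memory stand-in for a file storage service; no AWS calls are made.
class FileStoreSketch {
    // Object key -> raw file content, playing the part of a storage bucket.
    private final Map<String, byte[]> objects = new HashMap<>();

    // Stores the raw content under a key derived from the file name and returns that key.
    String store(String fileName, String content) {
        if (fileName == null || fileName.isEmpty()) {
            throw new IllegalArgumentException("file name required");
        }
        String key = keyFor(fileName);
        objects.put(key, content.getBytes(StandardCharsets.UTF_8));
        return key;
    }

    // Hypothetical key layout: group objects by data model using a prefix.
    static String keyFor(String fileName) {
        String lower = fileName.toLowerCase(Locale.ROOT);
        String prefix = lower.endsWith(".xml") ? "xml/" : lower.endsWith(".rdf") ? "rdf/" : "other/";
        return prefix + fileName;
    }

    // Returns the stored content, or null when the key is unknown.
    String fetch(String key) {
        byte[] data = objects.get(key);
        return data == null ? null : new String(data, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        FileStoreSketch store = new FileStoreSketch();
        String key = store.store("books.xml", "<books/>");
        System.out.println(key + " -> " + store.fetch(key));
    }
}
```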
9. Amada Modules
Indexing Service
- Input: a parsed XML/RDF file and an indexing strategy
- Creates and stores an index for the specified file
- Output: the index, created and stored in the indexing service
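
As an illustration only, the sketch below builds one very simple kind of index for an XML file: a map from element label to the keys of the documents that contain it. The slides do not say that Amada uses this strategy, so the strategy, the class `LabelIndexSketch`, and the in-memory map that stands in for the indexing service are all assumptions.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;

// In-memory stand-in for an index kept in a key-value store.
class LabelIndexSketch {
    // Element label -> keys of the documents that contain an element with that label.
    final Map<String, Set<String>> index = new TreeMap<>();

    // Parses the XML text and records every element label it contains.
    void indexDocument(String docKey, String xml) {
        Document doc;
        try {
            doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        } catch (Exception e) {
            throw new IllegalArgumentException("not well-formed XML: " + docKey, e);
        }
        // "*" matches every element in the document, including the root.
        NodeList elements = doc.getElementsByTagName("*");
        for (int i = 0; i < elements.getLength(); i++) {
            String label = elements.item(i).getNodeName();
            index.computeIfAbsent(label, k -> new TreeSet<>()).add(docKey);
        }
    }

    public static void main(String[] args) {
        LabelIndexSketch sketch = new LabelIndexSketch();
        sketch.indexDocument("d1", "<book><title>t</title></book>");
        sketch.indexDocument("d2", "<book><author>a</author></book>");
        System.out.println(sketch.index);
    }
}
```

Each map entry corresponds to one key and its values in a key-value store, which is why an index of this shape maps naturally onto a NoSQL datastore.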
10. Amada Modules
Indexing Strategies
- Handled by the indexing module
- Various indexing strategies for XML and RDF datasets
- Some strategies store index data as strings; others compress it using delta compression of binary values
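
The slides do not specify the compression scheme, so the following is a generic sketch of delta compression rather than Amada's encoding. It assumes the values to store are sorted, non-negative integer identifiers; each one is replaced by its gap from the previous value, and each gap is written as a variable-length byte sequence. The class `DeltaCodecSketch` and the sample identifiers are hypothetical.

```java
import java.io.ByteArrayOutputStream;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Generic delta + variable-length byte encoding for sorted, non-negative integers.
class DeltaCodecSketch {
    // Encodes each id as the gap from the previous id, 7 bits per byte, low bits first.
    static byte[] encode(int[] sortedIds) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        int previous = 0;
        for (int id : sortedIds) {
            int gap = id - previous;
            if (gap < 0) {
                throw new IllegalArgumentException("ids must be sorted and non-negative");
            }
            while ((gap & ~0x7F) != 0) {
                out.write((gap & 0x7F) | 0x80); // high bit set: more bytes follow
                gap >>>= 7;
            }
            out.write(gap); // last byte of this gap, high bit clear
            previous = id;
        }
        return out.toByteArray();
    }

    // Reverses encode: rebuilds each gap, then adds it to the running total.
    static int[] decode(byte[] bytes) {
        List<Integer> ids = new ArrayList<>();
        int previous = 0;
        int gap = 0;
        int shift = 0;
        for (byte b : bytes) {
            gap |= (b & 0x7F) << shift;
            if ((b & 0x80) != 0) {
                shift += 7; // continuation byte: keep accumulating
            } else {
                previous += gap;
                ids.add(previous);
                gap = 0;
                shift = 0;
            }
        }
        int[] result = new int[ids.size()];
        for (int i = 0; i < result.length; i++) {
            result[i] = ids.get(i);
        }
        return result;
    }

    public static void main(String[] args) {
        int[] ids = {1000, 1003, 1010, 1500};
        byte[] packed = encode(ids);
        System.out.println(packed.length + " bytes, decoded " + Arrays.toString(decode(packed)));
    }
}
```

Gaps between nearby identifiers are small, so most of them fit in one byte, whereas a fixed-width `int` always takes four.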
11. Amada Modules
Query Processor
- Input: a SPARQL query or a query in an XQuery dialect
- Parses and runs the query, using the available indexes
- Output: the query result
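
One way an index can help a query processor is by narrowing the set of documents that need to be fetched and evaluated. The sketch below assumes a hypothetical index from element label to document keys and a query reduced to the labels it mentions; it is not Amada's query evaluation, and the class `IndexLookupSketch` is illustrative only.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Uses a label -> document-keys index to select candidate documents for a query.
class IndexLookupSketch {
    // Returns the documents that contain every label the query mentions.
    static Set<String> candidates(Map<String, Set<String>> index, List<String> queryLabels) {
        Set<String> result = null;
        for (String label : queryLabels) {
            Set<String> docs = index.getOrDefault(label, Collections.emptySet());
            if (result == null) {
                result = new TreeSet<>(docs); // first label: start from its documents
            } else {
                result.retainAll(docs);       // later labels: keep only the common documents
            }
        }
        return result == null ? new TreeSet<>() : result;
    }

    public static void main(String[] args) {
        Map<String, Set<String>> index = new HashMap<>();
        index.put("book", new TreeSet<>(Arrays.asList("d1", "d2")));
        index.put("title", new TreeSet<>(Arrays.asList("d1")));
        // Only the returned documents would then be fetched and evaluated in full.
        System.out.println(candidates(index, Arrays.asList("book", "title")));
    }
}
```

A lookup of this kind can only rule documents out; the full query still has to be evaluated on the candidates that remain.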