2. Content
Motivation and driving considerations for the service
Service architecture and interfaces: overview
- How the user can access the service
E.g. REST, GUI, CLIs
- Service options and attributes
Acceptable Usage Policy (AUP)
Use cases
Documentation/tutorial/information
11/27/2018
3. Motivation 1
Research challenges are getting larger and more complex,
e.g. full-Earth climate simulation, coupled simulations of
multiple organs in the human body, seismic analyses of
earthquakes at continental scale.
Researchers' data and compute demands are rising fast.
Efficient transfer of data to high performance computing
(HPC) workspaces is essential, especially in distributed
computing, where resources are geographically dispersed.
4. Motivation 2
Facilitates transfer of large data collections from EUDAT
storage resources to HPC facilities.
Provides the means to re-ingest computational results back
into the EUDAT infrastructure.
Ingests data sets into EUDAT resources for long-term
preservation.
Offers reliable, efficient, easy-to-use tools to manage data
transfers.
6. GridFTP iRODS-DSI dependencies
• iRODS v4.x deployment and configuration
– Including the Development Tools and Runtime Libraries packages (see http://irods.org/download/)
• Globus GridFTP server (globus-gridftp-server-progs) deployment and configuration
• Software components deployment:
– CMake 2.7 or higher
– libglobus-common-dev (.deb) or globus-common-devel (.rpm)
– libglobus-gridftp-server-dev (.deb) or globus-gridftp-server-devel (.rpm)
– libglobus-gridmap-callout-error-dev (.deb) or globus-gridmap-callout-error-devel (.rpm) (see http://www.ige-project.eu/downloads/software/releases/downloads)
– libcurl4-openssl-dev
It is possible to use the official iRODS and GridFTP server packages without recompiling them.
7. HTTP API – Architecture
[Figure: HTTP API architecture, comprising the Nginx reverse proxy, the Flask server, and the session database]
8. HTTP API – Components
The HTTP API stack is based on three main components:
• HTTP API server (based on Python Flask)
– User authentication and authorization
– Implementation of all interactions with underlying resources
• Sessions database (based on PostgreSQL)
– Stores information about logged-in users
• HTTPS reverse proxy (based on nginx)
– Enables secure connections over SSL, with Let's Encrypt certificates supported by default
9. HTTP API – Dependencies
Fully developed in Python and deployed as Docker containers
• Requirements:
– Python 3.4.3+
– Docker 1.13+
– Docker-compose 1.18+
– Rapydo controller (https://github.com/rapydo/do)
Deployment with Docker ensures maximum reproducibility and
portability of the whole software stack on every supported OS
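As a rough aid for checking the version requirements listed on this slide, the following sketch compares a reported version string against a minimum. The helper names `parse_version` and `meets_minimum` are hypothetical, and the sample strings in the usage note are only illustrative of what the tools typically print.

```python
import re

# Minimum versions taken from the requirements list on this slide.
MINIMUMS = {
    "python": (3, 4, 3),
    "docker": (1, 13),
    "docker-compose": (1, 18),
}


def parse_version(text):
    """Return the first dotted version number found in text as a tuple of ints."""
    match = re.search(r"(\d+(?:\.\d+)+)", text)
    if match is None:
        raise ValueError("no version number found in: %r" % text)
    return tuple(int(part) for part in match.group(1).split("."))


def meets_minimum(text, minimum):
    """Return True if the version found in text is at least the given minimum."""
    return parse_version(text) >= minimum
```

For example, `meets_minimum("Docker version 18.09.0, build 4d60db4", MINIMUMS["docker"])` evaluates to `True`, while a string reporting version 1.12.6 would fail the same check.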
13. HTTP API
• The user authenticates with username/password.
• OAuth2: the HTTP API gets an OAuth2 token from B2ACCESS and provides an API token to the user.
• The HTTP API talks with B2SAFE on behalf of the user, using the OAuth2 token.
• B2SAFE validates the OAuth2 token and gets the user attributes to map the user to a local account.
• Upload: data are streamed from the HTTP client to B2SAFE, without being cached at the HTTP API server.
• Download: data are streamed from B2SAFE to the HTTP client, without being cached at the HTTP API server.
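The idea of streaming data through the HTTP API server without caching it can be illustrated with a generic chunked copy. This is a minimal sketch of the technique, not the HTTP API's actual code: the function name `stream` and the chunk size are assumptions, and in-memory buffers stand in for the HTTP client and B2SAFE.

```python
import io

CHUNK_SIZE = 64 * 1024  # hypothetical chunk size, chosen only for illustration


def stream(source, sink, chunk_size=CHUNK_SIZE):
    """Copy source to sink one chunk at a time and return the bytes copied.

    Only one chunk is held in memory at any moment, so the intermediary
    never stores the whole payload.
    """
    total = 0
    while True:
        chunk = source.read(chunk_size)
        if not chunk:
            break
        sink.write(chunk)
        total += len(chunk)
    return total


# Stand-ins for the incoming request body and the storage back end.
incoming = io.BytesIO(b"x" * 200000)
stored = io.BytesIO()
copied = stream(incoming, stored)
```

The same loop applies in either direction; only the roles of source and sink are swapped between upload and download.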
14. HTTP API – Clients
• HTTP APIs are REST endpoints and can be queried with any HTTP client
• Command line HTTP clients (curl, wget, httpie)
• Any programming language with HTTP libraries
– This lets you integrate requests to the HTTP API into your own software and
automate the interaction with data repositories
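As an illustration of calling a REST endpoint from your own software, the sketch below builds an authenticated request with Python's standard library. The base URL, the endpoint path in the usage note, and the use of a bearer-style `Authorization` header are all assumptions made for this example; check the HTTP API documentation of your instance for the real values.

```python
import urllib.request

BASE_URL = "https://b2stage.example.org"  # hypothetical server address


def build_request(path, token, method="GET", data=None):
    """Build (but do not send) an HTTP request that carries the API token."""
    url = BASE_URL.rstrip("/") + "/" + path.lstrip("/")
    request = urllib.request.Request(url, data=data, method=method)
    # Assumption: the API token is passed as a bearer token.
    request.add_header("Authorization", "Bearer " + token)
    return request
```

A request built with `build_request("/api/some/path", token)` (a placeholder path) can then be sent with `urllib.request.urlopen(request)`, and the same pattern carries over to any other language that has an HTTP library.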
This session does not cover deployment and configuration of iRODS v4.1; see the B2SAFE training material for this. Deployment and configuration of GridFTP is also assumed; note in particular that firewall considerations apply to GridFTP.
You will also need the following software components:
CMake 2.7 or higher
libglobus-common-dev (.deb) or globus-common-devel (.rpm)
libglobus-gridftp-server-dev (.deb) or globus-gridftp-server-devel (.rpm)
libglobus-gridmap-callout-error-dev (.deb) or globus-gridmap-callout-error-devel (.rpm) (see http://www.ige-project.eu/downloads/software/releases/downloads)
libcurl4-openssl-dev
It is important to note that you can use the official iRODS and GridFTP server binaries.
(This is a continuation from the last sentence of the previous slide).
This is better depicted in this figure. The user employs the GridFTP client of their choice, which interacts with the B2STAGE instances on the sites involved in the transfer. Under the hood, B2STAGE is a GridFTP server enriched with the EUDAT Data Storage Interface component. When data arrive at an EUDAT node to be deposited, the B2SAFE service ensures that B2HANDLE generates a PID for each artefact and that the PID is recorded in the EPIC PID Register. The iRODS server also handles any replication required for these artefacts, according to the community policies that apply to the user who initiated the transfer. If the user utilises the EUDAT DSS script, any PIDs generated are returned to them; whether PIDs are generated again depends on the iRODS server configuration and the community agreement.
The situation is similar when the user transfers data into an EUDAT centre.
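Since the user may pick any GridFTP client, one possible choice is `globus-url-copy`. The sketch below only assembles a command line for it; the helper name is hypothetical, the host names and paths in the usage note are placeholders, and the two options used are `-vb` (show transfer performance) and `-p` (number of parallel streams).

```python
def gridftp_copy_command(source_url, destination_url, parallel_streams=4):
    """Return the argument list for a globus-url-copy transfer (not executed here)."""
    if parallel_streams < 1:
        raise ValueError("parallel_streams must be at least 1")
    return [
        "globus-url-copy",
        "-vb",                        # report transfer performance
        "-p", str(parallel_streams),  # number of parallel data streams
        source_url,
        destination_url,
    ]
```

For instance, `gridftp_copy_command("gsiftp://eudat.example.org/zone/home/user/data.tar", "file:///scratch/user/data.tar")` returns a list that can be handed to the `subprocess` module on a machine where the client and valid credentials are available.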