We describe the large-scale data transfer scenario, referencing current and past research teams and their challenges. We demonstrate a web application that uses Globus to perform large-scale data transfers, and walk through the repository that contains the web application’s code.
Gateways 2020 Tutorial - Large Scale Data Transfer with Globus
1. Simplifying Science Gateway Data Management with Globus
Part 2 – Large-scale Data Transfer
October 2020, Gateways 2020
2. Globus’s platform simplifies applications
• Mobile-friendly web app
– Desktops & laptops
– Tablets
– Smartphones
• Platform support
– Web GUI
– Command-line (CLI)
– REST APIs and Python SDK
– JupyterLab notebooks, etc.
3. Why would your gateway need to transfer big files?
• Run an analysis on a community dataset
– Gateway user specifies a type of analysis using a standard dataset, or a slice of a dataset
– Data needs to be moved to the compute server
• Analyze the gateway user’s data
– Data needs to be uploaded from researcher’s computer to the compute server
• Allow gateway user to download simulation results
– Data needs to be downloaded from the compute server to researcher’s computer
• Allow gateway user to submit data to a repository
– Data needs to be transferred to the gateway’s storage from the researcher’s computer or from a compute server
4. Generic Globus application workflow
1. Assemble the necessary credentials
2. Get the right endpoint(s)
3. Make a request
1. Transfer file(s) or folder(s)
2. List contents of a folder
3. Create a folder
4. Delete file(s) or folder(s)
4. (optional) Confirm task completion
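The "Make a request" step for a transfer can be sketched as assembling a single JSON document. The following is a minimal sketch, assuming the document shape used by the Globus Transfer REST API (a "transfer" document holding "transfer_item" entries); the function name, the endpoint UUIDs, the paths, and the submission ID are hypothetical placeholders, and nothing is sent over the network.

```python
import json

# Assumed submission URL for the Globus Transfer REST API; shown for context
# only, this sketch never contacts it.
TRANSFER_SUBMIT_URL = "https://transfer.api.globus.org/v0.10/transfer"


def build_transfer_document(submission_id, source_endpoint,
                            destination_endpoint, items, label=None):
    """Build the JSON body for a transfer request (hypothetical helper).

    items is a list of (source_path, destination_path, recursive) tuples;
    recursive=True means the item is a folder.
    """
    document = {
        "DATA_TYPE": "transfer",
        "submission_id": submission_id,
        "source_endpoint": source_endpoint,
        "destination_endpoint": destination_endpoint,
        "DATA": [
            {
                "DATA_TYPE": "transfer_item",
                "source_path": source_path,
                "destination_path": destination_path,
                "recursive": recursive,
            }
            for source_path, destination_path, recursive in items
        ],
    }
    if label is not None:
        document["label"] = label
    return document


# Placeholder values for illustration only.
example_document = build_transfer_document(
    submission_id="SUBMISSION-ID-PLACEHOLDER",
    source_endpoint="SOURCE-ENDPOINT-UUID",
    destination_endpoint="DESTINATION-ENDPOINT-UUID",
    items=[("/datasets/run42/", "/scratch/run42/", True)],
    label="Gateway transfer sketch",
)
print(json.dumps(example_document, indent=2))
```

In a real gateway this document would be sent with the access token assembled in step 1; the Python SDK wraps the same request so the application does not have to build it by hand.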
5. Things your application doesn’t have to do
• Interact with storage systems or DTNs
• Speak the transfer protocol (FTP, GridFTP, SCP, etc.)
• Keep track of what has and hasn’t been transferred
• Monitor for transfer failures
• Know how many files or the sizes of files
• Know when the transfer finishes (unless it needs to)
7. Generic Globus application workflow
1. Assemble the necessary credentials
2. Get the right endpoint(s)
3. Make a request
1. Transfer file(s) or folder(s)
2. List contents of a folder
3. Create a folder
4. Delete file(s) or folder(s)
4. (optional) Confirm task completion
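The optional last step, confirming task completion, usually comes down to polling the task status until it reaches a final state. The following is a minimal sketch, assuming the task status values ACTIVE, INACTIVE, SUCCEEDED, and FAILED; wait_for_task and get_status are hypothetical names, and the status lookup is passed in as a function so the sketch runs without contacting Globus.

```python
import time

# Assumed final task states; ACTIVE and INACTIVE mean the task is not done.
TERMINAL_STATUSES = {"SUCCEEDED", "FAILED"}


def wait_for_task(get_status, task_id, timeout_s=60.0, interval_s=5.0,
                  sleep=time.sleep, clock=time.monotonic):
    """Poll get_status(task_id) until a final state or the timeout.

    Returns the last status seen, so the caller can tell a finished task
    from one that is still running when the timeout expires.
    """
    deadline = clock() + timeout_s
    while True:
        status = get_status(task_id)
        if status in TERMINAL_STATUSES or clock() >= deadline:
            return status
        sleep(interval_s)


# Stand-in for a real status lookup: reports ACTIVE twice, then SUCCEEDED.
fake_statuses = iter(["ACTIVE", "ACTIVE", "SUCCEEDED"])
final_status = wait_for_task(
    get_status=lambda task_id: next(fake_statuses),
    task_id="TASK-ID-PLACEHOLDER",
    sleep=lambda seconds: None,  # skip real waiting in this demonstration
)
print(final_status)
```

Because Globus manages the transfer itself, a gateway only needs a loop like this when a later step (for example, starting an analysis) depends on the data having arrived.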
8. Single, globally accessible multi-tenant service
[Diagram labels: Server, Storage, Control Channel (two), Data Channel, Data Transfer, Personal or Campus Computer, Lab, Campus, or National-scale Server, Auth API, Transfer API]
• Researcher
– Uses web browser to access the science gateway from anywhere in the world
• Science gateway
– A web application tailored to the researcher’s specific field, discipline, or type of analysis
– Uses Globus Auth API to acquire credentials and Globus Transfer API as a command-and-control interface for interacting with storage and moving data where it needs to be
• Globus transfer service
– Connects to Globus Connect software on storage systems to set up and monitor data transfers between systems
• Globus Connect Server
– Enables data access on shared systems, such as servers and clusters, including creation of guest collections for non-local users
• Globus Connect Personal
– Enables data access on personal systems, such as laptops or desktops, for uploading and downloading to your own systems
9. What needs to be in place for it to work?
• To enable uploads or downloads with the researcher’s personal system, the researcher installs Globus Connect Personal on their system.
– Transfers will use the researcher’s credentials.
• To enable transfers to/from community storage, install Globus Connect Server and create guest collections.
– The storage administrator installs Globus Connect Server and gives you (the gateway operator) access.
– You can create guest collections for your gateway to use.
10. How do we identify the endpoints?
• For fixed endpoints (known to the gateway ahead of time), you can use the web app to display the endpoint UUID.
• For researcher endpoints, your gateway can use the Globus browse endpoint helper page:
https://docs.globus.org/api/helper-pages/browse-endpoint/
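Sending a researcher to the browse endpoint helper page amounts to building a URL that tells the page where to send the selection. The following is a minimal sketch; the helper page base URL and the query parameter names (method, action, cancelurl, folderlimit, filelimit, label) are assumptions that should be checked against the helper page documentation linked above, and the gateway URLs are hypothetical.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Assumption: base URL of the helper page; verify it in the Globus docs.
HELPER_PAGE_URL = "https://app.globus.org/file-manager"


def build_browse_endpoint_url(action_url, cancel_url, label=None,
                              folder_limit=1, file_limit=0):
    """Build a helper-page URL (hypothetical helper, assumed parameters).

    action_url is the gateway route that receives the researcher's chosen
    endpoint and path; cancel_url is where the researcher lands on cancel.
    """
    params = {
        "method": "POST",
        "action": action_url,
        "cancelurl": cancel_url,
        "folderlimit": folder_limit,
        "filelimit": file_limit,
    }
    if label is not None:
        params["label"] = label
    return f"{HELPER_PAGE_URL}?{urlencode(params)}"


# Hypothetical gateway routes, for illustration only.
browse_url = build_browse_endpoint_url(
    action_url="https://gateway.example.org/submit-transfer",
    cancel_url="https://gateway.example.org/browse",
    label="Choose a destination folder",
)
browse_query = parse_qs(urlsplit(browse_url).query)
print(browse_url)
```

The gateway redirects the researcher's browser to this URL; after the researcher picks an endpoint and folder, the helper page sends the selection back to the action URL.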
11. What credentials are required?
• Gateway requests the transfer on the researcher’s behalf
– E.g., upload/download from researcher’s personal endpoint
– Requires researcher credentials (and permissions)
– Researcher must log in to the gateway using Globus and allow the gateway to perform transfers on the researcher’s behalf
– Researcher must be granted permission on the other end of the transfer (e.g., via a guest collection)
• Gateway requests the transfer on its own behalf
– E.g., community-owned data in the gateway’s storage
– Requires gateway credentials (and permissions)
– The request is made using the gateway’s credentials and permissions
– Doesn’t require the researcher to log in using Globus
12. Researcher credentials
• For requests on behalf of the researcher, you’ll need researchers to log in to your application using Globus
• Globus provides a standard OpenID Connect (OIDC) interface
– Make sure your application requests the transfer scope in addition to the defaults:
urn:globus:auth:scope:transfer.api.globus.org:all
– Your application will receive an access token for the researcher, allowing transfer requests on behalf of the researcher
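The login step above starts with a redirect to the Globus Auth authorization page, with the transfer scope added to the default OIDC scopes. The following is a minimal sketch of building that redirect URL for the OAuth2 authorization code flow; the client ID, redirect URI, and function name are hypothetical placeholders, and the default scopes are assumed to be openid, profile, and email.

```python
import secrets
from urllib.parse import parse_qs, urlencode, urlsplit

AUTHORIZE_URL = "https://auth.globus.org/v2/oauth2/authorize"
TRANSFER_SCOPE = "urn:globus:auth:scope:transfer.api.globus.org:all"
DEFAULT_SCOPES = ["openid", "profile", "email"]  # assumed OIDC defaults


def build_login_url(client_id, redirect_uri, state):
    """Build the URL the gateway redirects the researcher to for login."""
    params = {
        "client_id": client_id,
        "redirect_uri": redirect_uri,
        "response_type": "code",
        # Scopes are space-separated; the transfer scope is the addition.
        "scope": " ".join(DEFAULT_SCOPES + [TRANSFER_SCOPE]),
        # Random value the gateway checks on return to reject forged callbacks.
        "state": state,
    }
    return f"{AUTHORIZE_URL}?{urlencode(params)}"


# Placeholder registration values, for illustration only.
login_state = secrets.token_urlsafe(16)
login_url = build_login_url(
    client_id="CLIENT-ID-PLACEHOLDER",
    redirect_uri="https://gateway.example.org/authcallback",
    state=login_state,
)
login_query = parse_qs(urlsplit(login_url).query)
print(login_url)
```

After the researcher consents, Globus Auth redirects back to the redirect URI with a code that the gateway exchanges for the researcher's access token; an OIDC client library or the Globus Python SDK would normally handle both halves of this exchange.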
13. Client (application) credentials
• To use Globus in an application, you need to register it at https://developers.globus.org/
• When you register, you’ll receive a Client ID and a Client Secret.
– These allow your application to use Globus services (as itself)
– Your code can obtain an access token for the Globus Transfer service
– All requests using this access token will be performed as user client-id@clients.auth.globus.org (where client-id is your application’s Client ID)
– You can assign permissions to this ID on Globus endpoints, so the gateway can do things as itself instead of as the logged-in user
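Obtaining an access token as the application itself can be sketched as an OAuth2 client credentials request to Globus Auth. The following is a minimal sketch that only builds the request and never sends it; the Client ID and Client Secret are placeholders, the function names are hypothetical, and passing the credentials with HTTP Basic authentication is an assumption about how the token endpoint is called.

```python
import base64
from urllib.parse import urlencode
from urllib.request import Request

TOKEN_URL = "https://auth.globus.org/v2/oauth2/token"
TRANSFER_SCOPE = "urn:globus:auth:scope:transfer.api.globus.org:all"


def build_client_token_request(client_id, client_secret):
    """Build (but do not send) a client credentials token request."""
    credentials = f"{client_id}:{client_secret}".encode("utf-8")
    basic = base64.b64encode(credentials).decode("ascii")
    body = urlencode({
        "grant_type": "client_credentials",
        "scope": TRANSFER_SCOPE,
    }).encode("ascii")
    return Request(
        TOKEN_URL,
        data=body,
        headers={
            "Authorization": f"Basic {basic}",
            "Content-Type": "application/x-www-form-urlencoded",
        },
        method="POST",
    )


def client_identity_username(client_id):
    """Username under which requests made with the client's token run."""
    return f"{client_id}@clients.auth.globus.org"


# Placeholder credentials; real ones come from the app registration.
token_request = build_client_token_request("CLIENT-ID-PLACEHOLDER", "not-a-real-secret")
gateway_identity = client_identity_username("CLIENT-ID-PLACEHOLDER")
print(token_request.get_method(), token_request.full_url)
print(gateway_identity)
```

The identity username printed at the end is the one to grant permissions to on a guest collection, so the gateway can read or write community data without a researcher logging in.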
The same things you can do in Globus’s GUI can also be done by an application using APIs, SDK, or CLI.
This simplifies applications because Globus manages the transfer: the application only needs to assemble the right credentials and make the request. Globus does the rest.
Examples of “analysis on a community dataset”:
Examples of “analyze user’s data”:
Examples of “download simulation results”:
Examples of “submit data to a repository”:
KEY POINT: This is ALL the application has to do. (Next slide shows what the application DOES NOT have to do.)
Modern Research Data Portal: https://mrdp.globus.org/
Show how you log in using Globus.
Show how you browse data in the portal and SELECT AN ENDPOINT using the Globus browse endpoint helper page.
Show how the transfer is submitted by the portal itself.
(Maybe) show how https://app.globus.org/ also shows the transfer!
KEY POINT: This is ALL the application has to do. (Next slide shows what the application DOES NOT have to do.)