SlideShare una empresa de Scribd logo
1 de 9
Team Argon
“A Commons Platform for Promoting Continuous FAIRness”
NIH Data Commons Pilot
Globus, University of Chicago
University of Southern California
Contact: Ian Foster, foster@uchicago.edu
PIs: Kyle Chard, Ian Foster, Carl Kesselman, Ravi Madduri
Three big picture themes
• Continuous FAIRness: Make all data findable, accessible,
interoperable, reusable at every stage, via pervasive use of
simple identifier and exchange format conventions
• Build on proven security, data, and computation building
blocks that have large user communities inside and outside
biomedicine (see subsequent slides for details)
• Solutions leverage industry best practices and professional
services team to meet scalability, interoperability,
sustainability, and reliability needs
App
(Client)
Service
(Resource Server)
Service
(Resource Server)
Globus Auth: A foundational service for an
authentication and authorization ecosystem
• A flexible security infrastructure that can be used across the Commons
• Enables federation across services using arbitrary linked identities (e.g., @gmail
@xsede @uchicago)
• Facilitates secure/authorized communication between users, services, clients
• Supports arbitrary clients including REST, web, command line, software
• Flexible token management
• Secure sharing between services
• Fine-grain user consents and revocation
Service
(Resource Server)
Resource
Owner
Resource
server operator
App
(Client)
3
https://docs.globus.org/api/auth/
Standards-based, reliable, performant data management
• Globus Connect Server: S3-compatible
HTTP/OAuth interface for secure
client-server transfer
• Endpoints have DNS names
• Globus Transfer: Managed, high-
performance, secure, reliable bulk
asynchronous transfer
• In-place data sharing with flexible and
secure ACLs
• Standards compliant
• S3, OAuth, OIDC, HTTP, GridFTP
4
https://docs.globus.org/api/transfer/
Interoperability: naming and exchange
Minid
• Lightweight identifiers for any product
at any stage
• Easily created, dereferenced, validated
• Global integrity – validate content
across the commons
BDBag
• Self-describing and flexible format
for exchange
• Extended BagIt Specification
• Standard manifest representation
that supports different protocols
Data
Metadata
File1 2AG230..
File2 A31FDC.. FTP
File3 D0F142.. HTTP
…
Minid 001
Minid 007
Minid 719
http://minid.bd2k.org http://bd2k.ini.usc.edu/tools/bdbag/
Infrastructure
My Workspace
• Workspaces bring together data and tools
• Infrastructure designed for scalability and portability
• Leverages
• Federated identities & access control
• Secure access to distributed data
• Data interoperability, exchange
• Provenance
• Tracking activity around data
• By whom? With what?
• Publication & sharing of tools
and workflows
• Cost aware resource allocation for
both compute and data movement
Workspaces: Scalable compute for distributed data
Data Tools
6
Search, navigation, and virtual cohorts
• DERIVA: Digital asset management for heterogeneous data
• Organize, navigate, discover interrelated objects (e.g., assays from a sample over time)
• REST interface
• Entity/Relation model for organizing data
• Supports various DCPPC metadata models
• Fine grain access control to support diverse
collaboration models
• Model evolution to enable continuous
publication, diverse, heterogeneous use cases
• Model driven user interface that
self-configures to current data model
• Integration with Globus Auth, Minids, BDBags, and other components
• Complements Globus Search: Access-controlled search of derived data products
7
Workspace Manager
Bags Workspaces Pipelines
minid_1 Galaxy GTExRNA
minid_2 Jupyter GATKVar
minid_3 RStudio
UCSC
GTEx
TOPMed
MOD
User catalogs
User catalogs
User catalogs
User catalogs
Search
Analyze Visualize
Publish & Reproduce
Discover
Uniform,
secure,
reliable
access to
storage
Virtual cohorts
in standard
manifest with
lightweight ID
Uniform search
across multiple
data sources
All results tracked via standard
manifest and lightweight IDs
Workspaces support Jupyter and
Galaxy on different clouds
Publication
assigns DOIs
and indexes
datasets
Integrating scenario
Summary: Reusable components include...
• Globus Connect Server for data access, transfer, and sharing
• HTTP/S3 access to many storage systems (Posix, object store, etc.)
• GridFTP for managed, reliable, secure, efficient transfers
• Integration with Globus Auth for authentication and authorization
• Offers: A universal storage API
• Globus Auth for securing all REST API interactions
• OAuth2 and OIDC + fine-grained consents and revocation
• Offers: A universal authentication and authorization API
• BDBag (“big data bags”: profiles on BagIt) tools
• BagIt specification with profiles for “holey bags”, etc.
• Offers: Common manifest for exchange of query results, virtual cohorts
• Identifier service for creating lightweight identifiers
• ARKs, created on demand, associated checksum, simple metadata
• Offers: Common mechanism for naming and tracking derived data products9

Más contenido relacionado

La actualidad más candente

Recent Upgrades to ARM Data Transfer and Delivery Using Globus
Recent Upgrades to ARM Data Transfer and Delivery Using GlobusRecent Upgrades to ARM Data Transfer and Delivery Using Globus
Recent Upgrades to ARM Data Transfer and Delivery Using GlobusGlobus
 
Enabling Secure Data Discoverability (SC21 Tutorial)
Enabling Secure Data Discoverability (SC21 Tutorial)Enabling Secure Data Discoverability (SC21 Tutorial)
Enabling Secure Data Discoverability (SC21 Tutorial)Globus
 
A Data Ecosystem to Support Machine Learning in Materials Science
A Data Ecosystem to Support Machine Learning in Materials ScienceA Data Ecosystem to Support Machine Learning in Materials Science
A Data Ecosystem to Support Machine Learning in Materials ScienceGlobus
 
Gateways 2020 Tutorial - Instrument Data Distribution with Globus
Gateways 2020 Tutorial - Instrument Data Distribution with GlobusGateways 2020 Tutorial - Instrument Data Distribution with Globus
Gateways 2020 Tutorial - Instrument Data Distribution with GlobusGlobus
 
20090701 Climate Data Staging
20090701 Climate Data Staging20090701 Climate Data Staging
20090701 Climate Data StagingHenning Bergmeyer
 
Gateways 2020 Tutorial - Large Scale Data Transfer with Globus
Gateways 2020 Tutorial - Large Scale Data Transfer with GlobusGateways 2020 Tutorial - Large Scale Data Transfer with Globus
Gateways 2020 Tutorial - Large Scale Data Transfer with GlobusGlobus
 
Globus: Enabling the Open Storage Network
Globus: Enabling the Open Storage NetworkGlobus: Enabling the Open Storage Network
Globus: Enabling the Open Storage NetworkGlobus
 
Gateways 2020 Tutorial - Automated Data Ingest and Search with Globus
Gateways 2020 Tutorial - Automated Data Ingest and Search with GlobusGateways 2020 Tutorial - Automated Data Ingest and Search with Globus
Gateways 2020 Tutorial - Automated Data Ingest and Search with GlobusGlobus
 
Automating Research Data Management at Scale with Globus
Automating Research Data Management at Scale with GlobusAutomating Research Data Management at Scale with Globus
Automating Research Data Management at Scale with GlobusGlobus
 
Foundations for the Future of Science
Foundations for the Future of ScienceFoundations for the Future of Science
Foundations for the Future of ScienceGlobus
 
Doing Research in the Cloud - NIH Workshop Dennis Gannon
Doing Research in the Cloud - NIH Workshop Dennis GannonDoing Research in the Cloud - NIH Workshop Dennis Gannon
Doing Research in the Cloud - NIH Workshop Dennis GannonMicrosoft Azure for Research
 
Webinar: Q&A on Globus Subscription Features
Webinar: Q&A on Globus Subscription FeaturesWebinar: Q&A on Globus Subscription Features
Webinar: Q&A on Globus Subscription FeaturesGlobus
 
Gateways 2020 Tutorial - Introduction to Globus
Gateways 2020 Tutorial - Introduction to GlobusGateways 2020 Tutorial - Introduction to Globus
Gateways 2020 Tutorial - Introduction to GlobusGlobus
 
An Approach for RDF-based Semantic Access to NoSQL Repositories
An Approach for RDF-based Semantic Access to NoSQL RepositoriesAn Approach for RDF-based Semantic Access to NoSQL Repositories
An Approach for RDF-based Semantic Access to NoSQL RepositoriesLuiz Henrique Zambom Santana
 
Linked Open Data and DANS
Linked Open Data and DANSLinked Open Data and DANS
Linked Open Data and DANSvty
 
Why Elastic? @ 50th Vinitaly 2016
Why Elastic? @ 50th Vinitaly 2016Why Elastic? @ 50th Vinitaly 2016
Why Elastic? @ 50th Vinitaly 2016Christoph Wurm
 
Delivering a Campus Research Data Service with Globus
Delivering a Campus Research Data Service with GlobusDelivering a Campus Research Data Service with Globus
Delivering a Campus Research Data Service with GlobusIan Foster
 
balloon: LOD forecasting - cloudy with a chance of services
balloon: LOD forecasting - cloudy with a chance of servicesballoon: LOD forecasting - cloudy with a chance of services
balloon: LOD forecasting - cloudy with a chance of servicesKai Schlegel
 
20160922 Materials Data Facility TMS Webinar
20160922 Materials Data Facility TMS Webinar20160922 Materials Data Facility TMS Webinar
20160922 Materials Data Facility TMS WebinarBen Blaiszik
 
balloon Fusion: SPARQL Rewriting Based on Unified Co-Reference Information
balloon Fusion: SPARQL Rewriting Based on  Unified Co-Reference Informationballoon Fusion: SPARQL Rewriting Based on  Unified Co-Reference Information
balloon Fusion: SPARQL Rewriting Based on Unified Co-Reference InformationKai Schlegel
 

La actualidad más candente (20)

Recent Upgrades to ARM Data Transfer and Delivery Using Globus
Recent Upgrades to ARM Data Transfer and Delivery Using GlobusRecent Upgrades to ARM Data Transfer and Delivery Using Globus
Recent Upgrades to ARM Data Transfer and Delivery Using Globus
 
Enabling Secure Data Discoverability (SC21 Tutorial)
Enabling Secure Data Discoverability (SC21 Tutorial)Enabling Secure Data Discoverability (SC21 Tutorial)
Enabling Secure Data Discoverability (SC21 Tutorial)
 
A Data Ecosystem to Support Machine Learning in Materials Science
A Data Ecosystem to Support Machine Learning in Materials ScienceA Data Ecosystem to Support Machine Learning in Materials Science
A Data Ecosystem to Support Machine Learning in Materials Science
 
Gateways 2020 Tutorial - Instrument Data Distribution with Globus
Gateways 2020 Tutorial - Instrument Data Distribution with GlobusGateways 2020 Tutorial - Instrument Data Distribution with Globus
Gateways 2020 Tutorial - Instrument Data Distribution with Globus
 
20090701 Climate Data Staging
20090701 Climate Data Staging20090701 Climate Data Staging
20090701 Climate Data Staging
 
Gateways 2020 Tutorial - Large Scale Data Transfer with Globus
Gateways 2020 Tutorial - Large Scale Data Transfer with GlobusGateways 2020 Tutorial - Large Scale Data Transfer with Globus
Gateways 2020 Tutorial - Large Scale Data Transfer with Globus
 
Globus: Enabling the Open Storage Network
Globus: Enabling the Open Storage NetworkGlobus: Enabling the Open Storage Network
Globus: Enabling the Open Storage Network
 
Gateways 2020 Tutorial - Automated Data Ingest and Search with Globus
Gateways 2020 Tutorial - Automated Data Ingest and Search with GlobusGateways 2020 Tutorial - Automated Data Ingest and Search with Globus
Gateways 2020 Tutorial - Automated Data Ingest and Search with Globus
 
Automating Research Data Management at Scale with Globus
Automating Research Data Management at Scale with GlobusAutomating Research Data Management at Scale with Globus
Automating Research Data Management at Scale with Globus
 
Foundations for the Future of Science
Foundations for the Future of ScienceFoundations for the Future of Science
Foundations for the Future of Science
 
Doing Research in the Cloud - NIH Workshop Dennis Gannon
Doing Research in the Cloud - NIH Workshop Dennis GannonDoing Research in the Cloud - NIH Workshop Dennis Gannon
Doing Research in the Cloud - NIH Workshop Dennis Gannon
 
Webinar: Q&A on Globus Subscription Features
Webinar: Q&A on Globus Subscription FeaturesWebinar: Q&A on Globus Subscription Features
Webinar: Q&A on Globus Subscription Features
 
Gateways 2020 Tutorial - Introduction to Globus
Gateways 2020 Tutorial - Introduction to GlobusGateways 2020 Tutorial - Introduction to Globus
Gateways 2020 Tutorial - Introduction to Globus
 
An Approach for RDF-based Semantic Access to NoSQL Repositories
An Approach for RDF-based Semantic Access to NoSQL RepositoriesAn Approach for RDF-based Semantic Access to NoSQL Repositories
An Approach for RDF-based Semantic Access to NoSQL Repositories
 
Linked Open Data and DANS
Linked Open Data and DANSLinked Open Data and DANS
Linked Open Data and DANS
 
Why Elastic? @ 50th Vinitaly 2016
Why Elastic? @ 50th Vinitaly 2016Why Elastic? @ 50th Vinitaly 2016
Why Elastic? @ 50th Vinitaly 2016
 
Delivering a Campus Research Data Service with Globus
Delivering a Campus Research Data Service with GlobusDelivering a Campus Research Data Service with Globus
Delivering a Campus Research Data Service with Globus
 
balloon: LOD forecasting - cloudy with a chance of services
balloon: LOD forecasting - cloudy with a chance of servicesballoon: LOD forecasting - cloudy with a chance of services
balloon: LOD forecasting - cloudy with a chance of services
 
20160922 Materials Data Facility TMS Webinar
20160922 Materials Data Facility TMS Webinar20160922 Materials Data Facility TMS Webinar
20160922 Materials Data Facility TMS Webinar
 
balloon Fusion: SPARQL Rewriting Based on Unified Co-Reference Information
balloon Fusion: SPARQL Rewriting Based on  Unified Co-Reference Informationballoon Fusion: SPARQL Rewriting Based on  Unified Co-Reference Information
balloon Fusion: SPARQL Rewriting Based on Unified Co-Reference Information
 

Similar a Team Argon Summary

Introduction to Globus for New Users (GlobusWorld Tour - Columbia University)
Introduction to Globus for New Users (GlobusWorld Tour - Columbia University)Introduction to Globus for New Users (GlobusWorld Tour - Columbia University)
Introduction to Globus for New Users (GlobusWorld Tour - Columbia University)Globus
 
Introduction to Globus for New Users
Introduction to Globus for New UsersIntroduction to Globus for New Users
Introduction to Globus for New UsersGlobus
 
Introduction to Globus for New Users (GlobusWorld Tour - UCSD)
Introduction to Globus for New Users (GlobusWorld Tour - UCSD)Introduction to Globus for New Users (GlobusWorld Tour - UCSD)
Introduction to Globus for New Users (GlobusWorld Tour - UCSD)Globus
 
Scalable Data Management: Automation and the Modern Research Data Portal
Scalable Data Management: Automation and the Modern Research Data PortalScalable Data Management: Automation and the Modern Research Data Portal
Scalable Data Management: Automation and the Modern Research Data PortalGlobus
 
What's New With Globus
What's New With GlobusWhat's New With Globus
What's New With GlobusGlobus
 
Managing Protected and Controlled Data with Globus
Managing Protected and Controlled Data with Globus Managing Protected and Controlled Data with Globus
Managing Protected and Controlled Data with Globus Globus
 
Globus: Beyond File Transfer
Globus: Beyond File TransferGlobus: Beyond File Transfer
Globus: Beyond File TransferGlobus
 
Introduction to Globus
Introduction to GlobusIntroduction to Globus
Introduction to GlobusGlobus
 
Globus: A Data Management Platform for Collaborative Research (CHPC 2019 - So...
Globus: A Data Management Platform for Collaborative Research (CHPC 2019 - So...Globus: A Data Management Platform for Collaborative Research (CHPC 2019 - So...
Globus: A Data Management Platform for Collaborative Research (CHPC 2019 - So...Globus
 
Introduction to the Globus SaaS (GlobusWorld Tour - STFC)
Introduction to the Globus SaaS (GlobusWorld Tour - STFC)Introduction to the Globus SaaS (GlobusWorld Tour - STFC)
Introduction to the Globus SaaS (GlobusWorld Tour - STFC)Globus
 
Facilitating Collaboration with Globus (GlobusWorld Tour - STFC)
Facilitating Collaboration with Globus (GlobusWorld Tour - STFC)Facilitating Collaboration with Globus (GlobusWorld Tour - STFC)
Facilitating Collaboration with Globus (GlobusWorld Tour - STFC)Globus
 
Introduction to Globus (GlobusWorld Tour West)
Introduction to Globus (GlobusWorld Tour West)Introduction to Globus (GlobusWorld Tour West)
Introduction to Globus (GlobusWorld Tour West)Globus
 
Introduction to Globus (APS Workshop)
Introduction to Globus (APS Workshop)Introduction to Globus (APS Workshop)
Introduction to Globus (APS Workshop)Globus
 
Globus High Assurance for Protected Data (GlobusWorld Tour - UCSD)
Globus High Assurance for Protected Data (GlobusWorld Tour - UCSD)Globus High Assurance for Protected Data (GlobusWorld Tour - UCSD)
Globus High Assurance for Protected Data (GlobusWorld Tour - UCSD)Globus
 
Introduction to Globus - XSEDE14 Tutorial
Introduction to Globus - XSEDE14 TutorialIntroduction to Globus - XSEDE14 Tutorial
Introduction to Globus - XSEDE14 TutorialGlobus
 
Introduction to Globus (GlobusWorld Tour - UMich)
Introduction to Globus (GlobusWorld Tour - UMich)Introduction to Globus (GlobusWorld Tour - UMich)
Introduction to Globus (GlobusWorld Tour - UMich)Globus
 
GlobusWorld 2021 Tutorial: Introduction to Globus
GlobusWorld 2021 Tutorial: Introduction to GlobusGlobusWorld 2021 Tutorial: Introduction to Globus
GlobusWorld 2021 Tutorial: Introduction to GlobusGlobus
 
KoprowskiT_session1_SDNEvent_WASDforBeginners
KoprowskiT_session1_SDNEvent_WASDforBeginnersKoprowskiT_session1_SDNEvent_WASDforBeginners
KoprowskiT_session1_SDNEvent_WASDforBeginnersTobias Koprowski
 
Vaultize Cloud Architecture - Enterprise File Sync and Share (EFSS)
Vaultize Cloud Architecture - Enterprise File Sync and Share (EFSS)Vaultize Cloud Architecture - Enterprise File Sync and Share (EFSS)
Vaultize Cloud Architecture - Enterprise File Sync and Share (EFSS)Vaultize
 

Similar a Team Argon Summary (20)

Introduction to Globus for New Users (GlobusWorld Tour - Columbia University)
Introduction to Globus for New Users (GlobusWorld Tour - Columbia University)Introduction to Globus for New Users (GlobusWorld Tour - Columbia University)
Introduction to Globus for New Users (GlobusWorld Tour - Columbia University)
 
Introduction to Globus for New Users
Introduction to Globus for New UsersIntroduction to Globus for New Users
Introduction to Globus for New Users
 
Introduction to Globus for New Users (GlobusWorld Tour - UCSD)
Introduction to Globus for New Users (GlobusWorld Tour - UCSD)Introduction to Globus for New Users (GlobusWorld Tour - UCSD)
Introduction to Globus for New Users (GlobusWorld Tour - UCSD)
 
Scalable Data Management: Automation and the Modern Research Data Portal
Scalable Data Management: Automation and the Modern Research Data PortalScalable Data Management: Automation and the Modern Research Data Portal
Scalable Data Management: Automation and the Modern Research Data Portal
 
What's New With Globus
What's New With GlobusWhat's New With Globus
What's New With Globus
 
Managing Protected and Controlled Data with Globus
Managing Protected and Controlled Data with Globus Managing Protected and Controlled Data with Globus
Managing Protected and Controlled Data with Globus
 
Globus: Beyond File Transfer
Globus: Beyond File TransferGlobus: Beyond File Transfer
Globus: Beyond File Transfer
 
Introduction to Globus
Introduction to GlobusIntroduction to Globus
Introduction to Globus
 
Globus: A Data Management Platform for Collaborative Research (CHPC 2019 - So...
Globus: A Data Management Platform for Collaborative Research (CHPC 2019 - So...Globus: A Data Management Platform for Collaborative Research (CHPC 2019 - So...
Globus: A Data Management Platform for Collaborative Research (CHPC 2019 - So...
 
Introduction to the Globus SaaS (GlobusWorld Tour - STFC)
Introduction to the Globus SaaS (GlobusWorld Tour - STFC)Introduction to the Globus SaaS (GlobusWorld Tour - STFC)
Introduction to the Globus SaaS (GlobusWorld Tour - STFC)
 
Facilitating Collaboration with Globus (GlobusWorld Tour - STFC)
Facilitating Collaboration with Globus (GlobusWorld Tour - STFC)Facilitating Collaboration with Globus (GlobusWorld Tour - STFC)
Facilitating Collaboration with Globus (GlobusWorld Tour - STFC)
 
Introduction to Globus (GlobusWorld Tour West)
Introduction to Globus (GlobusWorld Tour West)Introduction to Globus (GlobusWorld Tour West)
Introduction to Globus (GlobusWorld Tour West)
 
Introduction to Globus (APS Workshop)
Introduction to Globus (APS Workshop)Introduction to Globus (APS Workshop)
Introduction to Globus (APS Workshop)
 
Document Management System
Document Management System Document Management System
Document Management System
 
Globus High Assurance for Protected Data (GlobusWorld Tour - UCSD)
Globus High Assurance for Protected Data (GlobusWorld Tour - UCSD)Globus High Assurance for Protected Data (GlobusWorld Tour - UCSD)
Globus High Assurance for Protected Data (GlobusWorld Tour - UCSD)
 
Introduction to Globus - XSEDE14 Tutorial
Introduction to Globus - XSEDE14 TutorialIntroduction to Globus - XSEDE14 Tutorial
Introduction to Globus - XSEDE14 Tutorial
 
Introduction to Globus (GlobusWorld Tour - UMich)
Introduction to Globus (GlobusWorld Tour - UMich)Introduction to Globus (GlobusWorld Tour - UMich)
Introduction to Globus (GlobusWorld Tour - UMich)
 
GlobusWorld 2021 Tutorial: Introduction to Globus
GlobusWorld 2021 Tutorial: Introduction to GlobusGlobusWorld 2021 Tutorial: Introduction to Globus
GlobusWorld 2021 Tutorial: Introduction to Globus
 
KoprowskiT_session1_SDNEvent_WASDforBeginners
KoprowskiT_session1_SDNEvent_WASDforBeginnersKoprowskiT_session1_SDNEvent_WASDforBeginners
KoprowskiT_session1_SDNEvent_WASDforBeginners
 
Vaultize Cloud Architecture - Enterprise File Sync and Share (EFSS)
Vaultize Cloud Architecture - Enterprise File Sync and Share (EFSS)Vaultize Cloud Architecture - Enterprise File Sync and Share (EFSS)
Vaultize Cloud Architecture - Enterprise File Sync and Share (EFSS)
 

Más de Ian Foster

Global Services for Global Science March 2023.pptx
Global Services for Global Science March 2023.pptxGlobal Services for Global Science March 2023.pptx
Global Services for Global Science March 2023.pptxIan Foster
 
The Earth System Grid Federation: Origins, Current State, Evolution
The Earth System Grid Federation: Origins, Current State, EvolutionThe Earth System Grid Federation: Origins, Current State, Evolution
The Earth System Grid Federation: Origins, Current State, EvolutionIan Foster
 
Better Information Faster: Programming the Continuum
Better Information Faster: Programming the ContinuumBetter Information Faster: Programming the Continuum
Better Information Faster: Programming the ContinuumIan Foster
 
ESnet6 and Smart Instruments
ESnet6 and Smart InstrumentsESnet6 and Smart Instruments
ESnet6 and Smart InstrumentsIan Foster
 
Linking Scientific Instruments and Computation
Linking Scientific Instruments and ComputationLinking Scientific Instruments and Computation
Linking Scientific Instruments and ComputationIan Foster
 
A Global Research Data Platform: How Globus Services Enable Scientific Discovery
A Global Research Data Platform: How Globus Services Enable Scientific DiscoveryA Global Research Data Platform: How Globus Services Enable Scientific Discovery
A Global Research Data Platform: How Globus Services Enable Scientific DiscoveryIan Foster
 
Foster CRA March 2022.pptx
Foster CRA March 2022.pptxFoster CRA March 2022.pptx
Foster CRA March 2022.pptxIan Foster
 
Big Data, Big Computing, AI, and Environmental Science
Big Data, Big Computing, AI, and Environmental ScienceBig Data, Big Computing, AI, and Environmental Science
Big Data, Big Computing, AI, and Environmental ScienceIan Foster
 
AI at Scale for Materials and Chemistry
AI at Scale for Materials and ChemistryAI at Scale for Materials and Chemistry
AI at Scale for Materials and ChemistryIan Foster
 
Coding the Continuum
Coding the ContinuumCoding the Continuum
Coding the ContinuumIan Foster
 
Data Tribology: Overcoming Data Friction with Cloud Automation
Data Tribology: Overcoming Data Friction with Cloud AutomationData Tribology: Overcoming Data Friction with Cloud Automation
Data Tribology: Overcoming Data Friction with Cloud AutomationIan Foster
 
Research Automation for Data-Driven Discovery
Research Automation for Data-Driven DiscoveryResearch Automation for Data-Driven Discovery
Research Automation for Data-Driven DiscoveryIan Foster
 
Scaling collaborative data science with Globus and Jupyter
Scaling collaborative data science with Globus and JupyterScaling collaborative data science with Globus and Jupyter
Scaling collaborative data science with Globus and JupyterIan Foster
 
Learning Systems for Science
Learning Systems for ScienceLearning Systems for Science
Learning Systems for ScienceIan Foster
 
Data Automation at Light Sources
Data Automation at Light SourcesData Automation at Light Sources
Data Automation at Light SourcesIan Foster
 
Computing Just What You Need: Online Data Analysis and Reduction at Extreme ...
Computing Just What You Need: Online Data Analysis and Reduction  at Extreme ...Computing Just What You Need: Online Data Analysis and Reduction  at Extreme ...
Computing Just What You Need: Online Data Analysis and Reduction at Extreme ...Ian Foster
 
Going Smart and Deep on Materials at ALCF
Going Smart and Deep on Materials at ALCFGoing Smart and Deep on Materials at ALCF
Going Smart and Deep on Materials at ALCFIan Foster
 
Computing Just What You Need: Online Data Analysis and Reduction at Extreme ...
Computing Just What You Need: Online Data Analysis and Reduction  at Extreme ...Computing Just What You Need: Online Data Analysis and Reduction  at Extreme ...
Computing Just What You Need: Online Data Analysis and Reduction at Extreme ...Ian Foster
 
Software Infrastructure for a National Research Platform
Software Infrastructure for a National Research PlatformSoftware Infrastructure for a National Research Platform
Software Infrastructure for a National Research PlatformIan Foster
 
Accelerating the Experimental Feedback Loop: Data Streams and the Advanced Ph...
Accelerating the Experimental Feedback Loop: Data Streams and the Advanced Ph...Accelerating the Experimental Feedback Loop: Data Streams and the Advanced Ph...
Accelerating the Experimental Feedback Loop: Data Streams and the Advanced Ph...Ian Foster
 

Más de Ian Foster (20)

Global Services for Global Science March 2023.pptx
Global Services for Global Science March 2023.pptxGlobal Services for Global Science March 2023.pptx
Global Services for Global Science March 2023.pptx
 
The Earth System Grid Federation: Origins, Current State, Evolution
The Earth System Grid Federation: Origins, Current State, EvolutionThe Earth System Grid Federation: Origins, Current State, Evolution
The Earth System Grid Federation: Origins, Current State, Evolution
 
Better Information Faster: Programming the Continuum
Better Information Faster: Programming the ContinuumBetter Information Faster: Programming the Continuum
Better Information Faster: Programming the Continuum
 
ESnet6 and Smart Instruments
ESnet6 and Smart InstrumentsESnet6 and Smart Instruments
ESnet6 and Smart Instruments
 
Linking Scientific Instruments and Computation
Linking Scientific Instruments and ComputationLinking Scientific Instruments and Computation
Linking Scientific Instruments and Computation
 
A Global Research Data Platform: How Globus Services Enable Scientific Discovery
A Global Research Data Platform: How Globus Services Enable Scientific DiscoveryA Global Research Data Platform: How Globus Services Enable Scientific Discovery
A Global Research Data Platform: How Globus Services Enable Scientific Discovery
 
Foster CRA March 2022.pptx
Foster CRA March 2022.pptxFoster CRA March 2022.pptx
Foster CRA March 2022.pptx
 
Big Data, Big Computing, AI, and Environmental Science
Big Data, Big Computing, AI, and Environmental ScienceBig Data, Big Computing, AI, and Environmental Science
Big Data, Big Computing, AI, and Environmental Science
 
AI at Scale for Materials and Chemistry
AI at Scale for Materials and ChemistryAI at Scale for Materials and Chemistry
AI at Scale for Materials and Chemistry
 
Coding the Continuum
Coding the ContinuumCoding the Continuum
Coding the Continuum
 
Data Tribology: Overcoming Data Friction with Cloud Automation
Data Tribology: Overcoming Data Friction with Cloud AutomationData Tribology: Overcoming Data Friction with Cloud Automation
Data Tribology: Overcoming Data Friction with Cloud Automation
 
Research Automation for Data-Driven Discovery
Research Automation for Data-Driven DiscoveryResearch Automation for Data-Driven Discovery
Research Automation for Data-Driven Discovery
 
Scaling collaborative data science with Globus and Jupyter
Scaling collaborative data science with Globus and JupyterScaling collaborative data science with Globus and Jupyter
Scaling collaborative data science with Globus and Jupyter
 
Learning Systems for Science
Learning Systems for ScienceLearning Systems for Science
Learning Systems for Science
 
Data Automation at Light Sources
Data Automation at Light SourcesData Automation at Light Sources
Data Automation at Light Sources
 
Computing Just What You Need: Online Data Analysis and Reduction at Extreme ...
Computing Just What You Need: Online Data Analysis and Reduction  at Extreme ...Computing Just What You Need: Online Data Analysis and Reduction  at Extreme ...
Computing Just What You Need: Online Data Analysis and Reduction at Extreme ...
 
Going Smart and Deep on Materials at ALCF
Going Smart and Deep on Materials at ALCFGoing Smart and Deep on Materials at ALCF
Going Smart and Deep on Materials at ALCF
 
Computing Just What You Need: Online Data Analysis and Reduction at Extreme ...
Computing Just What You Need: Online Data Analysis and Reduction  at Extreme ...Computing Just What You Need: Online Data Analysis and Reduction  at Extreme ...
Computing Just What You Need: Online Data Analysis and Reduction at Extreme ...
 
Software Infrastructure for a National Research Platform
Software Infrastructure for a National Research PlatformSoftware Infrastructure for a National Research Platform
Software Infrastructure for a National Research Platform
 
Accelerating the Experimental Feedback Loop: Data Streams and the Advanced Ph...
Accelerating the Experimental Feedback Loop: Data Streams and the Advanced Ph...Accelerating the Experimental Feedback Loop: Data Streams and the Advanced Ph...
Accelerating the Experimental Feedback Loop: Data Streams and the Advanced Ph...
 

Último

From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationSafe Software
 
What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?Antenna Manufacturer Coco
 
Advantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessAdvantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessPixlogix Infotech
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Igalia
 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEarley Information Science
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Servicegiselly40
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptxHampshireHUG
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024The Digital Insurer
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdfhans926745
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024The Digital Insurer
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsJoaquim Jorge
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationRadu Cotescu
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Drew Madelung
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfEnterprise Knowledge
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slidevu2urc
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...Neo4j
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processorsdebabhi2
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...Martijn de Jong
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)Gabriella Davis
 

Último (20)

From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?
 
Advantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessAdvantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your Business
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Service
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 

Team Argon Summary

  • 1. Team Argon “A Commons Platform for Promoting Continuous FAIRness” NIH Data Commons Pilot Globus, University of Chicago University of Southern California Contact: Ian Foster, foster@uchicago.edu PIs: Kyle Chard, Ian Foster, Carl Kesselman, Ravi Madduri
  • 2. Three big picture themes • Continuous FAIRness: Make all data findable, accessible, interoperable, reusable at every stage, via pervasive use of simple identifier and exchange format conventions • Build on proven security, data, and computation building blocks that have large user communities inside and outside biomedicine (see subsequent slides for details) • Solutions leverage industry best practices and professional services team to meet scalability, interoperability, sustainability, and reliability needs
  • 3. App (Client) Service (Resource Server) Service (Resource Server) Globus Auth: A foundational service for an authentication and authorization ecosystem • A flexible security infrastructure that can be used across the Commons • Enables federation across services using arbitrary linked identities (e.g., @gmail @xsede @uchicago) • Facilitates secure/authorized communication between users, services, clients • Supports arbitrary clients including REST, web, command line, software • Flexible token management • Secure sharing between services • Fine-grain user consents and revocation Service (Resource Server) Resource Owner Resource server operator App (Client) 3 https://docs.globus.org/api/auth/
  • 4. Standards-based, reliable, performant data management • Globus Connect Server: S3-compatible HTTP/OAuth interface for secure client-server transfer • Endpoints have DNS names • Globus Transfer: Managed, high- performance, secure, reliable bulk asynchronous transfer • In-place data sharing with flexible and secure ACLs • Standards compliant • S3, OAuth, OIDC, HTTP, GridFTP 4 https://docs.globus.org/api/transfer/
  • 5. Interoperability: naming and exchange Minid • Lightweight identifiers for any product at any stage • Easily created, dereferenced, validated • Global integrity – validate content across the commons BDBag • Self-describing and flexible format for exchange • Extended BagIt Specification • Standard manifest representation that supports different protocols Data Metadata File1 2AG230.. File2 A31FDC.. FTP File3 D0F142.. HTTP … Minid 001 Minid 007 Minid 719 http://minid.bd2k.org http://bd2k.ini.usc.edu/tools/bdbag/
  • 6. Infrastructure My Workspace • Workspaces bring together data and tools • Infrastructure designed for scalability and portability • Leverages • Federated identities & access control • Secure access to distributed data • Data interoperability, exchange • Provenance • Tracking activity around data • By whom? With what? • Publication & sharing of tools and workflows • Cost aware resource allocation for both compute and data movement Workspaces: Scalable compute for distributed data Data Tools 6
  • 7. Search, navigation, and virtual cohorts • DERIVA: Digital asset management for heterogeneous data • Organize, navigate, discover interrelated objects (e.g., assays from a sample over time) • REST interface • Entity/Relation model for organizing data • Supports various DCPPC metadata models • Fine grain access control to support diverse collaboration models • Model evolution to enable continuous publication, diverse, heterogeneous use cases • Model driven user interface that self-configures to current data model • Integration with Globus Auth, Minids, BDBags, and other components • Complements Globus Search: Access-controlled search of derived data products 7
  • 8. Workspace Manager Bags Workspaces Pipelines minid_1 Galaxy GTExRNA minid_2 Jupyter GATKVar minid_3 RStudio UCSC GTEx TOPMed MOD User catalogs User catalogs User catalogs User catalogs Search Analyze Visualize Publish & Reproduce Discover Uniform, secure, reliable access to storage Virtual cohorts in standard manifest with lightweight ID Uniform search across multiple data sources All results tracked via standard manifest and lightweight IDs Workspaces support Jupyter and Galaxy on different clouds Publication assigns DOIs and indexes datasets Integrating scenario
  • 9. Summary: Reusable components include... • Globus Connect Server for data access, transfer, and sharing • HTTP/S3 access to many storage systems (Posix, object store, etc.) • GridFTP for managed, reliable, secure, efficient transfers • Integration with Globus Auth for authentication and authorization • Offers: A universal storage API • Globus Auth for securing all REST API interactions • OAuth2 and OIDC + fine-grained consents and revocation • Offers: A universal authentication and authorization API • BDBag (“big data bags”: profiles on BagIt) tools • BagIt specification with profiles for “holey bags”, etc. • Offers: Common manifest for exchange of query results, virtual cohorts • Identifier service for creating lightweight identifiers • ARKs, created on demand, associated checksum, simple metadata • Offers: Common mechanism for naming and tracking derived data products9