CYVERSE: TRANSFORMING LIFE
SCIENCE RESEARCH VIA
CYBERINFRASTRUCTURE
Matthew Vaughn @mattdotvaughn
Director, Life Sciences Computing, TACC
Co-PI CyVerse, Araport, Jetstream Cloud
9/8/2016 1
OVERVIEW
• WHAT IS CYVERSE?
• HOW IS IT TRANSFORMATIONAL FOR LIFE SCIENCES RESEARCH?
• HOW DOES IT FIT INTO THE BIGGER SCHEME?
• WHAT DIRECTIONS AND CHALLENGES ARE IN ITS FUTURE?
CYVERSE IS A CYBERINFRASTRUCTURE
Vision: Transforming science through data-driven discovery
Mission: To design, develop, deploy, and expand a national cyberinfrastructure for life science research, and to train scientists in its use
SUPPORTED BY THE NSF BIO DIRECTORATE
• Division of Biological Infrastructure
• $100 Million, 10-year investment
• CyVerse resources are
– Freely available to the community
– Intended to spur national and international collaboration for research and education
iPlant 2008: Empowering a New Plant Biology
iPlant 2013: Cyberinfrastructure for Life Science
CyVerse 2016: Transforming Science Through Data-Driven Discovery
DBI-0735191
DBI-1265383
DISRUPTIVE MEASUREMENT TECHNOLOGIES
NOT JUST ONE DATA TSUNAMI BUT THOUSANDS OF THEM
EXPLOSION IN SOFTWARE AND SYSTEMS COMPLEXITY
INCREASED ADOPTION OF COMPUTATIONAL METHODS
RESEARCH TEAMS NEED THIS
• Store, organize, share primary data
• Do basic analysis
• Store, organize, share data products
• Generate and explore hypotheses
• Share analysis code with the scientific public
• Integrate results from new experiments
• Publish data alongside plots, visualizations and analytical tools
BUT END UP DEALING WITH THIS
• Data lifecycle management
• Fine-grained permission management
• Discoverability
• Version control
• Taming promising new analysis codes (usually based on immature technology)
• Paying for storage, cycles, and consulting
• Making their science reproducible
THE CYVERSE APPROACH
CYVERSE PRODUCT MATRIX
Atmosphere: User-provisioned, highly configurable cloud computing environment tailored for sciences
Discovery Environment: Web-accessible analysis workbench and gateway to national HPC infrastructure (XSEDE)
Bisque: Software for managing, analyzing and visualizing high-throughput imaging data
Data Store: Scalable data storage for managing and sharing data across CyVerse’s CI and external data resources
Science APIs: Automation interfaces to connect data and computation for rapid integration
external resources. Also used as a graduate teaching platform.
DNA Subway: Classroom-friendly bioinformatics teaching platform
Powered by CyVerse: Third-party applications built on CyVerse’s foundational services and
Welch et al. 2013
EMPOWER USERS AT ALL LEVELS
Bench Scientist, Bioinformatics Specialist, Computing Professional
Help them avoid data and operations siloes
Science applications
Domain-specific services
Established software and CI
Physical resources
Federated Storage
National CI Virtualization
Job Scheduling
Single Sign-on
Ease of Use
Ease of Re-use
IMPACTS
• 500+ publications
• >2 PB user data stored
• 40k+ registered users
• Millions of compute hours annually
• Hundreds of trainees
CYVERSE IS A HUB IN A RICH & COLLABORATIVE ECOSYSTEM
• Using
• Collaborating
• Contributing
• Supporting
• Inventing
CURRENT INITIATIVES
Enabling Data-Driven Discovery. Providing Advanced Training to Researchers. Removing Barriers to Reproducible Science.
CyVerse Data Commons
Portable Science Lab
Intensive Engagement
CYVERSE DATA COMMONS
Make research data discoverable and reusable. Ensure it ends up stored in its natural repository.
CyVerse Data Store: Publish in place simply by sharing
Staging Area: Curate, format, describe metadata
Data Commons Portal: Published snapshot with DOI and open access
Natural Repositories: Facilitated deposit to NCBI-SRA, GenBank, and more
PORTABLE SCIENCE LAB
Continue adoption of technologies to describe, encapsulate, and share research code and data.
Virtual machines, Linux containers, Web Service APIs, Workflow Standards
Integrated via Interactive, Narrative Notebooks
INTENSIVE ENGAGEMENT
Extended Collaborative Support
Consultation and Support Forums
Hands-on Training and Tutorials
Enhanced Support Tooling
Empower Researchers to Embrace and Extend CyVerse
SUMMARY
• CyVerse is a reference model for cyberinfrastructure that is already being extended to other disciplines
• CyVerse provides a vertically integrated, scalable data-to-discovery cyberinfrastructure that leverages existing federal and state investments to transform life science research
• CyVerse is driving technological and operational innovation via a web of interactions and collaborations with other projects, platforms, and infrastructures.
KEY CHALLENGE - CYVERSE VALUE PROPOSITION
“Are you still going to be around in 3 years?”
“Why did my analysis fail? Don’t you have big computers?”
“Shouldn’t we just go to Amazon Web Services?”
“I don’t want my students spending time learning computing.”
“Why aren’t you working on X?”
DISCUSSION
@mattdotvaughn www.slideshare.net/mattdotvaughn vaughn@tacc.utexas.edu

GUIDELINES ON SIMILAR BIOLOGICS Regulatory Requirements for Marketing Authori...
 
Animal Communication- Auditory and Visual.pptx
Animal Communication- Auditory and Visual.pptxAnimal Communication- Auditory and Visual.pptx
Animal Communication- Auditory and Visual.pptx
 
SAMASTIPUR CALL GIRL 7857803690 LOW PRICE ESCORT SERVICE
SAMASTIPUR CALL GIRL 7857803690  LOW PRICE  ESCORT SERVICESAMASTIPUR CALL GIRL 7857803690  LOW PRICE  ESCORT SERVICE
SAMASTIPUR CALL GIRL 7857803690 LOW PRICE ESCORT SERVICE
 
Disentangling the origin of chemical differences using GHOST
Disentangling the origin of chemical differences using GHOSTDisentangling the origin of chemical differences using GHOST
Disentangling the origin of chemical differences using GHOST
 
CELL -Structural and Functional unit of life.pdf
CELL -Structural and Functional unit of life.pdfCELL -Structural and Functional unit of life.pdf
CELL -Structural and Functional unit of life.pdf
 
Green chemistry and Sustainable development.pptx
Green chemistry  and Sustainable development.pptxGreen chemistry  and Sustainable development.pptx
Green chemistry and Sustainable development.pptx
 
GBSN - Biochemistry (Unit 1)
GBSN - Biochemistry (Unit 1)GBSN - Biochemistry (Unit 1)
GBSN - Biochemistry (Unit 1)
 
Recombinant DNA technology (Immunological screening)
Recombinant DNA technology (Immunological screening)Recombinant DNA technology (Immunological screening)
Recombinant DNA technology (Immunological screening)
 

CYVERSE: TRANSFORMING LIFE SCIENCE RESEARCH VIA CYBERINFRASTRUCTURE

  • 1. CYVERSE: TRANSFORMING LIFE SCIENCE RESEARCH VIA CYBERINFRASTRUCTURE Matthew Vaughn @mattdotvaughn Director, Life Sciences Computing, TACC Co-PI CyVerse, Araport, Jetstream Cloud 9/8/2016 1
  • 2. OVERVIEW • WHAT IS CYVERSE? • HOW IS IT TRANSFORMATIONAL FOR LIFE SCIENCES RESEARCH? • HOW DOES IT FIT INTO THE BIGGER SCHEME? • WHAT DIRECTIONS AND CHALLENGES ARE IN ITS FUTURE?
  • 3. CYVERSE IS A CYBERINFRASTRUCTURE Vision: Transforming science through data-driven discovery Mission: To design, develop, deploy, and expand a national cyberinfrastructure for life science research, and to train scientists in its use
  • 4. SUPPORTED BY THE NSF BIO DIRECTORATE • Division of Biological Infrastructure • $100 Million, 10-year investment • CyVerse resources are – Freely available to the community – Intended to spur national and international collaboration for research and education iPlant 2008: Empowering a New Plant Biology; iPlant 2013: Cyberinfrastructure for Life Science; CyVerse 2016: Transforming Science Through Data-Driven Discovery. DBI-0735191, DBI-1265383
  • 6. NOT JUST ONE DATA TSUNAMI BUT THOUSANDS OF THEM
  • 7. EXPLOSION IN SOFTWARE AND SYSTEMS COMPLEXITY
  • 8. INCREASED ADOPTION OF COMPUTATIONAL METHODS
  • 9. RESEARCH TEAMS NEED THIS • Store, organize, share primary data • Do basic analysis • Store, organize, share data products • Generate and explore hypotheses • Share analysis code with the scientific public • Integrate results from new experiments • Publish data alongside plots, visualizations and analytical tools
  • 10. BUT END UP DEALING WITH THIS • Data lifecycle management • Fine-grained permission management • Discoverability • Version control • Taming promising new analysis codes (usually based on immature technology) • Paying for storage, cycles, and consulting • Making their science reproducible
  • 12. CYVERSE PRODUCT MATRIX Atmosphere: User-provisioned, highly configurable cloud computing environment tailored for sciences. Discovery Environment: Web-accessible analysis workbench and gateway to national HPC infrastructure (XSEDE). Bisque: Software for managing, analyzing and visualizing high throughput imaging data. Data Store: Scalable data storage for managing and sharing data across CyVerse’s CI and external data resources. Science APIs: Automation interfaces to connect data and computation for rapid integration of external resources. Also used as a graduate teaching platform. DNA Subway: Classroom-friendly bioinformatics teaching platform. Powered by CyVerse: Third-party applications built on CyVerse’s foundational services and
  • 13. EMPOWER USERS AT ALL LEVELS: Bench Scientist, Bioinformatics Specialist, Computing Professional (Welch et al. 2013). Help them avoid data and operations silos.
  • 14. Science applications; Domain-specific services; Established software and CI; Physical resources; Federated Storage; National CI; Virtualization; Job Scheduling; Single Sign-on; Ease of Use; Ease of Re-use
  • 15. IMPACTS • 500+ publications • >2 PB user data stored • 40k+ registered users • Millions of compute hours annually • Hundreds of trainees
  • 16. CYVERSE IS A HUB IN A RICH & COLLABORATIVE ECOSYSTEM • Using • Collaborating • Contributing • Supporting • Inventing
  • 17. CURRENT INITIATIVES Enabling Data-Driven Discovery. Providing Advanced Training to Researchers. Removing Barriers to Reproducible Science. CyVerse Data Commons; Portable Science Lab; Intensive Engagement
  • 18. CYVERSE DATA COMMONS Make research data discoverable and reusable. Ensure it ends up stored in its natural repository. CyVerse Data Store: Publish in place simply by sharing. Staging Area: Curate, format, describe metadata. Data Commons Portal: Published snapshot with DOI and open access. Natural Repositories: Facilitated deposit to NCBI-SRA, GenBank, and more.
  • 19. PORTABLE SCIENCE LAB Continue adoption of technologies to describe, encapsulate, and share research code and data. Virtual machines, Linux containers, Web Service APIs, Workflow Standards. Integrated via Interactive, Narrative Notebooks.
  • 20. INTENSIVE ENGAGEMENT Extended Collaborative Support; Consultation and Support Forums; Hands-on Training and Tutorials; Enhanced Support Tooling. Empower Researchers to Embrace and Extend CyVerse.
  • 21. SUMMARY • CyVerse is a reference model for cyberinfrastructure that is already being extended to other disciplines • CyVerse provides a vertically integrated, scalable data-to-discovery cyberinfrastructure that leverages existing federal and state investments to transform life science research • CyVerse is driving technological and operational innovation via a web of interactions and collaborations with other projects, platforms, and infrastructures.
  • 22. KEY CHALLENGE - CYVERSE VALUE PROPOSITION “Are you still going to be around in 3 years?” “Why did my analysis fail? Don’t you have big computers?” “Shouldn’t we just go to Amazon Web Services?” “I don’t want my students spending time learning computing.” “Why aren’t you working on X?”

Editor's notes

  1. WHY?
  2. LATCHING ONTO MOORE’S LAW… MRI, PET, multispectral imaging, laser scanning, LIDAR, X-ray. Everyone’s generating TERASCALE DATA.
  3. Everyone’s generating TERASCALE DATA. There aren’t thousands of locations capable of computing at this scale. Collaborative teams are geographically dispersed. iPlant can HELP.
  4. But now we need flexibility! Jetstream doesn’t solve hardware issues but is aimed at other challenging aspects.
  5. “Create detailed spatial-temporal molecular atlas (RNA, proteins, metabolites) of the developing lung” Here are their high level requirements. Seems familiar, right?
  6. They’re inevitably bogged down in these kinds of details… all while their NEED for computing is outpacing their resources Ah-ha, you say. They should just move to the cloud!