SlideShare a Scribd company logo
1 of 48
Services for science Creating knowledge in the Internet age Ian Foster Computation Institute Argonne National Lab & University of Chicago
Ā 
Knowledge generation in astronomy ~1600 30 years ? years 10 years 6 years 2 years
Astronomy from 1600  to 2010 Automation 10 -1  ļƒ   10 8  Hz data capture Community 10 0  ļƒ   10 4 astronomers (10 6  amateur) Computation Data 10 6  ļƒ   10 15  B aggregate 10 -1  ļƒ   10 15  Hz peak Literature 10 1  ļƒ   10 5 pages/year
Knowledge generation in medicine ~1600
Biomedical research ~2010 ... atcgaattccaggcgtcacattctcaattcca... MPMILGYWDIRGLAHAIRLLLEYTDSSYEEKKYT... Protein-Protein Interactions metabolism pathways receptor-ligand 4Āŗ structure Polymorphism and Variants genetic variants individual patients epidemiology Physiology Cellular biology Biochemistry Neurobiology Endocrinology etc. >10 6 ESTs  Expression patterns Large-scale screens Genetics and Maps Linkage Cytogenetic  Clone-based From John Wooley >10 6 >10 9 >10 6 >10 5 >10 9 DNA sequences alignments Proteins sequence 2Āŗ structure 3Āŗ structure
More data does not always mean more knowledge Folker Meyer, Genome Sequencing vs. Mooreā€™s Law: Cyber Challenges for the Next Decade,  CTWatch , August 2006.
Knowledge generation  as a systems problem ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]
An incomplete list of process steps ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],Data Artisanal Industrial Data Analyses Models Experiments Literature
SOA as an integrating framework? ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],ā€œ Service-Oriented Scienceā€,  Science , 2005 and
[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],1070 molecular bio databases  Nucleic Acids Research  Jan 2008 (96 in Jan 2001) Slide: Carole Goble
The cancer Biomedical Informatics Grid Globus
As of Sept 18, 2008: 122 participants 81   services 62   data 19 analytical
As of  Oct 19 , 2008: 122 participants 105   services 70   data 35  analytical
Automating the routine Location A Microarray, Protein,  Image data Location B Microarray, Protein,  Image data Location C Microarray, Protein,  Image data Location C Image Analysis Location D Image Analysis Microarray and protein  databases at other institutions  Different database systems, data representations, security Different program invocation, remote access, data transfer
Automating the routine Location A Microarray, Protein,  Image data Location B Microarray, Protein,  Image data Location C Microarray, Protein,  Image data Location C Image Analysis Location D Image Analysis caGrid Service  Interfaces caGrid Environ-ment Registered Object Definitions Advertise-ment Log on, Grid credentials Query and Analysis Workflow Discovery Microarray & protein  databases at other  institutions  Globus
Location A Microarray, Protein,  Image data Location B Microarray, Protein,  Image data Location C Microarray, Protein,  Image data Location C Image Analysis Location D Image Analysis caGrid Service  Interfaces caGrid Environ-ment Registered Object Definitions Advertise-ment Log on, Grid credentials Query and Analysis Workflow Discovery Microarray & protein  databases at other  institutions  Service  authoring Metadata services Service registries Security services Quality control Queries, workflows caGrid Compute resources Globus
Lifecycle issues caGrid Discovery   Composition   Execution   Analysis   Community   reuse generate
Metadata Services
Taverna Trident Kepler BPEL Ptolemy II
Microarray clustering (Taverna)* ,[object Object],[object Object],[object Object],Workflow in/output caGrid services ā€œ Shimā€ services others *Wei Tan, Ravi Madduri, Kiran Keshav, Baris E. Suzek, Scott Oster, Ian Foster.  Orchestrating caGrid Services in Taverna.   ICWS 08. Wei Tan
Execution trace Execution  result  as XML 1936  gene  expressions
Workflows as communication ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],Slide: Carole Goble
Reproducible science means ā€”  context ā€”   trust  ā€”  easy access to methods
Workflows are another form of scholarly outcome to publish, curate and cite and archive along with data and publications
Reuse story that really happened ,[object Object],[object Object],[object Object],[object Object],[object Object],Slide: Carole Goble
Functional Magnetic  Resonance Imaging (fMRI) Mike Wilde
Parallel scripting
Computation as a first-class entity ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],A Virtual Data System for Representing, Querying & Automating Data Derivation [SSDBM02] Data Program Computation operates-on execution-of created-by consumed-by
Example: fMRI analysis First Provenance Challenge, http://twiki.ipaw.info/  [CCPE06]
Query examples ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]
Challenges of scale ,[object Object],[object Object],[object Object],[object Object],[object Object]
Hosting and provisioning ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],! ! ā€œ Service-Oriented Scienceā€,  Science , 2005
Provisioning for data-intensive workloads ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],S Sloan Data Ioan Raicu + + + + + + = +
ā€œ Sineā€ workload, 2M tasks, 10MB:10ms ratio, 100 nodes, GCC policy, 50GB caches/node
Same scenario, but with dynamic resource provisioning
DOCK on BG/P: ~1M Tasks on 118,000 CPUs ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],Ioan Raicu Zhao Zhang Mike Wilde Time (secs)
Ā 
Efficiency  relative to no-I/O case  for 4 second tasks and varying data size (1KB to 1MB) for CIO and GPFS up to 32K processors
Thanks! ,[object Object],[object Object],[object Object],[object Object]
Knowledge generation  as a systems problem ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]
Service-oriented science ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],! ! ā€œ Service-Oriented Scienceā€,  Science , 2005
Service-oriented science ,[object Object],[object Object],[object Object],[object Object],People  create  services (data or function) ā€¦ which others  discover , decide to use, ā€¦  and  compose  to create a new function ...  which they  publish  as a new service. ā€œ Service-Oriented Scienceā€,  Science , 2005
And big challenges ā€¦ ,[object Object],[object Object],[object Object],[object Object]
Service discovery and selection ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],A B
caGrid data instruments computation resource Virtualization  Security  Connectivity
[object Object],[object Object],[object Object],[object Object],[object Object],Service composition Slide: Carole Goble
ā€œ MIā€ workload, 250K tasks, 10MB:10ms ratio, up to 64 nodes using DRP, GCC policy, 2GB caches/node

More Related Content

What's hot

Human Studies Database Project (demo)
Human Studies Database Project (demo)Human Studies Database Project (demo)
Human Studies Database Project (demo)Ida Sim
Ā 
An Ontology-based Decision Support Framework for Personalized Quality of Life...
An Ontology-based Decision Support Framework for Personalized Quality of Life...An Ontology-based Decision Support Framework for Personalized Quality of Life...
An Ontology-based Decision Support Framework for Personalized Quality of Life...Marina Riga
Ā 
Mercer bosc2010 microsoft_framework
Mercer bosc2010 microsoft_frameworkMercer bosc2010 microsoft_framework
Mercer bosc2010 microsoft_frameworkBOSC 2010
Ā 
Computation and Knowledge
Computation and KnowledgeComputation and Knowledge
Computation and KnowledgeIan Foster
Ā 
DCC Keynote 2007
DCC Keynote 2007DCC Keynote 2007
DCC Keynote 2007Carole Goble
Ā 
2016 davis-plantbio
2016 davis-plantbio2016 davis-plantbio
2016 davis-plantbioc.titus.brown
Ā 
Delroy Cameron's Dissertation Defense: A Contenxt-Driven Subgraph Model for L...
Delroy Cameron's Dissertation Defense: A Contenxt-Driven Subgraph Model for L...Delroy Cameron's Dissertation Defense: A Contenxt-Driven Subgraph Model for L...
Delroy Cameron's Dissertation Defense: A Contenxt-Driven Subgraph Model for L...Amit Sheth
Ā 
Semantic Web for 360-degree Health: State-of-the-Art & Vision for Better Inte...
Semantic Web for 360-degree Health: State-of-the-Art & Vision for Better Inte...Semantic Web for 360-degree Health: State-of-the-Art & Vision for Better Inte...
Semantic Web for 360-degree Health: State-of-the-Art & Vision for Better Inte...Amit Sheth
Ā 
Semantics for Bioinformatics: What, Why and How of Search, Integration and An...
Semantics for Bioinformatics: What, Why and How of Search, Integration and An...Semantics for Bioinformatics: What, Why and How of Search, Integration and An...
Semantics for Bioinformatics: What, Why and How of Search, Integration and An...Amit Sheth
Ā 
Scott Edmunds talk at AIST: Overcoming the Reproducibility Crisis: and why I ...
Scott Edmunds talk at AIST: Overcoming the Reproducibility Crisis: and why I ...Scott Edmunds talk at AIST: Overcoming the Reproducibility Crisis: and why I ...
Scott Edmunds talk at AIST: Overcoming the Reproducibility Crisis: and why I ...GigaScience, BGI Hong Kong
Ā 
TERN ESA Workshop 2012, 'Smarter Workflows for Ecologists'
TERN ESA Workshop 2012, 'Smarter Workflows for Ecologists'TERN ESA Workshop 2012, 'Smarter Workflows for Ecologists'
TERN ESA Workshop 2012, 'Smarter Workflows for Ecologists'TERN Australia
Ā 
GlobusWorld 2019 Opening Keynote
GlobusWorld 2019 Opening KeynoteGlobusWorld 2019 Opening Keynote
GlobusWorld 2019 Opening KeynoteGlobus
Ā 
The Seven Deadly Sins of Bioinformatics
The Seven Deadly Sins of BioinformaticsThe Seven Deadly Sins of Bioinformatics
The Seven Deadly Sins of BioinformaticsDuncan Hull
Ā 
BIOMAG2018 - Denis Engemann - MNE-HCP
BIOMAG2018 - Denis Engemann - MNE-HCPBIOMAG2018 - Denis Engemann - MNE-HCP
BIOMAG2018 - Denis Engemann - MNE-HCPRobert Oostenveld
Ā 
How Data Commons are Changing the Way that Large Datasets Are Analyzed and Sh...
How Data Commons are Changing the Way that Large Datasets Are Analyzed and Sh...How Data Commons are Changing the Way that Large Datasets Are Analyzed and Sh...
How Data Commons are Changing the Way that Large Datasets Are Analyzed and Sh...Robert Grossman
Ā 
The beauty of workflows and models
The beauty of workflows and modelsThe beauty of workflows and models
The beauty of workflows and modelsmyGrid team
Ā 
Donders neuroimage toolkit - open science and good practices
Donders neuroimage toolkit -  open science and good practicesDonders neuroimage toolkit -  open science and good practices
Donders neuroimage toolkit - open science and good practicesRobert Oostenveld
Ā 
Crossing the Analytics Chasm and Getting the Models You Developed Deployed
Crossing the Analytics Chasm and Getting the Models You Developed DeployedCrossing the Analytics Chasm and Getting the Models You Developed Deployed
Crossing the Analytics Chasm and Getting the Models You Developed DeployedRobert Grossman
Ā 
CuttingEEG - Open Science, Open Data and BIDS for EEG
CuttingEEG - Open Science, Open Data and BIDS for EEGCuttingEEG - Open Science, Open Data and BIDS for EEG
CuttingEEG - Open Science, Open Data and BIDS for EEGRobert Oostenveld
Ā 

What's hot (20)

Human Studies Database Project (demo)
Human Studies Database Project (demo)Human Studies Database Project (demo)
Human Studies Database Project (demo)
Ā 
An Ontology-based Decision Support Framework for Personalized Quality of Life...
An Ontology-based Decision Support Framework for Personalized Quality of Life...An Ontology-based Decision Support Framework for Personalized Quality of Life...
An Ontology-based Decision Support Framework for Personalized Quality of Life...
Ā 
Mercer bosc2010 microsoft_framework
Mercer bosc2010 microsoft_frameworkMercer bosc2010 microsoft_framework
Mercer bosc2010 microsoft_framework
Ā 
Computation and Knowledge
Computation and KnowledgeComputation and Knowledge
Computation and Knowledge
Ā 
DCC Keynote 2007
DCC Keynote 2007DCC Keynote 2007
DCC Keynote 2007
Ā 
Contrast Pattern Aided Regression and Classification
Contrast Pattern Aided Regression and ClassificationContrast Pattern Aided Regression and Classification
Contrast Pattern Aided Regression and Classification
Ā 
2016 davis-plantbio
2016 davis-plantbio2016 davis-plantbio
2016 davis-plantbio
Ā 
Delroy Cameron's Dissertation Defense: A Contenxt-Driven Subgraph Model for L...
Delroy Cameron's Dissertation Defense: A Contenxt-Driven Subgraph Model for L...Delroy Cameron's Dissertation Defense: A Contenxt-Driven Subgraph Model for L...
Delroy Cameron's Dissertation Defense: A Contenxt-Driven Subgraph Model for L...
Ā 
Semantic Web for 360-degree Health: State-of-the-Art & Vision for Better Inte...
Semantic Web for 360-degree Health: State-of-the-Art & Vision for Better Inte...Semantic Web for 360-degree Health: State-of-the-Art & Vision for Better Inte...
Semantic Web for 360-degree Health: State-of-the-Art & Vision for Better Inte...
Ā 
Semantics for Bioinformatics: What, Why and How of Search, Integration and An...
Semantics for Bioinformatics: What, Why and How of Search, Integration and An...Semantics for Bioinformatics: What, Why and How of Search, Integration and An...
Semantics for Bioinformatics: What, Why and How of Search, Integration and An...
Ā 
Scott Edmunds talk at AIST: Overcoming the Reproducibility Crisis: and why I ...
Scott Edmunds talk at AIST: Overcoming the Reproducibility Crisis: and why I ...Scott Edmunds talk at AIST: Overcoming the Reproducibility Crisis: and why I ...
Scott Edmunds talk at AIST: Overcoming the Reproducibility Crisis: and why I ...
Ā 
TERN ESA Workshop 2012, 'Smarter Workflows for Ecologists'
TERN ESA Workshop 2012, 'Smarter Workflows for Ecologists'TERN ESA Workshop 2012, 'Smarter Workflows for Ecologists'
TERN ESA Workshop 2012, 'Smarter Workflows for Ecologists'
Ā 
GlobusWorld 2019 Opening Keynote
GlobusWorld 2019 Opening KeynoteGlobusWorld 2019 Opening Keynote
GlobusWorld 2019 Opening Keynote
Ā 
The Seven Deadly Sins of Bioinformatics
The Seven Deadly Sins of BioinformaticsThe Seven Deadly Sins of Bioinformatics
The Seven Deadly Sins of Bioinformatics
Ā 
BIOMAG2018 - Denis Engemann - MNE-HCP
BIOMAG2018 - Denis Engemann - MNE-HCPBIOMAG2018 - Denis Engemann - MNE-HCP
BIOMAG2018 - Denis Engemann - MNE-HCP
Ā 
How Data Commons are Changing the Way that Large Datasets Are Analyzed and Sh...
How Data Commons are Changing the Way that Large Datasets Are Analyzed and Sh...How Data Commons are Changing the Way that Large Datasets Are Analyzed and Sh...
How Data Commons are Changing the Way that Large Datasets Are Analyzed and Sh...
Ā 
The beauty of workflows and models
The beauty of workflows and modelsThe beauty of workflows and models
The beauty of workflows and models
Ā 
Donders neuroimage toolkit - open science and good practices
Donders neuroimage toolkit -  open science and good practicesDonders neuroimage toolkit -  open science and good practices
Donders neuroimage toolkit - open science and good practices
Ā 
Crossing the Analytics Chasm and Getting the Models You Developed Deployed
Crossing the Analytics Chasm and Getting the Models You Developed DeployedCrossing the Analytics Chasm and Getting the Models You Developed Deployed
Crossing the Analytics Chasm and Getting the Models You Developed Deployed
Ā 
CuttingEEG - Open Science, Open Data and BIDS for EEG
CuttingEEG - Open Science, Open Data and BIDS for EEGCuttingEEG - Open Science, Open Data and BIDS for EEG
CuttingEEG - Open Science, Open Data and BIDS for EEG
Ā 

Similar to Services For Science April 2009

Aaas Data Intensive Science And Grid
Aaas Data Intensive Science And GridAaas Data Intensive Science And Grid
Aaas Data Intensive Science And GridIan Foster
Ā 
Spark Summit Europe: Share and analyse genomic data at scale
Spark Summit Europe: Share and analyse genomic data at scaleSpark Summit Europe: Share and analyse genomic data at scale
Spark Summit Europe: Share and analyse genomic data at scaleAndy Petrella
Ā 
Discovery Engines for Big Data: Accelerating Discovery in Basic Energy Sciences
Discovery Engines for Big Data: Accelerating Discovery in Basic Energy SciencesDiscovery Engines for Big Data: Accelerating Discovery in Basic Energy Sciences
Discovery Engines for Big Data: Accelerating Discovery in Basic Energy SciencesIan Foster
Ā 
eScience: A Transformed Scientific Method
eScience: A Transformed Scientific MethodeScience: A Transformed Scientific Method
eScience: A Transformed Scientific MethodDuncan Hull
Ā 
The Discovery Cloud: Accelerating Science via Outsourcing and Automation
The Discovery Cloud: Accelerating Science via Outsourcing and AutomationThe Discovery Cloud: Accelerating Science via Outsourcing and Automation
The Discovery Cloud: Accelerating Science via Outsourcing and AutomationIan Foster
Ā 
Accelerating Discovery via Science Services
Accelerating Discovery via Science ServicesAccelerating Discovery via Science Services
Accelerating Discovery via Science ServicesIan Foster
Ā 
Share and analyze geonomic data at scale by Andy Petrella and Xavier Tordoir
Share and analyze geonomic data at scale by Andy Petrella and Xavier TordoirShare and analyze geonomic data at scale by Andy Petrella and Xavier Tordoir
Share and analyze geonomic data at scale by Andy Petrella and Xavier TordoirSpark Summit
Ā 
Services for Science v2 (APAN26)
Services for Science v2 (APAN26)Services for Science v2 (APAN26)
Services for Science v2 (APAN26)Ian Foster
Ā 
Research Objects: more than the sum of the parts
Research Objects: more than the sum of the partsResearch Objects: more than the sum of the parts
Research Objects: more than the sum of the partsCarole Goble
Ā 
Accelerating data-intensive science by outsourcing the mundane
Accelerating data-intensive science by outsourcing the mundaneAccelerating data-intensive science by outsourcing the mundane
Accelerating data-intensive science by outsourcing the mundaneIan Foster
Ā 
Natural Language Processing & Semantic Models in an Imperfect World
Natural Language Processing & Semantic Modelsin an Imperfect WorldNatural Language Processing & Semantic Modelsin an Imperfect World
Natural Language Processing & Semantic Models in an Imperfect WorldVital.AI
Ā 
Being FAIR: Enabling Reproducible Data Science
Being FAIR: Enabling Reproducible Data ScienceBeing FAIR: Enabling Reproducible Data Science
Being FAIR: Enabling Reproducible Data ScienceCarole Goble
Ā 
Docker in Open Science Data Analysis Challenges by Bruce Hoff
Docker in Open Science Data Analysis Challenges by Bruce HoffDocker in Open Science Data Analysis Challenges by Bruce Hoff
Docker in Open Science Data Analysis Challenges by Bruce HoffDocker, Inc.
Ā 
Semantic Sensor Networks and Linked Stream Data
Semantic Sensor Networks and Linked Stream DataSemantic Sensor Networks and Linked Stream Data
Semantic Sensor Networks and Linked Stream DataOscar Corcho
Ā 
The seven-deadly-sins-of-bioinformatics3960
The seven-deadly-sins-of-bioinformatics3960The seven-deadly-sins-of-bioinformatics3960
The seven-deadly-sins-of-bioinformatics3960mare34
Ā 
UK Digital Curation Centre: enabling research data management at the coalface
UK Digital Curation Centre: enabling research data management at the coalfaceUK Digital Curation Centre: enabling research data management at the coalface
UK Digital Curation Centre: enabling research data management at the coalfaceLizLyon
Ā 
Sharing massive data analysis: from provenance to linked experiment reports
Sharing massive data analysis: from provenance to linked experiment reportsSharing massive data analysis: from provenance to linked experiment reports
Sharing massive data analysis: from provenance to linked experiment reportsGaignard Alban
Ā 
wolstencroft-ogf20-astro
wolstencroft-ogf20-astrowolstencroft-ogf20-astro
wolstencroft-ogf20-astrowebuploader
Ā 
Invited talk @ ESIP summer meeting, 2009
Invited talk @ ESIP summer meeting, 2009Invited talk @ ESIP summer meeting, 2009
Invited talk @ ESIP summer meeting, 2009Paolo Missier
Ā 

Similar to Services For Science April 2009 (20)

Aaas Data Intensive Science And Grid
Aaas Data Intensive Science And GridAaas Data Intensive Science And Grid
Aaas Data Intensive Science And Grid
Ā 
Spark Summit Europe: Share and analyse genomic data at scale
Spark Summit Europe: Share and analyse genomic data at scaleSpark Summit Europe: Share and analyse genomic data at scale
Spark Summit Europe: Share and analyse genomic data at scale
Ā 
A biologist in e-Science
A biologist in e-ScienceA biologist in e-Science
A biologist in e-Science
Ā 
Discovery Engines for Big Data: Accelerating Discovery in Basic Energy Sciences
Discovery Engines for Big Data: Accelerating Discovery in Basic Energy SciencesDiscovery Engines for Big Data: Accelerating Discovery in Basic Energy Sciences
Discovery Engines for Big Data: Accelerating Discovery in Basic Energy Sciences
Ā 
eScience: A Transformed Scientific Method
eScience: A Transformed Scientific MethodeScience: A Transformed Scientific Method
eScience: A Transformed Scientific Method
Ā 
The Discovery Cloud: Accelerating Science via Outsourcing and Automation
The Discovery Cloud: Accelerating Science via Outsourcing and AutomationThe Discovery Cloud: Accelerating Science via Outsourcing and Automation
The Discovery Cloud: Accelerating Science via Outsourcing and Automation
Ā 
Accelerating Discovery via Science Services
Accelerating Discovery via Science ServicesAccelerating Discovery via Science Services
Accelerating Discovery via Science Services
Ā 
Share and analyze geonomic data at scale by Andy Petrella and Xavier Tordoir
Share and analyze geonomic data at scale by Andy Petrella and Xavier TordoirShare and analyze geonomic data at scale by Andy Petrella and Xavier Tordoir
Share and analyze geonomic data at scale by Andy Petrella and Xavier Tordoir
Ā 
Services for Science v2 (APAN26)
Services for Science v2 (APAN26)Services for Science v2 (APAN26)
Services for Science v2 (APAN26)
Ā 
Research Objects: more than the sum of the parts
Research Objects: more than the sum of the partsResearch Objects: more than the sum of the parts
Research Objects: more than the sum of the parts
Ā 
Accelerating data-intensive science by outsourcing the mundane
Accelerating data-intensive science by outsourcing the mundaneAccelerating data-intensive science by outsourcing the mundane
Accelerating data-intensive science by outsourcing the mundane
Ā 
Natural Language Processing & Semantic Models in an Imperfect World
Natural Language Processing & Semantic Modelsin an Imperfect WorldNatural Language Processing & Semantic Modelsin an Imperfect World
Natural Language Processing & Semantic Models in an Imperfect World
Ā 
Being FAIR: Enabling Reproducible Data Science
Being FAIR: Enabling Reproducible Data ScienceBeing FAIR: Enabling Reproducible Data Science
Being FAIR: Enabling Reproducible Data Science
Ā 
Docker in Open Science Data Analysis Challenges by Bruce Hoff
Docker in Open Science Data Analysis Challenges by Bruce HoffDocker in Open Science Data Analysis Challenges by Bruce Hoff
Docker in Open Science Data Analysis Challenges by Bruce Hoff
Ā 
Semantic Sensor Networks and Linked Stream Data
Semantic Sensor Networks and Linked Stream DataSemantic Sensor Networks and Linked Stream Data
Semantic Sensor Networks and Linked Stream Data
Ā 
The seven-deadly-sins-of-bioinformatics3960
The seven-deadly-sins-of-bioinformatics3960The seven-deadly-sins-of-bioinformatics3960
The seven-deadly-sins-of-bioinformatics3960
Ā 
UK Digital Curation Centre: enabling research data management at the coalface
UK Digital Curation Centre: enabling research data management at the coalfaceUK Digital Curation Centre: enabling research data management at the coalface
UK Digital Curation Centre: enabling research data management at the coalface
Ā 
Sharing massive data analysis: from provenance to linked experiment reports
Sharing massive data analysis: from provenance to linked experiment reportsSharing massive data analysis: from provenance to linked experiment reports
Sharing massive data analysis: from provenance to linked experiment reports
Ā 
wolstencroft-ogf20-astro
wolstencroft-ogf20-astrowolstencroft-ogf20-astro
wolstencroft-ogf20-astro
Ā 
Invited talk @ ESIP summer meeting, 2009
Invited talk @ ESIP summer meeting, 2009Invited talk @ ESIP summer meeting, 2009
Invited talk @ ESIP summer meeting, 2009
Ā 

More from Ian Foster

Global Services for Global Science March 2023.pptx
Global Services for Global Science March 2023.pptxGlobal Services for Global Science March 2023.pptx
Global Services for Global Science March 2023.pptxIan Foster
Ā 
The Earth System Grid Federation: Origins, Current State, Evolution
The Earth System Grid Federation: Origins, Current State, EvolutionThe Earth System Grid Federation: Origins, Current State, Evolution
The Earth System Grid Federation: Origins, Current State, EvolutionIan Foster
Ā 
Better Information Faster: Programming the Continuum
Better Information Faster: Programming the ContinuumBetter Information Faster: Programming the Continuum
Better Information Faster: Programming the ContinuumIan Foster
Ā 
ESnet6 and Smart Instruments
ESnet6 and Smart InstrumentsESnet6 and Smart Instruments
ESnet6 and Smart InstrumentsIan Foster
Ā 
Linking Scientific Instruments and Computation
Linking Scientific Instruments and ComputationLinking Scientific Instruments and Computation
Linking Scientific Instruments and ComputationIan Foster
Ā 
A Global Research Data Platform: How Globus Services Enable Scientific Discovery
A Global Research Data Platform: How Globus Services Enable Scientific DiscoveryA Global Research Data Platform: How Globus Services Enable Scientific Discovery
A Global Research Data Platform: How Globus Services Enable Scientific DiscoveryIan Foster
Ā 
Foster CRA March 2022.pptx
Foster CRA March 2022.pptxFoster CRA March 2022.pptx
Foster CRA March 2022.pptxIan Foster
Ā 
Big Data, Big Computing, AI, and Environmental Science
Big Data, Big Computing, AI, and Environmental ScienceBig Data, Big Computing, AI, and Environmental Science
Big Data, Big Computing, AI, and Environmental ScienceIan Foster
Ā 
AI at Scale for Materials and Chemistry
AI at Scale for Materials and ChemistryAI at Scale for Materials and Chemistry
AI at Scale for Materials and ChemistryIan Foster
Ā 
Coding the Continuum
Coding the ContinuumCoding the Continuum
Coding the ContinuumIan Foster
Ā 
Data Tribology: Overcoming Data Friction with Cloud Automation
Data Tribology: Overcoming Data Friction with Cloud AutomationData Tribology: Overcoming Data Friction with Cloud Automation
Data Tribology: Overcoming Data Friction with Cloud AutomationIan Foster
Ā 
Research Automation for Data-Driven Discovery
Research Automation for Data-Driven DiscoveryResearch Automation for Data-Driven Discovery
Research Automation for Data-Driven DiscoveryIan Foster
Ā 
Scaling collaborative data science with Globus and Jupyter
Scaling collaborative data science with Globus and JupyterScaling collaborative data science with Globus and Jupyter
Scaling collaborative data science with Globus and JupyterIan Foster
Ā 
Learning Systems for Science
Learning Systems for ScienceLearning Systems for Science
Learning Systems for ScienceIan Foster
Ā 
Data Automation at Light Sources
Data Automation at Light SourcesData Automation at Light Sources
Data Automation at Light SourcesIan Foster
Ā 
Team Argon Summary
Team Argon SummaryTeam Argon Summary
Team Argon SummaryIan Foster
Ā 
Thoughts on interoperability
Thoughts on interoperabilityThoughts on interoperability
Thoughts on interoperabilityIan Foster
Ā 
Computing Just What You Need: Online Data Analysis and Reduction at Extreme ...
Computing Just What You Need: Online Data Analysis and Reduction  at Extreme ...Computing Just What You Need: Online Data Analysis and Reduction  at Extreme ...
Computing Just What You Need: Online Data Analysis and Reduction at Extreme ...Ian Foster
Ā 
NIH Data Commons Architecture Ideas
NIH Data Commons Architecture IdeasNIH Data Commons Architecture Ideas
NIH Data Commons Architecture IdeasIan Foster
Ā 
Going Smart and Deep on Materials at ALCF
Going Smart and Deep on Materials at ALCFGoing Smart and Deep on Materials at ALCF
Going Smart and Deep on Materials at ALCFIan Foster
Ā 

More from Ian Foster (20)

Global Services for Global Science March 2023.pptx
Global Services for Global Science March 2023.pptxGlobal Services for Global Science March 2023.pptx
Global Services for Global Science March 2023.pptx
Ā 
The Earth System Grid Federation: Origins, Current State, Evolution
The Earth System Grid Federation: Origins, Current State, EvolutionThe Earth System Grid Federation: Origins, Current State, Evolution
The Earth System Grid Federation: Origins, Current State, Evolution
Ā 
Better Information Faster: Programming the Continuum
Better Information Faster: Programming the ContinuumBetter Information Faster: Programming the Continuum
Better Information Faster: Programming the Continuum
Ā 
ESnet6 and Smart Instruments
ESnet6 and Smart InstrumentsESnet6 and Smart Instruments
ESnet6 and Smart Instruments
Ā 
Linking Scientific Instruments and Computation
Linking Scientific Instruments and ComputationLinking Scientific Instruments and Computation
Linking Scientific Instruments and Computation
Ā 
A Global Research Data Platform: How Globus Services Enable Scientific Discovery
A Global Research Data Platform: How Globus Services Enable Scientific DiscoveryA Global Research Data Platform: How Globus Services Enable Scientific Discovery
A Global Research Data Platform: How Globus Services Enable Scientific Discovery
Ā 
Foster CRA March 2022.pptx
Foster CRA March 2022.pptxFoster CRA March 2022.pptx
Foster CRA March 2022.pptx
Ā 
Big Data, Big Computing, AI, and Environmental Science
Big Data, Big Computing, AI, and Environmental ScienceBig Data, Big Computing, AI, and Environmental Science
Big Data, Big Computing, AI, and Environmental Science
Ā 
AI at Scale for Materials and Chemistry
AI at Scale for Materials and ChemistryAI at Scale for Materials and Chemistry
AI at Scale for Materials and Chemistry
Ā 
Coding the Continuum
Coding the ContinuumCoding the Continuum
Coding the Continuum
Ā 
Data Tribology: Overcoming Data Friction with Cloud Automation
Data Tribology: Overcoming Data Friction with Cloud AutomationData Tribology: Overcoming Data Friction with Cloud Automation
Data Tribology: Overcoming Data Friction with Cloud Automation
Ā 
Research Automation for Data-Driven Discovery
Research Automation for Data-Driven DiscoveryResearch Automation for Data-Driven Discovery
Research Automation for Data-Driven Discovery
Ā 
Scaling collaborative data science with Globus and Jupyter
Scaling collaborative data science with Globus and JupyterScaling collaborative data science with Globus and Jupyter
Scaling collaborative data science with Globus and Jupyter
Ā 
Learning Systems for Science
Learning Systems for ScienceLearning Systems for Science
Learning Systems for Science
Ā 
Data Automation at Light Sources
Data Automation at Light SourcesData Automation at Light Sources
Data Automation at Light Sources
Ā 
Team Argon Summary
Team Argon SummaryTeam Argon Summary
Team Argon Summary
Ā 
Thoughts on interoperability
Thoughts on interoperabilityThoughts on interoperability
Thoughts on interoperability
Ā 
Computing Just What You Need: Online Data Analysis and Reduction at Extreme ...
Computing Just What You Need: Online Data Analysis and Reduction  at Extreme ...Computing Just What You Need: Online Data Analysis and Reduction  at Extreme ...
Computing Just What You Need: Online Data Analysis and Reduction at Extreme ...
Ā 
NIH Data Commons Architecture Ideas
NIH Data Commons Architecture IdeasNIH Data Commons Architecture Ideas
NIH Data Commons Architecture Ideas
Ā 
Going Smart and Deep on Materials at ALCF
Going Smart and Deep on Materials at ALCFGoing Smart and Deep on Materials at ALCF
Going Smart and Deep on Materials at ALCF
Ā 

Recently uploaded

04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptxHampshireHUG
Ā 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking MenDelhi Call girls
Ā 
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service šŸø 8923113531 šŸŽ° Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service šŸø 8923113531 šŸŽ° Avail...Kalyanpur ) Call Girls in Lucknow Finest Escorts Service šŸø 8923113531 šŸŽ° Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service šŸø 8923113531 šŸŽ° Avail...gurkirankumar98700
Ā 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdfhans926745
Ā 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsMaria Levchenko
Ā 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityPrincipled Technologies
Ā 
WhatsApp 9892124323 āœ“Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 āœ“Call Girls In Kalyan ( Mumbai ) secure serviceWhatsApp 9892124323 āœ“Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 āœ“Call Girls In Kalyan ( Mumbai ) secure servicePooja Nehwal
Ā 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationMichael W. Hawkins
Ā 
Top 5 Benefits OF Using Muvi Live Paywall For Live Streams
Top 5 Benefits OF Using Muvi Live Paywall For Live StreamsTop 5 Benefits OF Using Muvi Live Paywall For Live Streams
Top 5 Benefits OF Using Muvi Live Paywall For Live StreamsRoshan Dwivedi
Ā 
Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Allon Mureinik
Ā 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc
Ā 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024The Digital Insurer
Ā 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Scriptwesley chun
Ā 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024The Digital Insurer
Ā 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesSinan KOZAK
Ā 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024Results
Ā 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
Ā 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfEnterprise Knowledge
Ā 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountPuma Security, LLC
Ā 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Enterprise Knowledge
Ā 

Recently uploaded (20)

04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
Ā 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
Ā 
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service šŸø 8923113531 šŸŽ° Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service šŸø 8923113531 šŸŽ° Avail...Kalyanpur ) Call Girls in Lucknow Finest Escorts Service šŸø 8923113531 šŸŽ° Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service šŸø 8923113531 šŸŽ° Avail...
Ā 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf
Ā 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed texts
Ā 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
Ā 
WhatsApp 9892124323 āœ“Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 āœ“Call Girls In Kalyan ( Mumbai ) secure serviceWhatsApp 9892124323 āœ“Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 āœ“Call Girls In Kalyan ( Mumbai ) secure service
Ā 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
Ā 
Top 5 Benefits OF Using Muvi Live Paywall For Live Streams
Top 5 Benefits OF Using Muvi Live Paywall For Live StreamsTop 5 Benefits OF Using Muvi Live Paywall For Live Streams
Top 5 Benefits OF Using Muvi Live Paywall For Live Streams
Ā 
Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)
Ā 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
Ā 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
Ā 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
Ā 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Ā 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen Frames
Ā 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024
Ā 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
Ā 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
Ā 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path Mount
Ā 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...
Ā 

Services For Science April 2009

  • 1. Services for science Creating knowledge in the Internet age Ian Foster Computation Institute Argonne National Lab & University of Chicago
  • 3. Knowledge generation in astronomy ~1600 30 years ? years 10 years 6 years 2 years
  • 4. Astronomy from 1600 to 2010 Automation 10 -1 ļƒ  10 8 Hz data capture Community 10 0 ļƒ  10 4 astronomers (10 6 amateur) Computation Data 10 6 ļƒ  10 15 B aggregate 10 -1 ļƒ  10 15 Hz peak Literature 10 1 ļƒ  10 5 pages/year
  • 5. Knowledge generation in medicine ~1600
  • 6. Biomedical research ~2010 ... atcgaattccaggcgtcacattctcaattcca... MPMILGYWDIRGLAHAIRLLLEYTDSSYEEKKYT... Protein-Protein Interactions metabolism pathways receptor-ligand 4Āŗ structure Polymorphism and Variants genetic variants individual patients epidemiology Physiology Cellular biology Biochemistry Neurobiology Endocrinology etc. >10 6 ESTs Expression patterns Large-scale screens Genetics and Maps Linkage Cytogenetic Clone-based From John Wooley >10 6 >10 9 >10 6 >10 5 >10 9 DNA sequences alignments Proteins sequence 2Āŗ structure 3Āŗ structure
  • 7. More data does not always mean more knowledge Folker Meyer, Genome Sequencing vs. Mooreā€™s Law: Cyber Challenges for the Next Decade, CTWatch , August 2006.
  • 8.
  • 9.
  • 10.
  • 11.
  • 12. The cancer Biomedical Informatics Grid Globus
  • 13. As of Sept 18, 2008: 122 participants 81 services 62 data 19 analytical
  • 14. As of Oct 19 , 2008: 122 participants 105 services 70 data 35 analytical
  • 15. Automating the routine Location A Microarray, Protein, Image data Location B Microarray, Protein, Image data Location C Microarray, Protein, Image data Location C Image Analysis Location D Image Analysis Microarray and protein databases at other institutions Different database systems, data representations, security Different program invocation, remote access, data transfer
  • 16. Automating the routine Location A Microarray, Protein, Image data Location B Microarray, Protein, Image data Location C Microarray, Protein, Image data Location C Image Analysis Location D Image Analysis caGrid Service Interfaces caGrid Environ-ment Registered Object Definitions Advertise-ment Log on, Grid credentials Query and Analysis Workflow Discovery Microarray & protein databases at other institutions Globus
  • 17. Location A Microarray, Protein, Image data Location B Microarray, Protein, Image data Location C Microarray, Protein, Image data Location C Image Analysis Location D Image Analysis caGrid Service Interfaces caGrid Environ-ment Registered Object Definitions Advertise-ment Log on, Grid credentials Query and Analysis Workflow Discovery Microarray & protein databases at other institutions Service authoring Metadata services Service registries Security services Quality control Queries, workflows caGrid Compute resources Globus
  • 18. Lifecycle issues caGrid Discovery Composition Execution Analysis Community reuse generate
  • 20. Taverna Trident Kepler BPEL Ptolemy II
  • 21.
  • 22. Execution trace Execution result as XML 1936 gene expressions
  • 23.
  • 24. Reproducible science means ā€” context ā€” trust ā€” easy access to methods
  • 25. Workflows are another form of scholarly outcome to publish, curate and cite and archive along with data and publications
  • 26.
  • 27. Functional Magnetic Resonance Imaging (fMRI) Mike Wilde
  • 29.
  • 30. Example: fMRI analysis First Provenance Challenge, http://twiki.ipaw.info/ [CCPE06]
  • 31.
  • 32.
  • 33.
  • 34.
  • 35. ā€œ Sineā€ workload, 2M tasks, 10MB:10ms ratio, 100 nodes, GCC policy, 50GB caches/node
  • 36. Same scenario, but with dynamic resource provisioning
  • 37.
  • 38. Ā 
  • 39. Efficiency relative to no-I/O case for 4 second tasks and varying data size (1KB to 1MB) for CIO and GPFS up to 32K processors
  • 40.
  • 41.
  • 42.
  • 43.
  • 44.
  • 45.
  • 46. caGrid data instruments computation resource Virtualization Security Connectivity
  • 47.
  • 48. ā€œ MIā€ workload, 250K tasks, 10MB:10ms ratio, up to 64 nodes using DRP, GCC policy, 2GB caches/node

Editor's Notes

  1. Subject: how service oriented architecture can contribute to accelerating the pace of knowledge creation.