GigaDB and Galaxy: revolutionizing data
dissemination, organization and analysis


                  Peter Li
                GigaScience
           peter@gigasciencejournal.com
Journal and database for
    large-scale data
  in conjunction with


     Editor-in-Chief: Laurie Goodman
          Editor: Scott Edmunds
   Commissioning Editor: Nicole Nogoy
      Lead Curator: Tam Sneddon
          Data Platform: Peter Li
  www.gigasciencejournal.com
Why another *omics journal?




  Already many journals publishing research
           involving large data sets


                Results
             reproducibility
Unrepeatability of scientific results
                                                  Out of 18 microarray papers, results
                                                   from 10 could not be reproduced




Ioannidis et al., 2009. Repeatability of published microarray gene expression analyses.
Nature Genetics 41: 149-155.
How are we supporting data
               reproducibility?


                    Data sets




GigaScience
   paper            Analyses


                                    Community tools for
                                data reproduction and reuse
Linking of papers and data
    by citation of DOIs




                Data set DOI


                Paper DOI
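
As a rough illustration of the linking idea on this slide, the sketch below pairs a paper DOI with a data set DOI and turns each into a resolvable link. It assumes the public resolver at https://doi.org/; the two DOI strings are made-up placeholders, and the function and variable names are hypothetical, not part of GigaDB or the journal's systems.

```python
from urllib.parse import quote

# Public DOI resolver; a DOI appended to this prefix redirects to the landing page.
DOI_RESOLVER = "https://doi.org/"


def doi_to_url(doi: str) -> str:
    """Return a resolver URL for a DOI, tolerating common prefixes."""
    doi = doi.strip()
    for prefix in ("doi:", "https://doi.org/", "http://dx.doi.org/"):
        if doi.lower().startswith(prefix):
            doi = doi[len(prefix):]
            break
    # Keep the slash between DOI prefix and suffix; escape anything unusual.
    return DOI_RESOLVER + quote(doi, safe="/")


# Placeholder identifiers (assumptions for illustration, not real records).
paper_doi = "10.1234/example-paper"
dataset_doi = "doi:10.1234/example-dataset"

# A paper cites its data set by DOI, so each record can point to the other.
link_record = {
    "paper": doi_to_url(paper_doi),
    "dataset": doi_to_url(dataset_doi),
}

for role, url in link_record.items():
    print(role, url)
```

Storing both links in one record reflects the two-way citation shown on the slide: the paper's reference list carries the data set DOI, and the data set record carries the paper DOI.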
http://gigadb.org
GigaDB is a new database integrated with the GigaScience journal to meet the needs of a new generation of biological
and biomedical research as it enters the era of “big-data”…
Faster download speeds

Aspera data transfer
BGI Datasets Get DOI®s
Legend: Released pre-publication; Paper published in GigaScience

Invertebrate
Ant
- Florida carpenter ant
- Jerdon’s jumping ant
- Leaf-cutter ant
Roundworm
Schistosoma
Silkworm
Parasitic nematode
Pacific oyster

Human
Asian individual (YH)
- DNA Methylome
- Genome Assembly
- Transcriptome
Cancer (14TB)
Single cell bladder cancer
HBV infected exomes
Ancient DNA
- Saqqaq Eskimo
- Aboriginal Australian

Vertebrates
Darwin’s Finch
Giant panda
Macaque
- Chinese rhesus
- Crab-eating
Mini-Pig
Naked mole rat
Parrot, Puerto Rican
Penguin
- Emperor penguin
- Adelie penguin
Pigeon, domestic
Polar bear
Sheep
Tibetan antelope

Microbe
E. coli O104:H4 TY-2482
T2D gut metagenome

Cell-Lines
Chinese Hamster Ovary
Mouse methylomes

Plants
Chinese cabbage
Cucumber
Foxtail millet
Pigeonpea
Potato
Sorghum

39 data sets
Currently: 39 public datasets
                          *10 citations in references*
Humans
Ancient DNA
- Aboriginal Australian
- Saqqaq Eskimo
Asian individual (YH)
What about the analyses?


                           Data sets




GigaScience
   paper                  Analyses




              How will we make analyses available
               for downloading and execution?
Bioinformatics data analyses as workflows




 Example workflow: Investigate the evolutionary relationships between proteins
 Query -> Protein sequences -> Multiple sequence alignment
Implement GigaScience workflows
     in a community-accepted format

                                         Open source



                                      Over 20,000 main
                                      Galaxy server users



                                       Over 500 papers
                                       citing Galaxy use



                                        Over 55 Galaxy
                                       servers deployed



http://galaxyproject.org
Tool list   Tool parameterisation   Results panel
Pilot project - Integrate BGI SOAP
         package into Galaxy




Enable SOAP tools to be used from within Galaxy workflows
Integrate BGI SOAP package into
                 Galaxy
                      Data analysis pipelines




Python    Python     Python         Python      Python    Python
wrapper   wrapper    wrapper        wrapper     wrapper   wrapper




SOAP1     SOAP2     SOAPdenovo1   SOAPdenovo2   SOAPsnp   SOAPsplice
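
The wrapper layer in this diagram can be pictured with a minimal sketch like the one below. It is not the code from the GigaScience repository: the executable name, the option names, and both function names are hypothetical, chosen only to show how a thin Python wrapper might translate parameters chosen in Galaxy into a command line for an underlying tool.

```python
import subprocess
from typing import Dict, List


def build_command(executable: str, params: Dict[str, str]) -> List[str]:
    """Turn a mapping of option names to values into an argument list."""
    cmd = [executable]
    # Sort the options so the generated command is deterministic.
    for flag, value in sorted(params.items()):
        cmd.extend([f"--{flag}", str(value)])
    return cmd


def run_tool(executable: str, params: Dict[str, str]) -> int:
    """Run the wrapped tool and return its exit code to the caller."""
    completed = subprocess.run(build_command(executable, params), check=False)
    return completed.returncode


# Hypothetical tool name and options, for illustration only.
example = build_command(
    "hypothetical_soap_tool",
    {"reads": "reads.fq", "out": "result.txt"},
)
print(" ".join(example))
```

Keeping command construction separate from execution means the argument list can be checked without the wrapped binary being installed, which helps when the same pattern is repeated for six tools.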
GitHub open code repository




  https://github.com/gigascience
Tool list   Tool parameterisation   Results panel
SOAPdenovo2 Galaxy workflow
http://www.myexperiment.org
Why publish in GigaScience?

             Benefit                        Added value
•   Data hosted in GigaDB        •   No need to use own servers
•   Allocation of DOIs to data   •   Citable data
•   Metadata in ISA-Tab format   •   Aids reuse of data
•   Galaxy tool integration      •   Supports reuse of tools
•   Use of tools in Galaxy       •   Improves documentation
    workflows                    •   Shows how tool can be used
                                     with other bioinformatics software
Thanks to:

• Tin-Lap Lee and Huayan Gao - CUHK
• Tam, Jesse, Scott, Nicole & Laurie - GigaScience




          peter@gigasciencejournal.com

Mary Ann Tuli: What MODs can learn from Journals – a GigaDB curator’s perspec...Mary Ann Tuli: What MODs can learn from Journals – a GigaDB curator’s perspec...
Mary Ann Tuli: What MODs can learn from Journals – a GigaDB curator’s perspec...
 
Laurie Goodman: Sharing and Reusing Cell Image Data, ASCB/EMBO 2017 Subgroup ...
Laurie Goodman: Sharing and Reusing Cell Image Data, ASCB/EMBO 2017 Subgroup ...Laurie Goodman: Sharing and Reusing Cell Image Data, ASCB/EMBO 2017 Subgroup ...
Laurie Goodman: Sharing and Reusing Cell Image Data, ASCB/EMBO 2017 Subgroup ...
 

Recently uploaded

Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slidespraypatel2
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationSafe Software
 
Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Paola De la Torre
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreternaman860154
 
Top 5 Benefits OF Using Muvi Live Paywall For Live Streams
Top 5 Benefits OF Using Muvi Live Paywall For Live StreamsTop 5 Benefits OF Using Muvi Live Paywall For Live Streams
Top 5 Benefits OF Using Muvi Live Paywall For Live StreamsRoshan Dwivedi
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024The Digital Insurer
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking MenDelhi Call girls
 
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking MenDelhi Call girls
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Enterprise Knowledge
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationRadu Cotescu
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slidevu2urc
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘RTylerCroy
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxMalak Abu Hammad
 
Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024The Digital Insurer
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
Developing An App To Navigate The Roads of Brazil
Developing An App To Navigate The Roads of BrazilDeveloping An App To Navigate The Roads of Brazil
Developing An App To Navigate The Roads of BrazilV3cube
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024The Digital Insurer
 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEarley Information Science
 

Recently uploaded (20)

Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slides
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreter
 
Top 5 Benefits OF Using Muvi Live Paywall For Live Streams
Top 5 Benefits OF Using Muvi Live Paywall For Live StreamsTop 5 Benefits OF Using Muvi Live Paywall For Live Streams
Top 5 Benefits OF Using Muvi Live Paywall For Live Streams
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
 
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptx
 
Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Developing An App To Navigate The Roads of Brazil
Developing An App To Navigate The Roads of BrazilDeveloping An App To Navigate The Roads of Brazil
Developing An App To Navigate The Roads of Brazil
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024
 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
 

GigaDB and Galaxy: revolutionizing data dissemination, organization and analysis

  • 1. GigaDB and Galaxy: revolutionizing data dissemination, organization and analysis Peter Li GigaScience peter@gigasciencejournal.com
  • 2. Journal and database for large-scale data in conjunction with Editor-in-Chief: Laurie Goodman Editor: Scott Edmunds Commissioning Editor: Nicole Nogoy Lead Curator: Tam Sneddon Data Platform: Peter Li www.gigasciencejournal.com
  • 4. Why another *omics journal? Already many journals publishing research involving large data sets Results reproducibility
  • 5. Unrepeatability of scientific results Out of 18 microarray papers, results from 10 could not be reproduced Ioannidis et al., 2009. Repeatability of published microarray gene expression analyses. Nature Genetics 41: 149-155.
  • 6. How are we supporting data reproducibility? Data sets GigaScience paper Analyses Community tools for data reproduction and reuse
  • 7. Linking of papers and data by citation of DOIs Data set DOI Paper DOI
  • 9. GigaDB is a new database integrated with the GigaScience journal to meet the needs of a new generation of biological and biomedical research as it enters the era of “big-data”… (see more)
  • 11. BGI Datasets Get DOI®s (39 data sets; legend: Released pre-publication / Paper published in GigaScience). Invertebrate: Ant (Florida carpenter ant, Jerdon’s jumping ant, Leaf-cutter ant), Roundworm, Schistosoma, Silkworm, Parasitic nematode, Pacific oyster. Vertebrates: Darwin’s Finch, Giant panda, Macaque (Chinese rhesus, Crab-eating), Mini-Pig, Naked mole rat, Parrot (Puerto Rican), Penguin (Emperor penguin, Adelie penguin), Pigeon (domestic), Polar bear, Sheep, Tibetan antelope. Plants: Chinese cabbage, Cucumber, Foxtail millet, Pigeonpea, Potato, Sorghum. Microbe: E. coli O104:H4 TY-2482, T2D gut metagenome. Cell-Lines: Chinese Hamster Ovary, Mouse methylomes. Human: Asian individual (YH) (DNA Methylome, Genome Assembly, Transcriptome), Cancer (14TB) (Single cell bladder cancer, HBV infected exomes), Ancient DNA (Saqqaq Eskimo, Aboriginal Australian).
  • 12. Currently: 39 public datasets *10 citations in references* Humans Ancient DNA - Aboriginal Australian - Saqqaq Eskimo Asian individual (YH)
  • 13. What about the analyses? Data sets GigaScience paper Analyses How will we make analyses available for downloading and execution?
  • 14. Bioinformatics data analyses as workflows. Example workflow: Investigate the evolutionary relationships between proteins. Query → Protein sequences → Multiple sequence alignment
  • 15. Implement GigaScience workflows in a community-accepted format Open source Over 20,000 main Galaxy server users Over 500 papers citing Galaxy use Over 55 Galaxy servers deployed http://galaxyproject.org
  • 16. Tool list Tool parameterisation Results panel
  • 17. Pilot project - Integrate BGI SOAP package into Galaxy Enable SOAP tools to be used from within Galaxy workflows
  • 18. Integrate BGI SOAP package into Galaxy. Data analysis pipelines, with one Python wrapper for each tool: SOAP1, SOAP2, SOAPdenovo1, SOAPdenovo2, SOAPsnp, SOAPsplice
  • 19. GitHub open code repository https://github.com/gigascience
  • 20. Tool list Tool parameterisation Results panel
  • 23. Why publish in GigaScience? Benefit → Added value: Data hosted in GigaDB → No need to use own servers; Allocation of DOIs to data → Citable data; Metadata in ISA-Tab format → Aids reuse of data; Galaxy tool integration → Supports reuse of tools; Use of tools in Galaxy workflows → Improves documentation, shows how tool can be used with other bioinf. software
  • 24. Thanks to: • Tin-Lap Lee and Huayan Gao - CUHK • Tam, Jesse, Scott, Nicole & Laurie - GigaScience peter@gigasciencejournal.com
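
Slide 18 describes putting a Python wrapper around each SOAP command-line tool so that Galaxy workflows can call it. The slides do not show the wrappers themselves, so the following is only a minimal sketch of the general pattern, not the GigaScience code. The function name `run_wrapped_tool` is hypothetical, and no real SOAP options are assumed: the caller supplies the executable and its arguments.

```python
import subprocess
import sys
from pathlib import Path


def run_wrapped_tool(executable, args, output_path):
    """Run a command-line tool and save its standard output to output_path.

    Returns the tool's exit code. Hypothetical helper: the executable and
    its arguments come from the caller, and nothing here is specific to
    any SOAP program.
    """
    command = [str(executable), *[str(a) for a in args]]
    result = subprocess.run(command, capture_output=True, text=True)
    if result.returncode != 0:
        # Pass the tool's own error text on so the calling workflow can report it.
        sys.stderr.write(result.stderr)
        return result.returncode
    Path(output_path).write_text(result.stdout)
    return 0


if __name__ == "__main__":
    # Stand-in for a real tool: the Python interpreter printing one line.
    code = run_wrapped_tool(sys.executable, ["-c", "print('done')"], "tool_output.txt")
    print("exit code:", code)
```

Keeping the wrapper this thin means the same function can sit in front of any of the six tools; only the argument list changes.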

Editor's Notes

  1. Mini-pig genome published this month
  2. DOIs. Provide example of a GigaScience paper. Mention DOI for the paper itself. Highlight data set generated and its DOI.
  3. And now that you all want to submit to GigaDB: how do you do that, how will people search for and find your data, and, other than citing your DOI, what will they be able to do with the data? We have redesigned the underlying Giga database and are working on the front end, which we hope to make public early next month, so the following slides are a mix of screenshots from the development site overlaid with tweaks made in PowerPoint to illustrate features you can hope to see when we go live. These include: a home page image slider for browsing datasets; a text box search, which I will demonstrate shortly.
  4. ***NEEDS REWORKING!!!!*** This is an example landing page for DOI 10.5524/100015 for the YH genome dataset. These pages are still in development, but you can see the date released, title and abstract, and how the dataset should be cited. Additional information includes links to manuscripts and data accessions at EBI, NCBI or DDBJ. There is then information on the samples and files.
  5. A GigaDB dataset citation is also included in the YH Transcriptome paper published in Nature Biotechnology in February this year. As you can see, the dataset was published in 2011, but this did not prevent subsequent publication of the analysis paper.
  6. Over 20,000 users on the main server. Over 500 papers citing the use of Galaxy. Over 55 servers deployed on the Web.
  7. Allows scientists who may not have programming skills to compose data analysis pipelines.
  8. DOIs can now be tracked in the new Thomson Reuters Data Citation Index, which gives a form of credit and makes the data more discoverable (Scott)
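
Several of the notes above rely on a data set DOI leading readers to a landing page. As a small illustration, a DOI such as the 10.5524/100015 example in note 4 can be turned into a resolvable link through the public doi.org resolver. The helper name `doi_to_url` is hypothetical and the validation is deliberately loose; this is a sketch, not part of GigaDB.

```python
from urllib.parse import quote

# Public DOI resolver; it redirects to the landing page registered for the DOI.
DOI_RESOLVER = "https://doi.org/"


def doi_to_url(doi):
    """Return a resolver URL for a DOI string such as '10.5524/100015'.

    Hypothetical helper: it only checks the basic 'prefix/suffix' shape
    and does not contact the resolver.
    """
    doi = doi.strip()
    if not doi.startswith("10.") or "/" not in doi:
        raise ValueError(f"not a DOI: {doi!r}")
    # Percent-encode anything unusual in the suffix but keep the slash.
    return DOI_RESOLVER + quote(doi, safe="/")


if __name__ == "__main__":
    print(doi_to_url("10.5524/100015"))  # prints https://doi.org/10.5524/100015
```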