Bringing Standards to Life:
Software Development by the Genomics Standards Consortium

Renzo Kottmann
Microbial Genomics Group
Max Planck Institute for Marine Microbiology

M3 SIG Stockholm, July 2009

Genomic Standards Consortium (GSC)

Goal
  • Promote mechanisms that
       – standardize the description of genomes
       – exchange and integrate genomic data

Open-membership, international working body
  • Established in Sept 2005
  • Participants include DDBJ, EMBL, GenBank, Sanger, JCVI, JGI, EBI and a range of US, UK and EU research institutions
  • Organized a series of workshops

  http://gensc.org and http://gensc.org/gc_wiki/index.php/GSC_Membership

Minimum Information about a Genome Sequence (MIGS) Specification

MIGS extends what DDBJ/EMBL/GenBank request upon submission of a genome sequence
  • Examples:
       – Description of geographic location of a sample and habitat
       – “Minimum Information about a Metagenomic Sequence” (MIMS)
            – Temperature
            – pH
       – Description of sequence generation
            – Sequencing method
            – Assembly method

  Field et al. Nat Biotechnol. 2008

MIGS Checklist 2.0

[Figure: MIGS Checklist 2.0 table; M = mandatory]

  Field et al. Nat Biotechnol. 2008
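
A minimal Python sketch of how software might check a report against a checklist that marks some fields as mandatory. The field names, and which of them are flagged mandatory, are assumptions chosen for illustration from the examples on the MIGS slide; they are not the actual entries of MIGS Checklist 2.0.

```python
# Hypothetical checklist: field name -> True if mandatory ("M").
# These names and flags are illustrative assumptions, not the real MIGS 2.0 table.
CHECKLIST = {
    "geographic_location": True,
    "habitat": True,
    "sequencing_method": True,
    "assembly_method": True,
    "temperature": False,
    "ph": False,
}


def missing_mandatory(report):
    """Return the sorted names of mandatory fields that are absent or empty."""
    return sorted(
        field
        for field, mandatory in CHECKLIST.items()
        if mandatory and not report.get(field)
    )


# Example report with two mandatory fields left out.
EXAMPLE_REPORT = {
    "geographic_location": "Kiel Fjord, Baltic Sea, Germany",
    "habitat": "Aquatic: Marine",
    "temperature": "12",
}

if __name__ == "__main__":
    print(missing_mandatory(EXAMPLE_REPORT))
```

A report would count as compliant in this sketch only when the returned list is empty; optional fields never appear in it.
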
Software Development for MIGS/MIMS

Mechanisms for achieving compliance are needed:
  • Such mechanisms involve
       – an appropriate reporting structure for capturing and exchanging data,
       – software,
       – databases
       – and controlled vocabularies and/or ontologies for defining the terms used in the annotations.

Supporting Projects:
  • Habitat-Lite (Ontology specification)
  • Genomic Rosetta Stone (Identifier Mapping)
  • GCDML (MIGS/MIMS specification in XML)
  • Genomes Catalogue (Database and Web Server)

  Field et al. Nat Biotechnol. 2008

Habitat-Lite (= EnvO-Lite)

Easy-to-use (small) set of terms
  • Captures high-level information about habitat
  • Derived from the Environment Ontology (EnvO).

Meet the needs of multiple users
  • Annotators, database providers, biologists, and bioinformaticians alike who need to search and employ such data in comparative analyses.

  Hirschman et al. OMICS. 2008

Habitat-Lite

Level 1                              Level 2
Aquatic                              soil
 Aquatic: Freshwater                 sediment
 Aquatic: Marine                     sludge
Terrestrial                          waste water
Air                                  hot spring
Fossil                               hydrothermal vent
Food                                 biofilm
Organism-Associated                  microbial mat
Extreme Habitat
Other

< 20 terms

  Hirschman et al. OMICS. 2008
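
Because the term set is this small, annotation software can treat it as a closed controlled vocabulary. The following Python sketch lists the terms from the table above and matches free-text labels against them. The function name and the matching rule (case-insensitive, whitespace-normalized) are assumptions for illustration, not part of Habitat-Lite itself.

```python
# Terms as listed on the slide: 10 first-level and 8 second-level terms.
LEVEL_1 = (
    "Aquatic", "Aquatic: Freshwater", "Aquatic: Marine", "Terrestrial", "Air",
    "Fossil", "Food", "Organism-Associated", "Extreme Habitat", "Other",
)
LEVEL_2 = (
    "soil", "sediment", "sludge", "waste water", "hot spring",
    "hydrothermal vent", "biofilm", "microbial mat",
)
HABITAT_LITE = LEVEL_1 + LEVEL_2

# Lookup from a normalized (lower-case) label to the term as spelled on the slide.
_LOOKUP = {term.lower(): term for term in HABITAT_LITE}


def match_term(label):
    """Return the matching Habitat-Lite term, or None if the label is not in the set.

    Matching ignores case and collapses runs of whitespace (an assumed rule).
    """
    key = " ".join(label.split()).lower()
    return _LOOKUP.get(key)


if __name__ == "__main__":
    for label in ("  aquatic:  marine ", "Hot Spring", "desert"):
        print(repr(label), "->", match_term(label))
```

Returning None for unknown labels, rather than guessing a nearby term, keeps the decision about unmapped habitats with the annotator.
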
Habitat-Lite applied

  http://www.megx.net/genomes

Genomic Rosetta Stone (GRS)

Create a unified mapping between different genomic resources
Improve navigation across these resources
Enable the integration of this information in the near future

  Van Brabant et al. OMICS. 2008
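
One way to picture a unified identifier mapping is a table with one row per genome and one column per resource. The Python sketch below is an illustration only: the shared genome key, the resource names and every identifier are made-up placeholders, and the slides do not describe how the GRS is actually implemented.

```python
from collections import defaultdict


def build_mapping(records):
    """Group (genome_key, resource, identifier) triples into one entry per genome."""
    mapping = defaultdict(dict)
    for genome_key, resource, identifier in records:
        mapping[genome_key][resource] = identifier
    return dict(mapping)


def translate(mapping, source, identifier, target):
    """Translate an identifier from one resource to another, or return None."""
    for ids in mapping.values():
        if ids.get(source) == identifier:
            return ids.get(target)
    return None


# Placeholder records; none of these identifiers refer to real database entries.
RECORDS = [
    ("genome-1", "resource_a", "A-001"),
    ("genome-1", "resource_b", "B-42"),
    ("genome-2", "resource_a", "A-002"),
]

if __name__ == "__main__":
    grs = build_mapping(RECORDS)
    print(translate(grs, "resource_a", "A-001", "resource_b"))
```

The linear scan in translate is fine for a sketch; a real service would more likely keep a reverse index per resource.
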
Genomic Contextual Data Markup Language (GCDML)

An Extensible Markup Language (XML)

Aim
  • Implement MIGS/MIMS
  • Provide even more descriptors
  • Facilitate exchange and integration of genomic data

  Kottmann et al. OMICS. 2008

GCDML Example (excerpt)



<gcdml:originalSample>
  <gcdml:physicalMaterial>
    <gcdml:samplingTime><gcdml:notGiven>unknown</gcdml:notGiven></gcdml:samplingTime>

    <gcdml:samplePointLocation>
      <gml:LocationKeyWord>Baltic Sea</gml:LocationKeyWord>
      <gml:LocationString>Kiel Fjord, Baltic Sea, Germany</gml:LocationString>
      <gcdml:pos2D>54.329 10.149</gcdml:pos2D>
      <gcdml:determinationMethod>derived from literature</gcdml:determinationMethod>
    </gcdml:samplePointLocation>

    <gcdml:marineHabitat>
      <gcdml:waterBody>
         <gcdml:depth>
           <gcdml:measure min="0.00" max="0.05"><gcdml:values uom="m">0.00 0.05</gcdml:values></gcdml:measure>
         </gcdml:depth>
      </gcdml:waterBody>
    </gcdml:marineHabitat>

    <gcdml:materialType>seawater</gcdml:materialType>
    <gcdml:amount><gcdml:measure><gcdml:values uom="ml">100</gcdml:values></gcdml:measure></gcdml:amount>
  </gcdml:physicalMaterial>
</gcdml:originalSample>
                                             Kottmann et al. OMICS. 2008
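
To illustrate how such a report could be consumed by software, here is a minimal Python sketch that reads position and depth from a shortened copy of the excerpt using the standard library. The excerpt declares no namespaces, so the gcdml namespace URI below is a placeholder and the gml URI is assumed to be the usual OGC one. Reading pos2D as "latitude longitude" is also an assumption, based on the Kiel Fjord coordinates.

```python
import xml.etree.ElementTree as ET

# Namespace URIs are assumptions; the slide excerpt does not show them.
NS = {
    "gcdml": "http://example.org/gcdml",   # placeholder, not the real GCDML URI
    "gml": "http://www.opengis.net/gml",   # assumed OGC GML namespace
}

SAMPLE = """\
<gcdml:originalSample xmlns:gcdml="http://example.org/gcdml"
                      xmlns:gml="http://www.opengis.net/gml">
  <gcdml:physicalMaterial>
    <gcdml:samplePointLocation>
      <gml:LocationString>Kiel Fjord, Baltic Sea, Germany</gml:LocationString>
      <gcdml:pos2D>54.329 10.149</gcdml:pos2D>
    </gcdml:samplePointLocation>
    <gcdml:marineHabitat>
      <gcdml:waterBody>
        <gcdml:depth>
          <gcdml:measure min="0.00" max="0.05"><gcdml:values uom="m">0.00 0.05</gcdml:values></gcdml:measure>
        </gcdml:depth>
      </gcdml:waterBody>
    </gcdml:marineHabitat>
    <gcdml:materialType>seawater</gcdml:materialType>
  </gcdml:physicalMaterial>
</gcdml:originalSample>
"""


def summarize_sample(xml_text):
    """Extract location, position, depth range and material type from a sample."""
    root = ET.fromstring(xml_text)
    # pos2D is read as "latitude longitude" (assumed order).
    lat, lon = (float(v) for v in root.findtext(".//gcdml:pos2D", namespaces=NS).split())
    measure = root.find(".//gcdml:depth/gcdml:measure", NS)
    values = measure.find("gcdml:values", NS)
    return {
        "location": root.findtext(".//gml:LocationString", namespaces=NS),
        "latitude": lat,
        "longitude": lon,
        "depth_range": (float(measure.get("min")), float(measure.get("max"))),
        "depth_unit": values.get("uom"),
        "material": root.findtext(".//gcdml:materialType", namespaces=NS),
    }


if __name__ == "__main__":
    print(summarize_sample(SAMPLE))
```
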
Genome Catalogue

Online system for capturing MIGS/MIMS-compliant reports

  Field et al. Nature 2008

Genome Catalogue

Requirements
  • A rich, user-friendly toolkit
  • Designed to give credit to all contributors
  • XML-based (GCDML)
       – Able to maintain all versions of GCDML schemas
  • Web services-based
       – Supporting the automated exchange of content
  • Serve as the international GCAT identifier authority
  • Comprehensive
       – Containing reports for all taxa and metagenomes
  • Ontology-supportive
  • Shared by the GSC

Current Status

We have specifications:
  • MIGS/MIMS
  • Habitat-Lite
  • Genomic Rosetta Stone
Work on supporting software is ongoing:
  • Genomes Catalogue is in prototype status
  • Funding
       – This is a long-term endeavour that cannot be done on a voluntary basis

Discussion

Software is needed for:
  • Creation of MIGS/MIMS data
  • Storage
  • Analysis
Expand standardization efforts to
  • Software specification/development
  • Work on a standardized genomic data management architecture / cyberinfrastructure
Data-intensive science is successful if it works towards one community with one vision
  • World Wide Genomics project

Acknowledgements

All Members of GSC incl.
       Dawn Field
       Peter Sterk
       Saul Kravitz
       Tanya Gray

Megx.net team
       Frank Oliver Glöckner
       Ivaylo Kostadinov
       Melissa Beth Duhaime
       Pier Luigi Buttigieg
       Wolfgang Hankeln
       Pelin Yilmaz


END



Looking forward to the discussion

          Join the GSC
         http://gensc.org



Software Development by the Genomics Standards Consortium

  • 1. Bringing Standards to Life: Software Development by the Genomics Standards Consortium. Renzo Kottmann, Microbial Genomics Group, Max Planck Institute for Marine Microbiology. M3 SIG Stockholm July 2009
  • 2. Genomic Standards Consortium (GSC) Goal: • Promote mechanisms that standardize the description of genomes, and that exchange and integrate genomic data. Open-membership, international working body: • Established in Sept 2005 • Participants include DDBJ, EMBL, GenBank, Sanger, JCVI, JGI, EBI and a range of US, UK and EU research institutions • Organized a series of workshops. http://gensc.org and http://gensc.org/gc_wiki/index.php/GSC_Membership
  • 3. Minimum Information about a Genome Sequence (MIGS) Specification. MIGS extends what DDBJ/EMBL/GenBank request upon submission of a genome sequence • Examples: description of the geographic location of a sample and its habitat; “Minimum Information about a Metagenomic Sequence” (MIMS): temperature, pH; description of sequence generation: sequencing method, assembly method. Field et al. Nat Biotechnol. 2008
  • 4. MIGS Checklist 2.0. Field et al. Nat Biotechnol. 2008
  • 5. MIGS Checklist 2.0. M = mandatory. Field et al. Nat Biotechnol. 2008
  • 6. Software Development for MIGS/MIMS. Mechanisms for achieving compliance are needed: • Such mechanisms involve an appropriate reporting structure for capturing and exchanging data, software, databases, and controlled vocabularies and/or ontologies for defining the terms used in the annotations. Field et al. Nat Biotechnol. 2008
  • 7. Software Development for MIGS/MIMS. Mechanisms for achieving compliance are needed: • Such mechanisms involve an appropriate reporting structure for capturing and exchanging data, software, databases, and controlled vocabularies and/or ontologies for defining the terms used in the annotations. Supporting Projects: • Habitat-Lite (Ontology specification). Field et al. Nat Biotechnol. 2008
  • 8. Software Development for MIGS/MIMS. Mechanisms for achieving compliance are needed: • Such mechanisms involve an appropriate reporting structure for capturing and exchanging data, software, databases, and controlled vocabularies and/or ontologies for defining the terms used in the annotations. Supporting Projects: • Habitat-Lite (Ontology specification) • Genomic Rosetta Stone (Identifier Mapping). Field et al. Nat Biotechnol. 2008
  • 9. Software Development for MIGS/MIMS. Mechanisms for achieving compliance are needed: • Such mechanisms involve an appropriate reporting structure for capturing and exchanging data, software, databases, and controlled vocabularies and/or ontologies for defining the terms used in the annotations. Supporting Projects: • Habitat-Lite (Ontology specification) • Genomic Rosetta Stone (Identifier Mapping) • GCDML (MIGS/MIMS specification in XML). Field et al. Nat Biotechnol. 2008
  • 10. Software Development for MIGS/MIMS. Mechanisms for achieving compliance are needed: • Such mechanisms involve an appropriate reporting structure for capturing and exchanging data, software, databases, and controlled vocabularies and/or ontologies for defining the terms used in the annotations. Supporting Projects: • Habitat-Lite (Ontology specification) • Genomic Rosetta Stone (Identifier Mapping) • GCDML (MIGS/MIMS specification in XML) • Genomes Catalogue (Database and Web Server). Field et al. Nat Biotechnol. 2008
  • 11. Habitat-Lite (= EnvO-Lite). Easy-to-use (small) set of terms • Captures high-level information about habitat • Derived from the Environment Ontology (EnvO). Meets the needs of multiple users • Annotators, database providers, biologists, and bioinformaticians alike who need to search and employ such data in comparative analyses. Terms shown: Aquatic, Aquatic: Freshwater, Aquatic: Marine, Terrestrial, Air, Fossil, Food, Organism-Associated, Extreme Habitat, Other. Hirschman et al. OMICS. 2008
  • 12. Habitat-Lite. Level 1 terms: Aquatic, Aquatic: Freshwater, Aquatic: Marine, Terrestrial, Air, Fossil, Food, Organism-Associated, Extreme Habitat, Other. Level 2 terms: soil, sediment, sludge, waste water, hot spring, hydrothermal vent, biofilm, microbial mat. Fewer than 20 terms in total. Hirschman et al. OMICS. 2008
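Because Habitat-Lite is a small controlled vocabulary, annotation software can check free-text habitat entries against it. The following is a minimal Python sketch of such a check, not GSC code: the term lists are copied from the slide (the level 1 / level 2 split is reconstructed from its two columns), while the function name `normalize_habitat` and the case- and whitespace-insensitive matching rule are assumptions made for illustration.

```python
from typing import Optional

# Habitat-Lite terms as listed on the slide (Hirschman et al. OMICS. 2008).
LEVEL_1 = (
    "Aquatic", "Aquatic: Freshwater", "Aquatic: Marine", "Terrestrial",
    "Air", "Fossil", "Food", "Organism-Associated", "Extreme Habitat", "Other",
)
LEVEL_2 = (
    "soil", "sediment", "sludge", "waste water", "hot spring",
    "hydrothermal vent", "biofilm", "microbial mat",
)

# Lookup table: lower-cased term -> canonical spelling from the slide.
_CANONICAL = {term.casefold(): term for term in LEVEL_1 + LEVEL_2}


def normalize_habitat(term: str) -> Optional[str]:
    """Return the canonical Habitat-Lite term, or None if the term is unknown.

    Matching ignores case and collapses repeated whitespace; this rule is an
    illustrative assumption, not part of the Habitat-Lite specification.
    """
    key = " ".join(term.split()).casefold()
    return _CANONICAL.get(key)


if __name__ == "__main__":
    for raw in ("  aquatic:  marine ", "Biofilm", "swamp"):
        print(repr(raw), "->", normalize_habitat(raw))
```

Returning `None` for unknown input leaves the decision to the caller, who can reject the entry or map it to the vocabulary's own catch-all term "Other".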
  • 13. Habitat-Lite applied. http://www.megx.net/genomes
  • 14. Genomic Rosetta Stone (GRS). Create a unified mapping between different genomic resources. Improve navigation across these resources. Enable the integration of this information in the near future. Van Brabant et al. OMICS. 2008
  • 15. Genomic Rosetta Stone (GRS). Van Brabant et al. OMICS. 2008
  • 16. Genomic Rosetta Stone (GRS). Enable the integration of this information in the near future. Van Brabant et al. OMICS. 2008
  • 17. Genomic Contextual Data Markup Language (GCDML). An Extensible Markup Language (XML). Aim: • Implement MIGS/MIMS • Provide even more descriptors • Facilitate exchange and integration of genomic data. Kottmann et al. OMICS. 2008
  • 18. GCDML Example (excerpt) <gcdml:originalSample> <gcdml:physicalMaterial> <gcdml:samplingTime><gcdml:notGiven>unknown</gcdml:notGiven></gcdml:samplingTime> <gcdml:samplePointLocation> <gml:LocationKeyWord>Baltic Sea</gml:LocationKeyWord> <gml:LocationString>Kiel Fjord, Baltic Sea, Germany</gml:LocationString> <gcdml:pos2D>54.329 10.149</gcdml:pos2D> <gcdml:determinationMethod>derived from literature</gcdml:determinationMethod> </gcdml:samplePointLocation> <gcdml:marineHabitat> <gcdml:waterBody> <gcdml:depth> <gcdml:measure min="0.00" max="0.05"><gcdml:values uom="m">0.00 0.05</gcdml:values></gcdml:measure> </gcdml:depth> </gcdml:waterBody> </gcdml:marineHabitat> <gcdml:materialType>seawater</gcdml:materialType> <gcdml:amount><gcdml:measure><gcdml:values uom="ml">100</gcdml:values></gcdml:measure></gcdml:amount> </gcdml:physicalMaterial> </gcdml:originalSample> Kottmann et al. OMICS. 2008
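To illustrate how software might read such a report, here is a minimal Python sketch that parses the GCDML excerpt with the standard library and extracts the location, position, and depth range. The excerpt does not declare its namespaces, so the two namespace URIs below are hypothetical placeholders rather than the real GCDML or GML namespaces. Reading `pos2D` as latitude followed by longitude is also an assumption (it is consistent with Kiel's location), as is the helper name `text_of`.

```python
import xml.etree.ElementTree as ET

# Hypothetical placeholder namespace URIs; the slide excerpt does not show
# the real xmlns declarations for the gcdml and gml prefixes.
NS = {"gcdml": "urn:example:gcdml", "gml": "urn:example:gml"}

# The excerpt from the slide, with namespace declarations added on the root.
EXCERPT = f"""
<gcdml:originalSample xmlns:gcdml="{NS['gcdml']}" xmlns:gml="{NS['gml']}">
  <gcdml:physicalMaterial>
    <gcdml:samplingTime><gcdml:notGiven>unknown</gcdml:notGiven></gcdml:samplingTime>
    <gcdml:samplePointLocation>
      <gml:LocationKeyWord>Baltic Sea</gml:LocationKeyWord>
      <gml:LocationString>Kiel Fjord, Baltic Sea, Germany</gml:LocationString>
      <gcdml:pos2D>54.329 10.149</gcdml:pos2D>
      <gcdml:determinationMethod>derived from literature</gcdml:determinationMethod>
    </gcdml:samplePointLocation>
    <gcdml:marineHabitat>
      <gcdml:waterBody>
        <gcdml:depth>
          <gcdml:measure min="0.00" max="0.05"><gcdml:values uom="m">0.00 0.05</gcdml:values></gcdml:measure>
        </gcdml:depth>
      </gcdml:waterBody>
    </gcdml:marineHabitat>
    <gcdml:materialType>seawater</gcdml:materialType>
    <gcdml:amount><gcdml:measure><gcdml:values uom="ml">100</gcdml:values></gcdml:measure></gcdml:amount>
  </gcdml:physicalMaterial>
</gcdml:originalSample>
"""

root = ET.fromstring(EXCERPT.strip())


def text_of(path):
    """Return the text of the first element matching path, or None."""
    node = root.find(path, NS)
    return node.text if node is not None else None


location = text_of(".//gml:LocationString")
# Assumption: pos2D holds "latitude longitude" in decimal degrees.
lat, lon = (float(v) for v in text_of(".//gcdml:pos2D").split())

# Depth range and its unit of measure (the uom attribute).
depth_values = root.find(".//gcdml:depth/gcdml:measure/gcdml:values", NS)
depth_min, depth_max = (float(v) for v in depth_values.text.split())
depth_unit = depth_values.get("uom")

if __name__ == "__main__":
    print(location, lat, lon, depth_min, depth_max, depth_unit)
```

A real consumer would validate the document against the GCDML schema first; this sketch only shows that the contextual data are machine-readable once the markup is well-formed.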
  • 21. Genome Catalogue Online system for capturing MIGS/MIMS compliant reports Field et al. Nature 2008 21 21
  • 22. Genome Catalogue Requirements • A Rich toolkit/user-friendly • Designed to give credit to all contributors • XML-based (GCDML)  Able to maintain all versions of GCDML schemas • Web services-based  Supporting the automated exchange of content • Serve as the international GCAT identifier authority • Comprehensive  Containing reports for all taxa and metagenomes • Ontology-supportive • Shared by the GSC 22 22
  • 23. Current Status We have specifications: • MIGS/MIMS • Habitat-Lite • Genomic Rosetta Stone Work on supporting software is ongoing: • Genomes Catalogue is in prototype status • Funding  This is a long-term endeavour that can not be done on a voluntary basis 23 23
  • 24. Disscusion Need of software for: • Creation of MIGS/MIMS data • Storage • Analysis Expand standardization efforts to • Software specification/development • Work on a standardized genomic data management architecture / cyberinfrastructure Data intensive science is successful if it works towards one community with one vision • World Wide Genomics project 24 24
  • 25. Acknowledgements All Members of GSC incl.  Dawn Field  Peter Sterk  Saul Kravitz  Tanya Gray Megx.net team  Frank Oliver Glöckner  Ivaylo Kostadinov  Melissa Beth Duhaime  Pier Luigi Buttigieg  Wolfgang Hankeln  Pelin Yilmaz 25
  • 26. END Looking forward to the discussion Join the GSC http://gensc.org 26 26