Bioschemas aims to improve data interoperability in the life sciences. It does this by encouraging people in the life sciences to use schema.org markup, so that their websites and services contain consistently structured information. This structured information then makes it easier to discover, collate and analyse distributed data.
This presentation gives an overview of the project and the ELIXIR-funded Implementation Study running through 2017.
5. <div>
<h1>Classic potato salad</h1>
<div>
Nutrition facts:
<span>144 kcal</span>,
</div>
Ingredients:
- <span>800g small new potato</span>
Structured data markup for web pages
Recipe
Nutrition
Calories
Ingredients
Title
Without markup
6. <div itemscope itemtype="http://schema.org/Recipe">
<h1 itemprop="name">Classic potato salad</h1>
<div itemprop="nutrition" itemscope
itemtype="http://schema.org/NutritionInformation">
Nutrition facts:
<span itemprop="calories">144 kcal</span>,
</div>
Ingredients:
- <span itemprop="recipeIngredient">800g small new potato</span>
- <span itemprop="recipeIngredient">3 shallot</span>
Structured data markup for web pages
RDFa
JSON-LD
Microdata
With markup
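The slide lists JSON-LD alongside Microdata and RDFa as a way to carry the same markup. As a minimal sketch, the recipe above could be expressed as JSON-LD and embedded in a page roughly as follows. The values are copied from the slide's example; the helper name `to_script_tag` is an assumption made for illustration and is not part of schema.org or Bioschemas.

```python
import json

# JSON-LD equivalent of the microdata example on the slide.
# Property names (name, nutrition, calories, recipeIngredient) are the
# same schema.org terms used in the itemprop attributes above.
recipe = {
    "@context": "http://schema.org",
    "@type": "Recipe",
    "name": "Classic potato salad",
    "nutrition": {
        "@type": "NutritionInformation",
        "calories": "144 kcal",
    },
    "recipeIngredient": ["800g small new potato", "3 shallot"],
}


def to_script_tag(data: dict) -> str:
    """Wrap a JSON-LD document in the script element used to embed it in HTML.

    Hypothetical helper for illustration only.
    """
    body = json.dumps(data, indent=2)
    return f'<script type="application/ld+json">\n{body}\n</script>'


print(to_script_tag(recipe))
```

Unlike Microdata, the JSON-LD block sits apart from the visible HTML, so the page layout does not need to change to carry the markup.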
9. Bioschemas
• Schema.org for biology
• Minimum properties for
–Finding data
–Presenting search results
Specification on top of schema.org
Layer of constraints + documentation + extensions
Specification
Data model
Minimum information
Controlled vocabularies
Cardinality
Documentation
Examples
New (properties | types)
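To illustrate what a layer of minimum information and cardinality constraints on top of schema.org might look like in practice, here is a minimal sketch of a checker. The profile below (`EXAMPLE_PROFILE`) and the function names are hypothetical and chosen only for illustration; they are not taken from any published Bioschemas specification.

```python
# Hypothetical profile: property name -> (minimum count, maximum count or None).
# These constraints are illustrative only, NOT an actual Bioschemas profile.
EXAMPLE_PROFILE = {
    "name": (1, 1),
    "description": (1, 1),
    "url": (1, 1),
    "keywords": (0, None),  # optional, any number of values
}


def count_values(value) -> int:
    """Count values for a property: missing is 0, a list is its length, else 1."""
    if value is None:
        return 0
    if isinstance(value, list):
        return len(value)
    return 1


def check_profile(entry: dict, profile: dict) -> list:
    """Return a list of human-readable cardinality violations."""
    problems = []
    for prop, (low, high) in profile.items():
        n = count_values(entry.get(prop))
        if n < low:
            problems.append(f"{prop}: expected at least {low}, found {n}")
        elif high is not None and n > high:
            problems.append(f"{prop}: expected at most {high}, found {n}")
    return problems


# Made-up entry that omits the "description" property.
entry = {
    "@context": "http://schema.org",
    "@type": "Dataset",
    "name": "Example dataset",
    "url": "http://example.org/dataset/1",
    "keywords": ["protein", "annotation"],
}

for problem in check_profile(entry, EXAMPLE_PROFILE):
    print(problem)
```

Controlled vocabularies could be handled in the same style by adding a set of allowed values per property, under the same caveat that the actual constraints would come from the specification itself.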
10. (Some) Life Sciences
Metadata Specifications
5 April 2017 @gray_alasdair www.macs.hw.ac.uk/~ajg33 10
Depth
Reach
model
HCLS DataDesc
15. ELIXIR/Bioschemas activities
planned for 2017
• Specifications and demonstrators
–Data repository, Dataset, Sample, Phenotype, Beacons and
Protein annotations
• Discovery and validation tools
• Support and community engagement
–Meetings, Hackathons, Knowledge dissemination, Training in
adoption
Better exposure of metadata
to search engines and registries
Better search
16. Implementation Study Objectives
Life Sciences
Content
Types
schema.org content types for life science Data
• Data repository, Dataset
• Samples, Protein annotations
• Phenotype annotations
Discovery
and
Validation
Discovery and validation of Bioschemas entries
Community
Support and
Promotion
• Support community and adoption
• Alignment of technical activities
• Working group within ELIXIR
• Collaboration between ELIXIR and BD2K
• Test benefits and issues
Hands-on workshops
Small delivery team
• Publication of metadata
• Automated integration of metadata in
specialised registries
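As a rough sketch of what a discovery tool for marked-up entries could do, the following collects JSON-LD blocks from a page using only the Python standard library. The class and function names and the sample page are assumptions made for illustration; this is not the tooling produced by the Implementation Study, and it does not handle Microdata or RDFa.

```python
import json
from html.parser import HTMLParser


class JsonLdExtractor(HTMLParser):
    """Collect the parsed contents of <script type="application/ld+json"> elements."""

    def __init__(self):
        super().__init__()
        self.entries = []
        self._in_jsonld = False
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self._in_jsonld = False
            try:
                self.entries.append(json.loads("".join(self._buffer)))
            except json.JSONDecodeError:
                pass  # skip blocks that are not valid JSON


def extract_jsonld(html: str) -> list:
    """Return every JSON-LD document embedded in the given HTML string."""
    parser = JsonLdExtractor()
    parser.feed(html)
    parser.close()
    return parser.entries


# Made-up page used only to exercise the extractor.
SAMPLE_PAGE = """
<html><head>
<script type="application/ld+json">
{"@context": "http://schema.org", "@type": "Dataset", "name": "Example dataset"}
</script>
</head><body><h1>Example dataset</h1></body></html>
"""

for entry in extract_jsonld(SAMPLE_PAGE):
    print(entry.get("@type"), "-", entry.get("name"))
```

A registry could run an extractor of this kind over known pages and pass each entry to a validation step before integrating it.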
17. Bioschemas Community
Many stakeholders and work streams. Lots of enthusiasm.
•Good communication and coordination
• Among partners
• With Bioschemas community
• With schema.org
•Two major activities
• The Project
• The Community
Diagram labels: ELIXIR Implementation Study, EOSCPilot, Bioschemas Project, Schema.org, Bioschemas.org
19. Hands-on workshops
With online work in between
• Synchronise subprojects and groups with workshops
– Timetable
– Templates for working and comms
– Cross walks
• Subprojects and groups work through collaborative
documents
• Bioschemas Web site – http://bioschemas.org
– Google Docs
– GitHub
– Revise and consolidate groups
• W3C Community Group
– Mailing list
Workshop timeline:
1) Planning: March-April 2017
2) Agreement: May-June 2017
3) Adoption: July-Oct 2017
4) Application: Nov-Feb 2018
21. • Low barrier & attention span
• Examples not manuals
• Example + Spec not Spec then example
• Schema.org + Just Enough NOT Just In Case
• Find Use Case.
• Bioschemas is supported by but independent
of sponsors (ELIXIR, BD2K etc.) and will apply
beyond those sponsors
• Mark up datasets for Google
Finding driven and Example driven