1. SEEK for Science
Alessandro Borsoi
11.09.2014, EDUCAFE, METID – Politecnico di Milano
2. what it is (1)
SEEK is a storage platform designed to
facilitate heterogeneous data and model
storage and sharing, across multi-group
scientific projects.
3. what it is (2)
SEEK is an open-source, web-based platform
and suite of software tools for the
management, linking, exploration and
exchange of Systems Biology data, models
and Standard Operating Procedures (SOPs).
SEEK is designed to facilitate data sharing
and collaborations between scientists.
4. who
Developed by:
- a team at the University of Manchester in the
United Kingdom
- the Heidelberg Institute for Theoretical Studies in
Germany
Funded by:
- the BBSRC in the UK
- the BMBF in Germany
as part of the SysMO-DB project
5. story
SEEK was conceived as part of SysMO, a pan-European
initiative to record and describe dynamic molecular
processes in unicellular organisms: from laboratory to
mathematical model.
SEEK grew organically with the projects' needs, informed by
a core user-focus group known as the SysMO PALs.
SEEK is now the central hub for the SysMO community to
store and share a wide variety of data, from collection to
publication, for both laboratory and computational
experiments.
6. data (1)
SEEK ‘data’ types:
- data generated by high-throughput experiments.
- data arising from low-throughput, cumulative experiments, in the form of:
  - raw data, i.e. single pieces of data belonging to a larger data series,
    non-replicated data, non-quantified data.
  - experimental results, i.e. reliable, quantified and repeated data series,
    including high-throughput data.
  - calculated data, i.e. involving further analysis of raw data.
  - image data.
- data arising from biological modelling.
- models generated by a systems biology approach.
- parameterisations of models.
- validation data for models.
- metadata, i.e. data providing information about one or more pieces of data.
- processes used to design the experiments, generate the data, and generate the models,
i.e. standard operating procedures (SOPs), spreadsheets, workflows.
7. data (2)
Data Catalogue
The data catalogue in SEEK includes raw Datasets, Standard Operating Procedures
(SOPs), Models, Publications and Presentations. All data are grouped by projects, and
associated with the researchers who produced them. To encourage data sharing, researchers
are free to choose the formats in which they upload and share their data, so the data
formats in the SEEK catalogue can vary. We do offer a set of “best practice”
guidelines for researchers who want to make their data available and usable to the widest
possible audience.
Most common formats can be viewed within the browser, without a download, with additional
enhanced features for spreadsheets and SBML models.
As a dynamic service, SEEK aims to expand functionality provided for data types and
formats as the needs arise. Where SEEK does not appear to support a data type or format,
a request can be placed to extend SEEK for this data.
All data and information added to SEEK is searchable using keywords.
8. data (3)
Organise and store your data
SEEK has adopted an ISA-TAB-style structure for
organising experiments and data.
9. data (4)
ISA and Interlinking
Data in SEEK gain increased value and usability when they are described within the
context of an experimental process. Multiple experiments will be carried out as part of a
single Study, and that study may be part of a wider overall funded Investigation. In SEEK
we adopt the ISA-TAB structure (Investigation, Study, Assay), which is a community
standard for describing links between Omics experiments. We believe that many aspects of
the ISA framework are equally appropriate for describing experiments beyond Omics and
Biology, so we allow this framework to be applied to all data.
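The Investigation, Study, Assay nesting can be pictured as a small tree. The following is only an illustrative sketch in Python, not SEEK's actual data model: the class names follow the ISA terms above, while the fields (`title`, `data_files`) and the example titles and file names are assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Assay:
    """A single experiment, with the data files it produced."""
    title: str
    data_files: List[str] = field(default_factory=list)


@dataclass
class Study:
    """A group of related assays."""
    title: str
    assays: List[Assay] = field(default_factory=list)


@dataclass
class Investigation:
    """The wider funded piece of work that contains the studies."""
    title: str
    studies: List[Study] = field(default_factory=list)

    def all_data_files(self) -> List[str]:
        """Collect every data file reachable from this investigation."""
        return [
            name
            for study in self.studies
            for assay in study.assays
            for name in assay.data_files
        ]


# Hypothetical example content, not taken from a real SEEK project.
investigation = Investigation(
    title="Example investigation",
    studies=[
        Study(
            title="Example study",
            assays=[
                Assay("Assay A", ["a_raw.xls"]),
                Assay("Assay B", ["b_raw.xls", "b_results.xls"]),
            ],
        )
    ],
)
print(investigation.all_data_files())
```

Because every data file hangs off an assay, which hangs off a study, the experimental context of a file can always be recovered by walking up the tree.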
Beyond the ISA framework, SEEK allows data to be interlinked within the site itself in order
to describe their relationship.
If research resulted in a publication, this can also be registered with SEEK (including
attribution to the relevant people) using a PubMed identifier or DOI, and linked to the
assets involved in that research. This gives other researchers access to use, examine, or
validate data that would otherwise be unavailable through the publication alone.
10. data (5)
Explore and annotate data
Excel spreadsheets can be explored and annotated without the need to
download.
11. data (6)
Semantic spreadsheet templates
Using RightField we have produced a wide collection of template files.
12. data (7)
Versioning
All data is stored using versioning, selectable privacy, and static URLs. Versioning and
privacy settings ensure that you can share your most recent data with whom you choose.
Static URLs ensure that you can be credited directly for all shared work.
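One way to picture how versioning and a static URL fit together is an asset whose address never changes while its contents accumulate as numbered versions. This is a minimal sketch under that assumption, not SEEK's implementation; the `VersionedAsset` class, its methods, and the example URL are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class VersionedAsset:
    url: str  # stays fixed for the lifetime of the asset
    versions: List[bytes] = field(default_factory=list)

    def add_version(self, content: bytes) -> int:
        """Store new content and return its 1-based version number."""
        self.versions.append(content)
        return len(self.versions)

    def get(self, version: Optional[int] = None) -> bytes:
        """Return the requested version, or the latest if none is given."""
        if not self.versions:
            raise LookupError("asset has no versions yet")
        if version is None:
            return self.versions[-1]
        if not 1 <= version <= len(self.versions):
            raise LookupError(f"no version {version}")
        return self.versions[version - 1]


# Hypothetical URL, used only for illustration.
asset = VersionedAsset(url="https://seek.example.org/data_files/1")
asset.add_version(b"first upload")
asset.add_version(b"corrected upload")
print(asset.url, asset.get(), asset.get(1))
```

Older versions remain retrievable by number, so a citation of the URL stays valid after the data is updated.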
13. data (8)
Flexible sharing controls
There is a lot of flexibility and control over who can see,
download or edit your items.
14. data (9)
Access Control
Data will go through a research lifecycle between collection and publication. In a
competitive academic environment it is important that the data can be shared with
collaborators, and then the wider community, at appropriate points within the lifecycle.
SEEK allows users to keep their uploaded data entirely private, to share it with
individuals, then across entire projects, and eventually to make it public upon publication.
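The progression from private, to named individuals, to a whole project, to public can be sketched as an ordered set of sharing levels with a single visibility check. This is an illustration of the idea only, not SEEK's permission code; the `Sharing` levels, the `Item` and `User` classes, and `can_view` are all hypothetical names.

```python
from dataclasses import dataclass, field
from enum import IntEnum
from typing import Set


class Sharing(IntEnum):
    """Sharing levels, ordered from most to least restrictive."""
    PRIVATE = 0
    INDIVIDUALS = 1
    PROJECT = 2
    PUBLIC = 3


@dataclass
class Item:
    owner: str
    project: str
    sharing: Sharing = Sharing.PRIVATE
    shared_with: Set[str] = field(default_factory=set)


@dataclass
class User:
    name: str
    projects: Set[str] = field(default_factory=set)


def can_view(user: User, item: Item) -> bool:
    """Decide whether a user may see an item at its current sharing level."""
    if user.name == item.owner or item.sharing == Sharing.PUBLIC:
        return True
    # Named individuals keep access once the item is opened to the project.
    if item.sharing >= Sharing.INDIVIDUALS and user.name in item.shared_with:
        return True
    return item.sharing >= Sharing.PROJECT and item.project in user.projects
```

Raising `item.sharing` one level at a time mirrors the lifecycle described above: each step only widens the audience, and the owner can always see the item.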
15. SBML models (1)
Simulate SBML models
Most models that conform to the SBML format can be simulated within
SEEK.
16. SBML models (2)
Model simulation and annotation
If models follow the SBML standard, they can be simulated, or annotated and re-added as a
new version, all within SEEK.
The JWS Online model simulator presents a schematic diagram of the model, and allows
parameters and reactions to be modified for the simulation.
Models can also be edited using JWS Online OneStop, semantically annotated with
MIRIAM annotations, and then saved back to SEEK as a new version.
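To give a feel for what "parameters" means here, the sketch below lists the parameters declared in a small SBML file using only the Python standard library. It does not use or represent JWS Online or SEEK; the toy model, the `read_parameters` helper and the `local_name` helper are assumptions made for illustration, and a real tool would normally rely on a dedicated SBML library.

```python
import xml.etree.ElementTree as ET

# A toy SBML Level 2 Version 4 document, invented for this example.
SBML_EXAMPLE = """<sbml xmlns="http://www.sbml.org/sbml/level2/version4" level="2" version="4">
  <model id="toy_model">
    <listOfParameters>
      <parameter id="k1" value="0.5"/>
      <parameter id="k2" value="2.0"/>
    </listOfParameters>
  </model>
</sbml>
"""


def local_name(tag: str) -> str:
    """Strip the '{namespace}' prefix that ElementTree puts on tag names."""
    return tag.rsplit("}", 1)[-1]


def read_parameters(sbml_text: str) -> dict:
    """Return {parameter id: value} for parameters that carry a value.

    Simplification: every <parameter> element is collected, so parameters
    local to a reaction would be merged with the global ones.
    """
    root = ET.fromstring(sbml_text)
    if local_name(root.tag) != "sbml":
        raise ValueError("not an SBML document")
    params = {}
    for elem in root.iter():
        if local_name(elem.tag) == "parameter" and "value" in elem.attrib:
            params[elem.get("id")] = float(elem.get("value"))
    return params


print(read_parameters(SBML_EXAMPLE))
```

Values like these are what a simulator lets the user change before running the model.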
17. people (1)
Who's doing what, where?
You can find out what people are using and have expertise in,
and how to get in contact with them.
18. people (2)
People index
SEEK contains an index of people where users can browse, or keyword-search, profiles of
the projects, groups and people that contribute to the data on the site. People can describe
their areas of expertise, which allows other users to quickly identify the right people to
approach regarding specialist enquiries and collaboration proposals.
19. people (3)
PALs
SEEK has a varied network of scientists, known as SysMO
PALs, who represent a wide but typical user base. Through
regular meetings with these PALs we have developed, and continue
to develop, a platform that is tailored in functionality and
usability to you, the scientist.