Open Data: Sharing the Main Actor of a Scientific Story - Paola Masuzzo
1. CC BY-SA 4.0
OPEN DATA
sharing the main actor of a scientific story
pcmasuzzo
24 October 2016 paola masuzzo
2. CC BY-SA 4.0
What exactly is open data?
Why should you make your data open?
How can you make your data open?
My open data story
5. CC BY-SA 4.0
Open data implies freedom to access, use
and re-use for any purpose
http://opendefinition.org/od/; http://opendefinition.org/licenses/
There are many open knowledge definition conformant licenses
CC0 waiver
https://creativecommons.org/publicdomain/zero/1.0/
CC BY (Attribution only)
https://creativecommons.org/licenses/by/4.0/
CC BY-SA (Attribution ShareAlike)
https://creativecommons.org/licenses/by-sa/4.0/
7. CC BY-SA 4.0
What exactly is open data?
Why should you make your data open?
How can you make your data open?
My open data story
9. CC BY-SA 4.0
Research data need to be treated as
first-class citizens in science
Vines et al., Current Biology, 2014; image courtesy Auke Herrema
Data should themselves be considered the primary output of research
10. CC BY-SA 4.0
One could just argue that data produced
with public funds belong to the public
Image courtesy Auke Herrema
11. CC BY-SA 4.0
But there are so many more great reasons
for data to be open
[Diagram: reasons arranged by motivation (science-driven, society-driven) and by who benefits (data users, data producers)]
• develop new analysis methods
• improve research practices
• guarantee data preservation
• reduce cost of science
• engage with citizens
• increase visibility and collaborations
• enhance reproducibility
• ask new questions
• advance science
12. CC BY-SA 4.0
Open data means more hands at work, more
brain power and faster innovations
Gina Kolata, The New York Times, 2010; SCIENCEMAG 2016 - Williamson et al., 2016
13. CC BY-SA 4.0
Open data creates a culture of transparency
and potentially discourages fraud
Wicherts et al., PLoS ONE, 2011
“Willingness to share research data is related to the strength of
the evidence and the quality of reporting of statistical results”
14. CC BY-SA 4.0
Open data means more reproducibility and
better research practices
Monya Baker, Nature, 2016; image courtesy Auke Herrema
15. CC BY-SA 4.0
Open data also means visibility and a higher
chance of being cited
Piwowar et al., PeerJ, 2013
citation advantage
16. CC BY-SA 4.0
What exactly is open data?
Why should you make your data open?
How can you make your data open?
My open data story
17. CC BY-SA 4.0
The Panton Principles are a pretty good
starting point
1. When publishing data, make an explicit and robust statement of your
wishes.
2. Use a recognized copyright waiver or license that is appropriate for data.
3. If you want your data to be effectively used and added to by others, it
should be open as defined by the Open Knowledge/Data Definition—in
particular, non-commercial and other restrictive clauses should not be used.
4. Explicit dedication of data underlying published science into the public
domain via PDDL (http://opendatacommons.org/licenses/pddl/1-0/) or
CCZero (http://creativecommons.org/publicdomain/zero/1.0/) is strongly
recommended and ensures compliance with both the Science Commons
Protocol for Implementing Open Access Data and the Open Knowledge/Data
Definition.
http://pantonprinciples.org
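A minimal sketch of the first two principles (an explicit statement, using a recognized waiver or license), assuming the statement is kept in a small metadata record stored next to the data. The function name, the field names and the example title and creator are hypothetical and do not follow any particular repository's schema; only the CC0 URL comes from the slides.

```python
import json


def build_dataset_record(title, creators, license_id, license_url):
    """Return a metadata record that states the data license explicitly.

    The field names are illustrative only (an assumption), not a standard.
    """
    if not license_id or not license_url:
        # Principle 1: the statement of wishes must be explicit.
        raise ValueError("state the license or waiver explicitly")
    return {
        "title": title,
        "creators": list(creators),
        "license": {"id": license_id, "url": license_url},
    }


record = build_dataset_record(
    title="Example cell migration data set",  # hypothetical title
    creators=["A. Researcher"],               # hypothetical creator
    license_id="CC0-1.0",
    license_url="https://creativecommons.org/publicdomain/zero/1.0/",
)
print(json.dumps(record, indent=2))
```

Refusing to build a record without a license is a design choice for this sketch: it makes the missing statement visible at deposit time rather than leaving it for data users to discover.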
18. CC BY-SA 4.0
A lot of repositories are available to upload
research materials and data
20. CC BY-SA 4.0
You certainly don’t need to know more than
1,500 repositories by heart
https://biosharing.org/databases/
22. CC BY-SA 4.0
Making data available is only one half of the
open data equation
intelligent access to the data and interoperability are crucial
Wilkinson et al., 2016, Scientific Data; https://www.force11.org/group/fairgroup
23. CC BY-SA 4.0
What exactly is open data?
Why should you make your data open?
How can you make your data open?
My open data story
25. CC BY-SA 4.0
Cell migration experiments are complex and
produce diverse and rich data sets
Servier Medical Art, CC-BY 3.0; Cell Image Library, CC-BY 3.0
sample preparation:
• paper laboratory notebooks
• electronic laboratory notebooks
• spreadsheets
• text files
• protocols
• papers...
image acquisition:
• raw files
• XML files
• proprietary microscope or acquisition software files: ND2 for Nikon, LIF for Leica, OIB or OIF for Olympus, LSM or ZVI for Zeiss
• OME-TIFF
image processing:
• image files with pixel values and metadata
• png, jpeg, tiff, avi
• text files describing processing algorithms
• text files describing extracted features
data analysis:
• graphs, plots
• analysis pipelines
• text files describing computational algorithms...
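The acquisition formats named on this slide can be told apart by file extension. The following is a rough sketch, assuming that the on-disk extensions are the lower-case forms of the format names and that OME-TIFF files end in `.ome.tif` or `.ome.tiff`; the function name is hypothetical and this is not how CellMissy itself handles files.

```python
from pathlib import Path

# Extension-to-vendor mapping taken from the slide; the lower-case
# extension spellings are an assumption.
PROPRIETARY_FORMATS = {
    ".nd2": "Nikon",
    ".lif": "Leica",
    ".oib": "Olympus",
    ".oif": "Olympus",
    ".lsm": "Zeiss",
    ".zvi": "Zeiss",
}


def describe_image_file(filename: str) -> str:
    """Classify an acquisition file as OME-TIFF, proprietary, or unknown."""
    name = filename.lower()
    # Check the double extension first, since Path.suffix would only see ".tiff".
    if name.endswith((".ome.tif", ".ome.tiff")):
        return "OME-TIFF (open format)"
    vendor = PROPRIETARY_FORMATS.get(Path(name).suffix)
    if vendor is not None:
        return f"proprietary ({vendor})"
    return "unknown"
```

For example, `describe_image_file("scan.ND2")` returns `"proprietary (Nikon)"`, while a spreadsheet or text file falls through to `"unknown"`.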
26. CC BY-SA 4.0
CellMissy is our open-source tool for cell
migration data management and analysis
[Figure: CellMissy modules (Experiment Manager, Data Loader, Data Analyzer) for collective cell migration and single-cell migration; schematic labelled "wound" and "cells" at 0, 3h and 6h]
Masuzzo et al., Bioinformatics, 2013; https://github.com/compomics/cellmissy
People who use the data must credit whoever published or generated the data (attribution).
Copies or adaptations of the data must also be released as open data (share-alike).
There is little point in opening up data if it is not used; openness does not in itself lead to better science, although it could be argued that the open publication of datasets will directly discourage fraud.
More than 70% of researchers have tried and failed to reproduce another scientist's experiments, and more than half have failed to reproduce their own experiments. Those are some of the telling figures that emerged from Nature's survey of 1,576 researchers who took a brief online questionnaire on reproducibility in research.
Methods, code - raw data not available
It would be useful to evaluate the reuse of current open data, but evidence is limited because data citations are hard to track. However, publicly sharing your data does appear to increase citation rate, at least in cancer microarray experiments, which is encouraging evidence that open biological data are being reused.
Note: no license does NOT mean that your data is open!
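As an illustration of this note, here is a minimal check that treats a missing license as "not open". It assumes licenses are recorded as SPDX-style identifier strings; the set contains only the licenses and waivers named in these slides, not the full list of Open Definition conformant licenses, and the function name is hypothetical.

```python
# Licenses and waivers named in the slides, written as SPDX-style identifiers
# (the identifier convention is an assumption of this sketch).
OPEN_LICENSES = {"CC0-1.0", "CC-BY-4.0", "CC-BY-SA-4.0", "PDDL-1.0"}


def is_open(license_id):
    """Return True only if an explicit, recognized open license is stated."""
    # None (no license stated) is never in the set, so it counts as not open.
    return license_id in OPEN_LICENSES
```

With this check, `is_open(None)` and `is_open("CC-BY-NC-4.0")` are both `False`: the first because nothing was stated, the second because non-commercial clauses are not on the list.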
Data must be well described before others can use it and benefit from it.
Scientists who share data in a reusable manner deserve credit through citable publications.