What does open science mean? A stakeholder perspective
1. What does Open Science mean? A stakeholder
perspective.
Susan Reilly
Executive Director
LIBER: Ligue des Bibliothèques Européennes de Recherche
susan.reilly@kb.nl
@skreilly
2. I’ll be talking about…
• The stakeholders I represent
• Definition of Open Science
• Building blocks of Open Science
• What can libraries do to support Open Science?
3. LIBER: reinventing the library of the future
• Largest network of European research libraries: 410 in over 40
countries
Mission:
“To provide an information infrastructure to enable
research in LIBER institutions to be world class”
4. Libraries enabling Open Science
“We believe that the move towards openness will lead to
increased transparency, better quality research, a higher
level of citizen engagement, and will accelerate the pace
of scientific discovery through the facilitation of data-
driven innovation.”
http://libereurope.eu/wp-content/uploads/2014/09/LIBER_St
5. The problem with defining Open Science
• Means is often confused with the end
• Ultimate goal is to be a transient term i.e.
Open Science = Science
• Aims to bring coherence and vision to a range of
different open activities e.g. open access, open
data, open software
• Key is changing practice and culture, which is
different for every stakeholder
6. Open Science Definition
“The conduction of science in a way that
others can collaborate and contribute, where
research data, lab notes and other research
processes are freely available, with terms that
allow reuse, redistribution and reproduction of
the research”
https://www.fosteropenscience.eu/foster-taxonomy/open-science-definition
7. Open Science Goals
• Transparency in experimental methodology, observation,
and collection of data
• Public availability and reusability of scientific data
• Public accessibility and transparency of scientific
communication
• Using web-based tools to facilitate scientific
collaboration
Dan Gezelter, http://www.openscience.org/blog/?p=269
9. To an Open Science Landscape
Open access publishing
New forms of peer review
Open infrastructure
Research data management
Open educational resources
Massive Open Online Courses (MOOCs)
Open science
Collaboration
Copyright & licensing
Policy
Advocacy & training
Alternative Metrics
Open data
10. Building block: open access
“We write to communicate an untenable situation facing the
Harvard Library. Many large journal publishers have made the
scholarly communication environment fiscally unsustainable
and academically restrictive.”
Harvard University
Library, 2012
Moved beyond the tipping point:
http://www.science-metrix.com/pdf/SM_EC_OA_Availability_2004-2011.pdf
11. OpenAire2020 Post FP7 Gold Open Access Pilot
EUR 4m funding provided by the EC to support Open Access publications
from post-grant FP7 projects finished no longer than 2 years ago.
Guidelines developed after consultation:
Maximum of three publications per project to be funded (research
articles, monographs, book chapters, contributions to conference
proceedings) which meet the requirements described in the Pilot policy
guidelines.
No publications in hybrid journals will be funded, only in fully Open
Access titles. A €2,000 funding cap is in place.
Pilot (soft) launched beginning of May 2015.
https://goldoa-pilot.openaire.eu/
12. Building block: open data
• Need to release the value of data
• Benefits:
• Jobs (Copernicus= 50000 jobs)
• Research productivity (big bang)
• Help communities (flood hack)
• Cost of not sharing (bird flu)
1.7 million billion bytes of data every minute
X 34
13. Data must be..
• Open by default (G8,LERU)
• Usable by all
• Available
• Findable
• Interpretable
• Citable
• Curated/preserved
14. The Data Publication Pyramid
(1) Data contained and explained within the article
(2) Further data explanations in any kind of supplementary files to articles
(3) Data referenced from the article and held in data centers and repositories
(4) Data publications, describing available datasets
(5) Data in drawers and on disks at the institute
15. Building block: advocacy…
• Advocate for roadmaps and policies that promote open
science at institutional and national level
• Advocate for changes in practice e.g. data citation, use
of cc licences
16. and incentives
• Need to change system of incentives and assessment
• Move away from journal based metrics
• Consider value and impact of ALL research outputs (data,
software…)
• Align assessment with institutional values
• Only a change of system of incentives will truly change
practice and culture
17. Building block: infrastructure
• International
• Open
• Interoperable
• Cross disciplinary
• Facilitate collaboration
• Store & Share
• Sync & Exchange
• Replicate
• Compute
• Find
18. Building block: skills and
training
ReCODE Recommendation 10: Support the transition to open
research data through curriculum development and
training. The transition to an open science paradigm where
research data plays a significant role requires training and
education for researchers and for data managers who support
open science. Courses for getting researchers and data
managers up to date with current relevant issues are
necessary, as well as the development of curricula that
contribute towards the development of data science and
information management as distinct and legitimate career
paths.
• Need to embed training in postgraduate education
• Invest in the development of the data professional
• Training provision as and when needed (importance of train
the trainer)
• Training and support for new tools and methods
19. Building block: policy and legislation
• Legal clarity
• Interoperability (WIPO solution?)
• Ensure researchers have right to secondary
publication
• Standard open access licences
• CC-BY and CC0/PD
20. Copyright v TDM
• Because it involves the copying of content in
order to convert into machine readable format
TDM may infringe copyright
• European Database Directive
prohibits copying of substantial
parts of databases
• In US TDM is covered
by fair use, other parts of the
world have a specific exception
e.g. Japan, UK
https://www.flickr.com/photos/apelad/304195427/
21. Knowledge Discovery in the Digital Age
• Ultimate goal of text and data mining is to
extract high level knowledge from low level data
• Allows analysis across disciplines
• “Undiscovered public knowledge” (Swanson)
• Identifies patterns in the data to produce new
knowledge
• It’s not a new thing, it’s just digital information
makes it a whole lot more powerful and
relevant!
22. Elsevier TDM Policy
• Access through API only
• Text only- no images, tables
• Researchers must register details
• Click-through licence
• Terms can change any time
• Reproducibility of results
23. Key Principles
• Copyright not intended to govern access to facts,
ideas and data, nor should it
• Need to move beyond the tipping point of open
access
• Protect academic freedom (no monitoring)
• Human rights not to be undermined by contract
• Evolution of ethics (standards and legislation)
• No artificial restrictions on innovation
24. Libraries enabling Open Science
• Support Open Access
• Deposit XML in OpenAire compliant repositories
• Explore new business models- try publishing!
• Be transparent!
• Advise on use of licences
25. Libraries enabling Open Science
• Get started in research data management
• Research data management plans
• Partner with data centres (back-office/front-office model)
• Curate long tail data
28. Libraries’ Opportunities
Data issue: Libraries’ and data centres’ opportunities (Chapter 4)
Availability:
• Lower barriers to researchers to make their data available.
• Integrate data sets into retrieval services.
Findability:
• Support of persistent identifiers.
• Engage in developing common metadescription schemas and common citation practices.
• Promote use of common standards and tools among researchers.
Interpretability:
• Support crosslinks between publications and datasets.
• Provide and help researchers understand metadescriptions of datasets.
• Establish and maintain knowledge base about data and their context.
Re-usability:
• Curate and preserve datasets.
• Archive software needed for re-analysis of data.
• Be transparent about conditions under which data sets can be re-used (expert knowledge needed, software needed).
Citability:
• Engage in establishing uniform data citation standards.
• Support and promote persistent identifiers.
Curation/Preservation:
• Transparency about curation of submitted data.
• Promote good data management practice.
• Collaborate with data creators.
• Instruct researchers on discipline-specific best practices in data creation (preservation formats, documentation of experiment, …)
29. Libraries enabling Open Science
• Provide training and develop the workforce
• Best practice in research data management
• Guidance on copyright & licences
• Training on tools
31. Libraries enabling Open Science
• Advocate & Engage
• Within institutions to develop policies and roadmaps
• Towards researchers to highlight benefits of open science
• With other stakeholders at institutional level and internationally
• Gather and provide evidence for the need for changes in
legislation and policy
• Promote and engage with citizens
• Unite!
• Sign the Hague Declaration!
32. • The Hague Declaration: http://thehaguedeclaration.com/the-hague-declaration-on-knowledge-discovery-in-the-digital-age/
• LERU Roadmap for Research Data: http://www.leru.org/index.php/public/news/press-release-leru-roadmap-for-research-data/
• EUDAT: http://eudat.eu/
• Research Data Alliance: https://rd-alliance.org/
• LIBER 10 Recommendations on Getting Started in RDM: http://libereurope.eu/wp-content/uploads/The%20research%20data%20group%202012%20v7%20final.pdf
• OpenAire: https://www.openaire.eu/
• San Francisco Declaration: http://www.ascb.org/dora-old/files/SFDeclarationFINAL.pdf
Editor's notes
Thank you. As introduced, my name is Susan Reilly and I am Executive Director of LIBER, the Association of European Research Libraries. First of all I’d like to thank the organisers of FESABID for inviting me here today. It’s an absolute honour to speak to such a large and prestigious group as FESABID, to be on the programme with such well-known international library figures, and indeed to be able to visit such a beautiful city as Gijon.
I was invited here today to speak about Open Science, which is such an enormous and disparate field that, rather than tell you every single thing there is to know about it, I have decided to focus on particular areas that I think are important or changing in Open Science, and to do so from a particular stakeholder perspective, that of the library. I will talk a little more about the stakeholders I represent and the problem with trying to define open science, and I will look at what some of the building blocks of open science are. Finally, I will present what LIBER says libraries can do to enable open science. I hope that throughout the presentation you will pick up on my enthusiasm for open science as the future of research and scholarship, but also a slight bit of cynicism about open science as a coherent movement, and a bias in my views, because I make the assumption that libraries have a key role in enabling open science. That is not necessarily a view shared by all, as it is the researcher who is at the heart of the change in practice and culture that is needed to make open science a reality.
LIBER represents over 400 research libraries (that is, national, university, and other dedicated research libraries) in over 40 countries. Our mission is to create an information infrastructure to enable research in LIBER institutions to be world class. We believe that enabling open science will be key to the future production of world class research. The transparency of making research data and processes open will enable us to more easily recognise quality research. The increase in the availability of data and publications will reduce duplication and increase the scope of research. The growth in open infrastructure and the implementation of open standards for interoperability will facilitate international collaboration and interdisciplinary research. We now know that collaborative research has a higher impact and is cited more widely, and it has been recognised globally (G8, Science Europe) that the big breakthroughs of the future will result from interdisciplinary research to address global challenges such as climate change and poverty. As I have heard one researcher put it: we do not need more data from environmental scientists to prove that climate change is a problem that needs to be addressed; we need social scientists to analyse this data and figure out a way to change our behaviours in a way that affects climate change positively.
As I said, representing libraries and being a librarian myself, I have a bias in that I believe that libraries have a key role in enabling open science, starting with advocacy and moving towards supporting practice, developing infrastructure and standards, and providing access to tools. I don’t see how any other stakeholder can fill this space in a coherent way.
So, let’s take a look at what exactly open science is. This is where I become slightly cynical, because first of all there are very few definitions of open science out there, and there is no single widely agreed definition. This is problematic given that it is something that we are all supposed to be working towards. Often the means of achieving open science, its components, are confused with the end itself; for example, making data open is not an end in itself, it is a means. If you take this approach of confusing the means with the end, then you risk alienating stakeholders. Even the term itself can be problematic in some languages. The last point I would like to make is that for me, and for many others, our goal is for the practice of open science to become so much a part of the research process that it becomes just ‘science’. In other words, if you equate openness with excellence then why would you perform science in any other way?
So, after complaining about the problem of defining open science, I will show you a definition that has been entered in the FOSTER taxonomy. This definition proves my point in that it partly confuses the means with the end. It is also, given the group of people who are involved in FOSTER, not particularly visionary. However, I believe that the people represented in FOSTER are what open science is all about: multi-stakeholder and multi-disciplinary, motivated, innovative, with a broad range of skills, working together to change practices and support a move to open science. And even though I know that this definition is far from perfect, it’s open and can act as a starting point for us to work together to create a more inclusive and visionary definition that others are free to reuse and adapt.
Another way to look at the problem of defining open science is to look at what its goals are. I particularly like this set of goals outlined by Dan Gezelter from OpenScience.org. The only thing I would add is an ‘open’ before ‘web-based tools’.
What we are really doing is trying to move away from a 300-year-old model which we now know is broken. In this model a discovery not published in a scientific journal or monograph was not truly complete. Today we would ask: without seeing the processes or data underlying the publication, how can we judge the completeness of a piece of research? And, for that matter, is research not an ongoing dialogue?
So we are moving from a black box environment to a landscape of open research and scholarship. Maybe now you can understand what a challenge Eva Mendez presented me with when she invited me to speak about open science. But I will focus in this talk primarily on data and just touch on a few other topics. I think the important thing to note here is that instead of having an end point (the publication), open science turns research back into an ongoing dialogue, which is its natural state, and moves it away from the artificial construct of publication as an end point.
So, let’s move from theory to practice. Around 2001 we began to think about and formulate a strategy around open access to publications. I’m sure it’s an area that you are all more than familiar with, but I just wanted to flag it as one of the key building blocks of open science. Originally open access was conceived as a means to address the problems of rising subscription prices, bundling of subscriptions causing a contraction in the market, and inequality of access to research. In 2011 we reached the tipping point in open access, with 50% of articles being freely available. Today open access is changing how we think about publications: we aggregate them, we make them available in XML, we link them to data, they are dynamic. In fact, they are a huge data resource, and the growth of open access represents huge potential for knowledge discovery across disciplines using digital methods, which I will go into later.
Being at the tipping point of open access means that we have to examine how we take a sustainable approach to the future. We cannot replace one bad system with another. This means that we as librarians have to explore different models for open access publishing, whether that be offsetting, APCs, or institutional presses, and that we gather and start to share data and costs so we move ahead with our eyes open. LIBER is working with OpenAire
The development and coordination of policies and roadmaps for open access to publications and research data by funders and at institutional and national level is the first step in enabling Open Science. If funders mandate open access to publications, research data and tools, as well as the use of interoperable licences with clear reuse statements, such as CC-BY and CC0, this would increase the efficiency and practice of data-intensive science. Funding programmes such as H2020 are hugely important in helping to accelerate the development of Open Science. Mechanisms for the recognition of excellence in Open Science, e.g. via inclusion in ranking systems, would incentivise institutions to implement such roadmaps.
Advocacy for Open Science, and associated societal and economic benefits, can act as an enabler as it will engender buy-in from policy makers and from researchers themselves. Identification of the drivers and barriers at disciplinary level to a culture of openness and the addressing of these barriers will also support the cultural shift necessary to embed Open Science. We believe that libraries are uniquely placed to advocate for Open Science policy and practice at institutional level and beyond, and to increase the visibility of Open Science outputs. They have also been at the vanguard of citizen science engagement e.g. via crowdsourcing of content and metadata. It is important that awareness-raising mechanisms are developed which are targeted at policy makers, citizens, disciplinary communities, researchers at every career level, and at institutions. Open Science initiatives and outputs should be widely promoted and incentivised. Promotion of the benefits of Open Science should take place in parallel with the development of tools and services, and incentives and recognition mechanisms, that support excellence in Open Science. Emphasis should be shifted towards incentivising quality research outputs rather than metrics such as Impact Factors. Legal: Open Science does not recognise borders. It is founded on the principles of collaborat
Open collaborative and interoperable infrastructure for access to, exploitation, reuse, and the preservation of research outputs is a key enabler of Open Science. It empowers researchers, and citizens, with the ability to make their outputs available, to increase the visibility of and recognition for their research, to collaborate, and to innovate. It is essential, therefore, that such infrastructure continues to be built, that stakeholders can have trust in this infrastructure and that sustainable support for open pan-European initiatives (e.g. OpenAire & EUDAT) and national infrastructure (e.g. ATT in Finland) is provided. Institutional, national and international infrastructures should be supported to develop and adopt global standards for interoperability. International standards organisations should also be encouraged to engage with the Open Science community to develop interoperable standards for Open Science.
The move towards Open Science is signified by a changing stakeholder ecosystem in which new roles are emerging, such as that of the data steward, and new responsibilities are being defined. Roles and responsibilities in this new paradigm need to be clearly delineated, in particular for successful data management and open data. Leadership and senior management, support services such as IT and library services, and, of course, the researcher are all stakeholders in the open research data environment, and all play a role in ensuring that open data is integrated into the way research is carried out in the future. Pilots that involve collaboration across stakeholders to explore new processes, solutions and innovations are also necessary. Support for the development of new skills and curricula, as well as investment in the development of support services to help researchers fulfil their responsibilities, are absolutely crucial in enabling Open Science.
We are investing in changing practice, developing infrastructure, and publishing data, but we need to change some fundamentals before we can get a real return on this investment. What we need: a specific exception in EU law to allow TDM. Open Science does not recognise borders. It is founded on the principles of collaboration and universal access, and yet the lack of harmonisation of copyright law across Europe and globally is hampering access and collaboration. An issue which has been on the agenda of the European Commission for some time now is that the European copyright regime is not fit for the digital age. For example the US, which has a more favourable copyright regime for researchers, has produced over half of the text and data mining-related publications and patents globally. It is estimated that reforms to the European copyright system that would enable TDM could result in a 2% increase in the real value of research output produced by the EU research budget, adding €5.3 billion to a total of €272.2 billion. Copyright reform must be an immediate priority in order to address this gap in competitiveness.
Text and data mining (TDM) is the process of deriving information from machine-read material. It works by copying large quantities of material, extracting the data, and recombining it to identify patterns. TDM is essentially another method of reading, done by the computer rather than the human eye. It is a natural next step for the research process, as more and more content is electronic. For libraries what this means is that researchers are able to extract more value from our vast collections- born digital and digitised. I’d like to show you some examples of the added value of TDM.
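To make the three steps described above (copy, extract, recombine) a little more concrete, here is a minimal Python sketch. It is only a toy illustration under stated assumptions: the three one-line "documents" are invented for the example and are not real publications, the term list and the function names (`extract_terms`, `build_cooccurrence`, `indirect_links`) are hypothetical, and real TDM pipelines work on far larger corpora with proper language processing. The recombination step looks for two terms that never appear together but share a common neighbour, in the spirit of Swanson's "undiscovered public knowledge".

```python
from collections import defaultdict
from itertools import combinations

# Step 1 (copy): the material is held locally so a machine can read it.
# Hypothetical mini-corpus: invented sentences, not real abstracts.
DOCUMENTS = [
    "fish oil reduces blood viscosity",
    "high blood viscosity is observed in raynaud syndrome",
    "fish oil is rich in omega-3 fatty acids",
]

# Hypothetical vocabulary of terms to track.
TERMS = ["fish oil", "blood viscosity", "raynaud syndrome"]


def extract_terms(text, vocabulary):
    """Step 2 (extract): return the vocabulary terms mentioned in one document."""
    lowered = text.lower()
    return {term for term in vocabulary if term in lowered}


def build_cooccurrence(documents, vocabulary):
    """Step 3a (recombine): record which terms appear together in a document."""
    links = defaultdict(set)
    for doc in documents:
        found = extract_terms(doc, vocabulary)
        for a, b in combinations(sorted(found), 2):
            links[a].add(b)
            links[b].add(a)
    return links


def indirect_links(links, vocabulary):
    """Step 3b (pattern): pairs never seen together but joined by a shared term."""
    results = []
    for a, c in combinations(sorted(vocabulary), 2):
        if c in links[a]:
            continue  # already linked directly in some document
        bridges = links[a] & links[c]
        if bridges:
            results.append((a, c, sorted(bridges)))
    return results


if __name__ == "__main__":
    links = build_cooccurrence(DOCUMENTS, TERMS)
    for a, c, bridges in indirect_links(links, TERMS):
        print(f"{a} <-> {c} via {', '.join(bridges)}")
```

No single document in the toy corpus mentions both "fish oil" and "raynaud syndrome", yet the sketch surfaces a candidate connection through the shared term. Note that even this small example has to hold a copy of the text before anything can be extracted, which is the step where copyright becomes relevant.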
It is apparent that, apart from the lack of supporting institutional policies, what is also preventing libraries from meeting the demand for RDM support is a gap in skills. This is something that LIBER and LIBER libraries are working hard to address and is one of the areas where collaboration with the wider research data community is essential.