FAIR in Astronomy Research - Slides. In this webinar ARDC is partnering with the ADACS project to explore the FAIR data principles in the context of Astronomy research, with the ASVO and IVOA as community exemplars of the implementation of the FAIR data principles.
These slides are from Keith Russell (ARDC): Looking at FAIR
In this talk Keith will provide an overview of the FAIR principles and how they were used in astronomy before they became official. He will conclude the talk by discussing what other disciplines can learn from astronomy's approach.
2. F.A.I.R. Data Principles
• Drafted in a workshop in 2015
• Nature article and support by FORCE11
• Received international recognition
• Technology agnostic
• Discipline independent
• Both the data and the metadata
• Human readable and machine readable
Image by Sanja Pundir CC-BY-SA
3. Why make your data FAIR?
● Enable reuse of research outputs
● Research is reproducible/verifiable
● Building a rich set of data assets
● Basis for collaboration with research partners
● Novel and innovative research, including data intensive research
● Translation of research outcomes to achieve greater impact
4. Policy developments
● Publishers (Data availability policies)
● COPDESS statement of commitment
● FAIR access policy statement
● International funders:
○ Data sharing statements
○ European Commission Expert Group on FAIR data: Turning FAIR data
into reality: interim report
5. Findable
F1. (meta)data are assigned a globally unique and eternally
persistent identifier.
F2. data are described with rich metadata.
F3. (meta)data are registered or indexed in a searchable
resource.
F4. metadata specify the data identifier.
● Describe your data
● Give it a persistent identifier
● Make it findable through discipline
specific search routes and generic
ones
6. Accessible
A1 (meta)data are retrievable by their
identifier using a standardized
communications protocol.
A1.1 the protocol is open, free, and
universally implementable.
A1.2 the protocol allows for
an authentication and authorization
procedure, where necessary.
A2 metadata are accessible, even when
the data are no longer available.
● Open where possible, closed where required
● Deposit in repository
● Services over the data
● If closed, provide information how the researcher
can get access to the data and background
information (e.g. codebooks, methods section)
7. Interoperable
I1. (meta)data use a formal,
accessible, shared, and broadly
applicable language for knowledge
representation.
I2. (meta)data
use vocabularies (and ontologies)
that follow FAIR principles.
I3. (meta)data include qualified
references to other (meta)data.
● Use a standard file format
● Use a community agreed vocabulary
● Link to relevant information
8. Reusable
R1. (meta)data have a plurality of accurate and
relevant attributes.
R1.1. (meta)data are released with a clear and
accessible data usage license.
R1.2. (meta)data are associated with
their provenance.
R1.3. (meta)data meet domain-relevant
community standards.
Other aspects on top of F.A.I. :
● Discipline specific information
about the output
● Information on how the data
was created
● A machine readable licence
(Creative Commons
recommended)
10. With the exception of third party images or where otherwise indicated, this work is licensed under the
Creative Commons 4.0 International Attribution Licence
Keith Russell
E: keith.Russell@ardc.edu.au
M: 04 2745 23 42
T: @kgrussell
The ARDC is supported by the
Australian Government through the
National Collaborative Research
Infrastructure Strategy Program
Editor's notes
Thank you for inviting me to speak.
I will give you a brief introduction to the FAIR principles. I will not attempt to explain how resources can best be made FAIR in Astronomy.
I will leave that to Katrina Sealey and Luke Davies
Excuse the corny background, I just could not resist the reference to a FAIR under a starry sky
The FAIR data principles were drafted in a workshop at the Lorentz centre in Leiden in 2015
They were then publicised more widely through a Nature article and by FORCE11
They have since received recognition from all around the globe as a very useful way of thinking about how you can make your data available for reuse in a meaningful way.
I think there are a few reasons for this.
For one they are technology agnostic
They are also not linked to one discipline and can be applied across a range of disciplines
They describe what you can do both to the data and the metadata to make it more reusable
And finally, and very important in the current day with its focus on data intensive research, they describe how data can be not only human readable but also machine readable. This will enable machines to pick up large volumes of data and analyse them to identify patterns and structures that are too hard for mere humans to pick out.
So what are the arguments for making data FAIR?
Well first of all it will enable the reuse of research outputs
Which means research is reproducible and verifiable
Making your data FAIR also means you will build a rich collection of data assets
This can form the basis for collaboration with research partners nationally and internationally
As I mentioned earlier, especially by making the data Interoperable you can enable new and innovative research
And finally it can help the translation of research outcomes to achieve greater impact from the research, allowing the findings to be picked up by business, policy makers, communities and the general public.
The FAIR principles have now been picked up by a range of organisations and are starting to be incorporated in policies
For example:
Publishers have for a while had Data Availability policies
But now the Coalition for Publishing Data in the Earth and Space Sciences has released a statement of commitment to make more data FAIR. This has been signed by a range of publishers and data facilities. These include some larger publishers like Elsevier, Nature, Springer and Wiley.
Here in Australia, a working group convened under the auspices of the Universities Australia Deputy Vice Chancellors (Research) Committee developed a policy statement on access to research outputs.
Internationally, funders had already set up data sharing statements, but are now also thinking about how they can ask researchers to make their research outputs FAIR.
For example, the European Commission has set up an expert group that is looking at how FAIR data can be turned into reality and has recently released an interim report.
So what are the principles exactly?
Now to give you a quick run through of the principles.
The principles go beyond the four letters and there is some very useful detail below the four letters
The F is for Findable.
When making your data Findable think about assigning a persistent identifier to the data, so that if the data moves, people or machines will still be able to find the data
Make it discoverable through well described metadata and make it findable through relevant search routes, disciplinary portals, etc.
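As a rough illustration of the Findable points above, here is a minimal Python sketch that checks whether a metadata record carries a persistent identifier and some basic description. The record, its field names and the DOI are all made up for this example and do not follow any particular schema or repository.

```python
import re

# Hypothetical metadata record; field names are illustrative, not a standard schema.
record = {
    "identifier": "https://doi.org/10.1234/example-dataset",  # made-up DOI
    "title": "Example survey catalogue",
    "description": "Positions and magnitudes for an illustrative sample.",
    "keywords": ["astronomy", "catalogue"],
}

# Loose pattern for a DOI expressed as a URL (an assumption, not a full DOI validator).
DOI_URL = re.compile(r"^https://doi\.org/10\.\d{4,9}/\S+$")


def findability_gaps(rec):
    """Return a list of gaps against the Findable bullets (empty if none found)."""
    gaps = []
    if not DOI_URL.match(rec.get("identifier", "")):
        gaps.append("no persistent identifier (expected a DOI URL)")
    for field in ("title", "description", "keywords"):
        if not rec.get(field):
            gaps.append(f"missing descriptive field: {field}")
    return gaps


print(findability_gaps(record))
```

A check like this only covers the description and the identifier. Whether the record is actually indexed in a searchable resource has to be verified with the repository or portal itself.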
Make the data accessible
That does not always mean that the data has to be Open.
In some disciplines there are very good reasons why data cannot be made open, for example because it contains information on individuals, culturally sensitive data or is commercially sensitive.
Make sure that you place the data in a place where it can be accessed through standard protocols.
It does not have to just be downloadable. In some cases, especially if the data is large, it makes more sense to make it accessible through services so you can pull out the parts of the data you need for your analysis
If the data is not open, do provide information about how access to the data can be obtained and provide relevant information so the researcher can get a sense of what the data is about.
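To make the "open where possible, closed where required" idea concrete, here is a small Python sketch in which the metadata always resolves from the identifier, even when the data itself is closed. The in-memory repository, the identifiers, the URL and the field names are all hypothetical.

```python
# Hypothetical in-memory repository keyed by identifier; all values are made up.
REPOSITORY = {
    "doi:10.1234/open": {
        "title": "Open catalogue",
        "access": "open",
        "data_url": "https://example.org/data/open.csv",
    },
    "doi:10.1234/closed": {
        "title": "Sensitive survey",
        "access": "closed",
        "how_to_request": "email the data custodian",
        "background": "codebook and methods section",
    },
}


def access_summary(identifier):
    """Describe how the data behind an identifier can be reached."""
    meta = REPOSITORY.get(identifier)
    if meta is None:
        return "unknown identifier"
    if meta["access"] == "open":
        return f"{meta['title']}: download from {meta['data_url']}"
    # Closed data: the metadata still says how to ask for access and what to read first.
    return (
        f"{meta['title']}: closed; to request access, {meta['how_to_request']}; "
        f"background: {meta['background']}"
    )


print(access_summary("doi:10.1234/closed"))
```

In a real repository the lookup would happen over a standard protocol such as HTTP rather than a Python dictionary, but the principle is the same.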
A very important element in combining and bringing data together is under Interoperable.
Use standard file formats that are commonly used in the discipline
Use community agreed vocabularies to describe the content of the data and the metadata so others can easily link it to other data that also use these vocabularies.
These vocabularies themselves should also be FAIR so they don’t get lost
Use links to all sorts of relevant information that will help provide context to the data.
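As a simple sketch of the vocabulary point, the Python below compares the keywords on a dataset with an agreed term list and reports any that fall outside it. The three-term vocabulary is invented for this example. A real community vocabulary would be published and maintained by the discipline.

```python
# Hypothetical controlled vocabulary; a real one would come from the community.
VOCABULARY = {"galaxy", "star", "quasar"}


def unknown_terms(keywords, vocabulary=VOCABULARY):
    """Return the keywords that are not in the agreed vocabulary, sorted."""
    normalised = {k.strip().lower() for k in keywords}
    return sorted(normalised - vocabulary)


print(unknown_terms(["Galaxy", "blob", " star "]))
```

Any term that comes back from this check either needs to be mapped to an agreed term or proposed as an addition to the vocabulary.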
The last letter is R for reusable
First of all it is important to note that all the things under FAI are also crucial to make the data reusable
But there are a few more things to think of:
Include more detailed discipline specific information about the output. This will help the researcher better understand the data and what it means
Also include provenance information on how the data was created. This includes information about who created it, when, off which instrument, what the settings were, which analysis was done to come to the results, etc.
Finally attach a machine readable licence. If there is no licence attached to the data nobody knows what can and cannot be done with the data, which makes it unusable. If the licence is machine readable a computer can pick up the data and assess whether it can be included in analysis and the appropriate attribution can be provided.
If you are interested in what it means in general to make your data more FAIR, we have a range of materials on our website that can be useful. This includes a range of training resources and materials broken down according to the Principles
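To show what a machine readable licence makes possible, here is a minimal Python sketch in which a program reads the licence URL from a metadata record and decides whether the data can be included and whether attribution is needed. The lookup table is deliberately simplified and is not a full model of the Creative Commons terms. The record field name "licence" is an assumption for this example.

```python
# Simplified, illustrative lookup; not a full model of Creative Commons terms.
LICENCES = {
    "https://creativecommons.org/licenses/by/4.0/": {
        "reuse": True,
        "attribution": True,
    },
    "https://creativecommons.org/publicdomain/zero/1.0/": {
        "reuse": True,
        "attribution": False,
    },
}


def can_include(record):
    """Return (ok_to_reuse, needs_attribution) for a metadata record."""
    terms = LICENCES.get(record.get("licence"))
    if terms is None:
        # No licence, or one we do not recognise: treat the data as unusable.
        return (False, False)
    return (terms["reuse"], terms["attribution"])


print(can_include({"licence": "https://creativecommons.org/licenses/by/4.0/"}))
print(can_include({"title": "No licence attached"}))
```

The second call returns a refusal because the record has no licence, which mirrors the point that data without a licence cannot safely be reused.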
We also have a self assessment tool you can use to assess how FAIR your data already is and if there are things you can do to make your data more FAIR.
Please note these are all generic and not specific to Astronomy
I would now like to hand over to Katrina and Luke as they can explain much better what it means to make data FAIR in Astronomy.