SlideShare a Scribd company logo
1 of 50
The Future of FAIR Data:
An international social, legal and technological
infrastructure for data discovery and reuse
by humans and machines
@micheldumontier::S.NET:2018-06-271
Michel Dumontier, Ph.D.
Distinguished Professor of Data Science
Director, Institute of Data Science
An increasing number of
discoveries are made using other
people’s data
@micheldumontier::S.NET:2018-06-272
3
A common rejection module (CRM) for acute rejection across multiple organs identifies
novel therapeutics for organ transplantation
Khatri et al. JEM. 210 (11): 2205
DOI: 10.1084/jem.20122709
@micheldumontier::S.NET:2018-06-27
1. Based on gene expression data that is free to use
2. CRM genes correlated with the extent of graft injury and predicted future injury
to a graft
3. Mice treated with drugs against the CRM genes extended graft survival
However, significant effort was
needed to find the right datasets,
make sense of them, and ultimately
use them for a new purpose
@micheldumontier::S.NET:2018-06-274
Poor quality (meta)data hampers research
@micheldumontier::S.NET:2018-06-276
Lambin et al. Radiother Oncol. 2013. 109(1):159-64. doi: 10.1016/j.radonc.2013.07.007
If we are ever to realize the full
potential of content we create
then we must find ways to reduce the
barrier to publish, find and reuse
their content in a responsible manner
@micheldumontier::S.NET:2018-06-277
Why does this matter?
@micheldumontier::S.NET:2018-06-278
9 @micheldumontier::S.NET:2018-06-27
Most published research findings are false.
- John Ioannidis, Stanford University
Reproducibility of landmark studies is shockingly low:
39% (39/100) in psychological studies1
21% (14/67) in pharmacological studies2
11% (6/53) in cancer studies3
PLoS Med 2005;2(8): e124.
1doi:10.1038/nature.2015.17433 2doi:10.1038/nrd3439-c1 3doi:10.1038/483531a
@micheldumontier::S.NET:2018-06-2710
Published online 28 September 2011 | Nature 477, 526-528 (2011) | doi:10.1038/477526a
@micheldumontier::S.NET:2018-06-2711
we need new ways to think
about discovery science
We need to improve our
confidence in an result
by using more data
and by using multiple lines of
evidence
Grand Challenge:
Automatically
uncover evidence
that supports and
disputes a hypothesis
using the totality of
available data, tools
and scientific
knowledge
@micheldumontier::S.NET:2018-06-2712
We must build a social, legal and
technological infrastructure that
facilitates the discovery and reuse
of digital resources
for people and machines
@micheldumontier::S.NET:2018-06-2713
Why machines?
• Can make sense of vast amounts of
information to make personalized, evidence-
based decisions to maximize desired
outcomes
@micheldumontier::S.NET:2018-06-2714
Big Data
for Medicine
@micheldumontier::S.NET:2018-06-2715
Multiple sources of heterogeneous
data, including experimental
evidence, bioinformatics databases,
lifestyle measurements, electronic
health records, environmental
influences, and biobank findings, can
be incorporated together using
machine learning algorithms to
identify causal disease networks,
stratify patients, and ultimately
predict more efficacious therapies.
Why machines?
• Can make sense of vast amounts of
information to make personalized, evidence-
based decisions to maximize desired
outcomes
• Can identify and minimize bias in research and
in real world applications in a systematic and
robust manner
• Will eventually be able to independently
contribute to all aspects of human interest
@micheldumontier::S.NET:2018-06-2716
AI Generated Pictures
@micheldumontier::S.NET:2018-06-27
https://deepdreamgenerator.com
17
AI Generated Music
@micheldumontier::S.NET:2018-06-27
Performance RNN
https://magenta.tensorflow.org/performance-rnn
18
@micheldumontier::S.NET:2018-06-2719
@micheldumontier::S.NET:2018-06-2720
Principles to enhance the value of all digital resources
data, images, software, web services, repositories,…
Developed and endorsed by researchers, publishers,
funding agencies, industry partners.
@micheldumontier::S.NET:2018-06-2721
@micheldumontier::S.NET:2018-06-2722
@micheldumontier::S.NET:2018-06-2723
http://www.nature.com/articles/sdata201618
Many of these citations are communities
discussing what this means for them
June 2018
@micheldumontier::S.NET:2018-06-2724
4 Principles (F,A,I,R) and 15 sub-principles.
http://www.nature.com/articles/sdata201618
FAIR Principles - summarized
Findable
• Globally unique, resolvable, and persistent identifiers
• Machine-readable descriptions to support structured search
Accessible
• Clearly defined access and security protocols
• Metadata is always accessible beyond the lifetime of the digital resource
Interoperable
• Extensible machine interpretable formats for data + metadata
• Use FAIR vocabularies and link to other resources
Reusable
• Provide licensing, provenance, and use community-standards
@micheldumontier::S.NET:2018-06-2725
Improving the FAIRness of digital
resources will increase their quality and
their potential and ease for reuse.
@micheldumontier::S.NET:2018-06-2726
What is FAIRness?
FAIRness reflects the extent to which a digital
resource addresses the FAIR principles as per the
expectations defined by a community.
@micheldumontier::S.NET:2018-06-2727
How it might look at DANS
@micheldumontier::S.NET:2018-06-2728
FAIRShake
@micheldumontier::S.NET:2018-06-2729
Communities, such as yours,
must make clear your expectations
@micheldumontier::S.NET:2018-06-2730
Measuring FAIRness
• A metric is a standard of measurement.
• It must provide clear definition of what is being
measured, why one wants to measure it.
• It must describe what a valid result is and how
one obtains it, so that it can be reproduced by
others.
@micheldumontier::S.NET:2018-06-2731
Qualities of a Good Metric
• Clear: anyone can understand the purpose of the metric
• Realistic: compliance should not be unduly complicated
• Discriminating: the measure can distinguish between
those resources that meet the criteria and those that do
not
• Measurable: the assessment can be made in an
objective, quantitative, machine-interpretable, scalable
and reproducible manner
• Universal: The metric should be applicable to all digital
resources
@micheldumontier::S.NET:2018-06-2732
• 14 universal metrics covering each of the FAIR sub-principles. The metrics
demand evidence from the community, some of which may require specific
new actions.
• Digital resource providers must provide a web-accessible document with
machine-readable metadata (FM-F2, FM-F3), detail identifier management
(FM-F1B), metadata longevity (FM-A2), and any additional authorization
procedures (FM-A1.2).
• They must ensure the public registration of their identifier schemes (FM-
F1A), (secure) access protocols (FM-A1.1), knowledge representation
languages (FM-I1), licenses (FM-R1.1), provenance specifications (FM-
R1.2), and community standards (FM-R1.3).
• They must provide evidence of ability to find the digital resource in search
results (FM-F4), linking to other resources (FM-I3), FAIRness of linked
resources (FM-I2), and meeting community standards (FM-R1.3)
@micheldumontier::S.NET:2018-06-2733
@micheldumontier::S.NET:2018-06-2734
http://www.w3.org/TR/hcls-dataset/
Evidence:
standard is also
registered in
FAIRsharing
http://hw-swel.github.io/Validata/
@micheldumontier::S.NET:2018-06-2735
RDF constraint validation tool
Configurable to any profile
Declarative reusable schema description
Shape Expression (ShEx) constraints
Open source javascript implementation
A first assessment using the metrics
• Used a simple form to ask for the information
needed as input to the FAIR metrics
• Questions either require one or more URL or
true/false
@micheldumontier::S.NET:2018-06-2736
@micheldumontier::S.NET:2018-06-2737
@micheldumontier::S.NET:2018-06-2738
@micheldumontier::S.NET:2018-06-2739
Automated assessments
are (currently) unforgiving
@micheldumontier::S.NET:2018-06-2740
@micheldumontier::S.NET:2018-06-2741
@micheldumontier::S.NET:2018-06-2742
H2020 EG: Turning FAIR Data into Reality -
Report and Action Plan Consultation
Recommendations include:
• Sustainable Funding for FAIR components (#5)
• Strategic and evidence-based funding (#6)
• Cross-Disciplinary FAIRness (#8)
• Encourage and incentivise data reuse (#19)
• Facilitate automated processing (#25)
@micheldumontier::S.NET:2018-06-2743
Consultation until 5 August
Consultation is being conducted on the interim report and Action Plan until 5 August 2018. A commentable version of the report is
available on Google Drive. Structured comments on the Action Plan and specific recommendations and actions may be made via a
dedicated GitHub repository.
· Comment on Report: http://bit.ly/interim_FAIR_report
· Comment on Action Plan: https://github.com/FAIR-Data-EG/action-plan
Analyzing partitioned FAIR health data responsibly
Maastricht Study + MUMC CBS
Key Aspects:
• Goal: Enable the use of federated machine learning to uncover positive and negative
determinants of type 2 diabetes using FAIR data from both the Maastricht Study and
Statistics Netherlands.
• Establish a broader and scalable governance structure and consent define and
underpin the responsible use of Big Data in Health.
• Create a new social, legal, and technological infrastructure for discovery science in
and across a health and non-health settings, with many future uses
@micheldumontier::S.NET:2018-06-2744
Maastricht University
will become the first
‘FAIR’ university
By 2023
• Launch of a one-stop shop that will educate
and provide support for FAIR-compliant data
management and responsible data science
• Establishment of an cross-faculty Community
of Data-Driven Insights (CDDI) that have
routinely made their research products FAIR,
and demonstrated the use of other people’s data
and services to further their research
• Creation of a new social, legal and
technological infrastructure to enable and
encourage reuse of contributed digital resources
• Coordination of digital research infrastructure,
teaching and training, e-science capability, and
user support.
Key Benefits
• Increased researcher productivity in using
their and other people’s data and services
• Improved communication among
stakeholders to deploy the latest research data
management practices and data science
methodology
• Efficient coupling of institutional resources with
research funding to maximize return on
investments
@micheldumontier::S.NET:2018-06-2745
Responsible Data Science
I see data stewardship as a part of responsible data
science
It’s critical that we find the right incentives for people to
participate
• credit, funding, promotion, etc
• New insights for their own work
Automated systems should be ethical-by-design, but we
need research and eventually a common infrastructure
for this!
@micheldumontier::S.NET:2018-06-2746
Next steps
• Further development of FAIR infrastructure
• Development of training, and support for
implementation and adoption
• Measuring the impact of FAIR for research and
innovation
@micheldumontier::S.NET:2018-06-2747
Summary
• FAIR represents a global initiative to enhance the
discovery and reuse of all kinds of digital resources
• The FAIR concept is maturing quickly. Make sure that
your voice is heard.
• Huge benefits to be had, particularly in machine
processing, but needs to be coupled with the proper
incentives and ethics.
• It demands a new social, legal and technological
infrastructure that currently doesn’t exist in whole, but
has to be built for and tested by various communities!
@micheldumontier::S.NET:2018-06-2748
Acknowledgements
@micheldumontier::SEBD:2018-06-2549
FAIR
FAIR metrics
Dumontier Lab (Maastricht University, Stanford University, Carleton University)
MU: Seun Adekunle, Dorina Claessens, Pedro Hernandez Serrano, Andine Havelange, Lianne Ippel, Alexander
Malic, Kody Moodley, Stuti Nayak, Nadine Rouleaux, Claudia van open, Chang Sun, Amrapali Zaveri
SU: Sandeep Ayyar, Shima Dastgheib, Remzi Celebi, Maulik Kamdar, David Odgers, Maryam Panahiazar
CU: Alison Callahan, Jose Toledo-Cruz, Natalia Villaneuva-Rosales
michel.dumontier@maastrichtuniversity.nl
Website: http://maastrichtuniversity.nl/ids
50 @micheldumontier::S.NET:2018-06-27
The mission of the Institute of Data Science at Maastricht
University is to foster a collaborative environment for multi-
disciplinary data science research, interdisciplinary training,
and data-driven innovation .
We tackle key scientific, technical, social, legal, ethical
issues that advance our understanding and strengthen our
communities in the face of these developments.

More Related Content

What's hot

Big Data Analytics government healthcare
Big Data Analytics government healthcareBig Data Analytics government healthcare
Big Data Analytics government healthcareData Science Thailand
 
Using Neo4j 4.0 to Maintain Security with COVID-19 HealthCheck
Using Neo4j 4.0 to Maintain Security with COVID-19 HealthCheckUsing Neo4j 4.0 to Maintain Security with COVID-19 HealthCheck
Using Neo4j 4.0 to Maintain Security with COVID-19 HealthCheckNeo4j
 
Accelerating Biomedical Research with the Emerging Internet of FAIR Data and ...
Accelerating Biomedical Research with the Emerging Internet of FAIR Data and ...Accelerating Biomedical Research with the Emerging Internet of FAIR Data and ...
Accelerating Biomedical Research with the Emerging Internet of FAIR Data and ...Michel Dumontier
 
Reveal Hidden Patterns in Healthcare Data: Graph Analytics and the Opioid Crisis
Reveal Hidden Patterns in Healthcare Data: Graph Analytics and the Opioid CrisisReveal Hidden Patterns in Healthcare Data: Graph Analytics and the Opioid Crisis
Reveal Hidden Patterns in Healthcare Data: Graph Analytics and the Opioid CrisisNeo4j
 
Identifying Drug Interaction Candidates in Real-World Data
Identifying Drug Interaction Candidates in Real-World DataIdentifying Drug Interaction Candidates in Real-World Data
Identifying Drug Interaction Candidates in Real-World DataNeo4j
 
Neo4j for Healthcare & Life Sciences
Neo4j for Healthcare & Life SciencesNeo4j for Healthcare & Life Sciences
Neo4j for Healthcare & Life SciencesNeo4j
 
Blockchain and Patient-Centered Outcomes Measures - Goldwater
Blockchain and Patient-Centered Outcomes Measures - GoldwaterBlockchain and Patient-Centered Outcomes Measures - Goldwater
Blockchain and Patient-Centered Outcomes Measures - GoldwaterSean Manion PhD
 
Blockchain in Health Research Overview - Manion
Blockchain in Health Research Overview - ManionBlockchain in Health Research Overview - Manion
Blockchain in Health Research Overview - ManionSean Manion PhD
 
Pistoia Alliance conference April 2016: Big Data: Eric Little
Pistoia Alliance conference April 2016: Big Data: Eric LittlePistoia Alliance conference April 2016: Big Data: Eric Little
Pistoia Alliance conference April 2016: Big Data: Eric LittlePistoia Alliance
 
Clinical Data Models - The Hyve - Bio IT World April 2019
Clinical Data Models - The Hyve - Bio IT World April 2019Clinical Data Models - The Hyve - Bio IT World April 2019
Clinical Data Models - The Hyve - Bio IT World April 2019Kees van Bochove
 
Pistoia Alliance Debates: Moving Research Informatics into the Cloud: 25th Ma...
Pistoia Alliance Debates: Moving Research Informatics into the Cloud: 25th Ma...Pistoia Alliance Debates: Moving Research Informatics into the Cloud: 25th Ma...
Pistoia Alliance Debates: Moving Research Informatics into the Cloud: 25th Ma...Pistoia Alliance
 
Towards the Digital Research Enterprise
Towards the Digital Research EnterpriseTowards the Digital Research Enterprise
Towards the Digital Research EnterprisePhilip Bourne
 
How much is that data in the window : Healthcare data valuation
How much is that data in the window : Healthcare data valuationHow much is that data in the window : Healthcare data valuation
How much is that data in the window : Healthcare data valuationSean Manion PhD
 
Supporting the community-owned open scholarly communications ecosystem
Supporting the community-owned open scholarly communications ecosystemSupporting the community-owned open scholarly communications ecosystem
Supporting the community-owned open scholarly communications ecosystemJisc
 
Open science and medical evidence generation - Kees van Bochove - The Hyve
Open science and medical evidence generation - Kees van Bochove - The HyveOpen science and medical evidence generation - Kees van Bochove - The Hyve
Open science and medical evidence generation - Kees van Bochove - The HyveKees van Bochove
 
apidays LIVE Australia 2021 - APIs enable global collaborations and accelerat...
apidays LIVE Australia 2021 - APIs enable global collaborations and accelerat...apidays LIVE Australia 2021 - APIs enable global collaborations and accelerat...
apidays LIVE Australia 2021 - APIs enable global collaborations and accelerat...apidays
 
Blockchain Healthcare Situation Report (BC/HC SITREP) Volume 2 Issue 19, 07 -...
Blockchain Healthcare Situation Report (BC/HC SITREP) Volume 2 Issue 19, 07 -...Blockchain Healthcare Situation Report (BC/HC SITREP) Volume 2 Issue 19, 07 -...
Blockchain Healthcare Situation Report (BC/HC SITREP) Volume 2 Issue 19, 07 -...Sean Manion PhD
 
LASYR Slides IEEE event 07 APR 2021
LASYR Slides IEEE event 07 APR 2021LASYR Slides IEEE event 07 APR 2021
LASYR Slides IEEE event 07 APR 2021Sean Manion PhD
 

What's hot (20)

Evaluating FAIRness
Evaluating FAIRnessEvaluating FAIRness
Evaluating FAIRness
 
Big Data Analytics government healthcare
Big Data Analytics government healthcareBig Data Analytics government healthcare
Big Data Analytics government healthcare
 
Using Neo4j 4.0 to Maintain Security with COVID-19 HealthCheck
Using Neo4j 4.0 to Maintain Security with COVID-19 HealthCheckUsing Neo4j 4.0 to Maintain Security with COVID-19 HealthCheck
Using Neo4j 4.0 to Maintain Security with COVID-19 HealthCheck
 
Accelerating Biomedical Research with the Emerging Internet of FAIR Data and ...
Accelerating Biomedical Research with the Emerging Internet of FAIR Data and ...Accelerating Biomedical Research with the Emerging Internet of FAIR Data and ...
Accelerating Biomedical Research with the Emerging Internet of FAIR Data and ...
 
Reveal Hidden Patterns in Healthcare Data: Graph Analytics and the Opioid Crisis
Reveal Hidden Patterns in Healthcare Data: Graph Analytics and the Opioid CrisisReveal Hidden Patterns in Healthcare Data: Graph Analytics and the Opioid Crisis
Reveal Hidden Patterns in Healthcare Data: Graph Analytics and the Opioid Crisis
 
Identifying Drug Interaction Candidates in Real-World Data
Identifying Drug Interaction Candidates in Real-World DataIdentifying Drug Interaction Candidates in Real-World Data
Identifying Drug Interaction Candidates in Real-World Data
 
Neo4j for Healthcare & Life Sciences
Neo4j for Healthcare & Life SciencesNeo4j for Healthcare & Life Sciences
Neo4j for Healthcare & Life Sciences
 
Innovative project1
Innovative project1Innovative project1
Innovative project1
 
Blockchain and Patient-Centered Outcomes Measures - Goldwater
Blockchain and Patient-Centered Outcomes Measures - GoldwaterBlockchain and Patient-Centered Outcomes Measures - Goldwater
Blockchain and Patient-Centered Outcomes Measures - Goldwater
 
Blockchain in Health Research Overview - Manion
Blockchain in Health Research Overview - ManionBlockchain in Health Research Overview - Manion
Blockchain in Health Research Overview - Manion
 
Pistoia Alliance conference April 2016: Big Data: Eric Little
Pistoia Alliance conference April 2016: Big Data: Eric LittlePistoia Alliance conference April 2016: Big Data: Eric Little
Pistoia Alliance conference April 2016: Big Data: Eric Little
 
Clinical Data Models - The Hyve - Bio IT World April 2019
Clinical Data Models - The Hyve - Bio IT World April 2019Clinical Data Models - The Hyve - Bio IT World April 2019
Clinical Data Models - The Hyve - Bio IT World April 2019
 
Pistoia Alliance Debates: Moving Research Informatics into the Cloud: 25th Ma...
Pistoia Alliance Debates: Moving Research Informatics into the Cloud: 25th Ma...Pistoia Alliance Debates: Moving Research Informatics into the Cloud: 25th Ma...
Pistoia Alliance Debates: Moving Research Informatics into the Cloud: 25th Ma...
 
Towards the Digital Research Enterprise
Towards the Digital Research EnterpriseTowards the Digital Research Enterprise
Towards the Digital Research Enterprise
 
How much is that data in the window : Healthcare data valuation
How much is that data in the window : Healthcare data valuationHow much is that data in the window : Healthcare data valuation
How much is that data in the window : Healthcare data valuation
 
Supporting the community-owned open scholarly communications ecosystem
Supporting the community-owned open scholarly communications ecosystemSupporting the community-owned open scholarly communications ecosystem
Supporting the community-owned open scholarly communications ecosystem
 
Open science and medical evidence generation - Kees van Bochove - The Hyve
Open science and medical evidence generation - Kees van Bochove - The HyveOpen science and medical evidence generation - Kees van Bochove - The Hyve
Open science and medical evidence generation - Kees van Bochove - The Hyve
 
apidays LIVE Australia 2021 - APIs enable global collaborations and accelerat...
apidays LIVE Australia 2021 - APIs enable global collaborations and accelerat...apidays LIVE Australia 2021 - APIs enable global collaborations and accelerat...
apidays LIVE Australia 2021 - APIs enable global collaborations and accelerat...
 
Blockchain Healthcare Situation Report (BC/HC SITREP) Volume 2 Issue 19, 07 -...
Blockchain Healthcare Situation Report (BC/HC SITREP) Volume 2 Issue 19, 07 -...Blockchain Healthcare Situation Report (BC/HC SITREP) Volume 2 Issue 19, 07 -...
Blockchain Healthcare Situation Report (BC/HC SITREP) Volume 2 Issue 19, 07 -...
 
LASYR Slides IEEE event 07 APR 2021
LASYR Slides IEEE event 07 APR 2021LASYR Slides IEEE event 07 APR 2021
LASYR Slides IEEE event 07 APR 2021
 

Similar to The Future of FAIR Data: An international social, legal and technological infrastructure for data discovery and reuse by humans and machines

Accelerating biomedical discovery with an Internet of FAIR data and services ...
Accelerating biomedical discovery with an Internet of FAIR data and services ...Accelerating biomedical discovery with an Internet of FAIR data and services ...
Accelerating biomedical discovery with an Internet of FAIR data and services ...Platform Linked Data Netherlands (PLDN)
 
IRJET- Scope of Big Data Analytics in Industrial Domain
IRJET- Scope of Big Data Analytics in Industrial DomainIRJET- Scope of Big Data Analytics in Industrial Domain
IRJET- Scope of Big Data Analytics in Industrial DomainIRJET Journal
 
Acclerating biomedical discovery with an internet of FAIR data and services -...
Acclerating biomedical discovery with an internet of FAIR data and services -...Acclerating biomedical discovery with an internet of FAIR data and services -...
Acclerating biomedical discovery with an internet of FAIR data and services -...Michel Dumontier
 
KIT-601-L-UNIT-1 (Revised) Introduction to Data Analytcs.pdf
KIT-601-L-UNIT-1 (Revised) Introduction to Data Analytcs.pdfKIT-601-L-UNIT-1 (Revised) Introduction to Data Analytcs.pdf
KIT-601-L-UNIT-1 (Revised) Introduction to Data Analytcs.pdfDr. Radhey Shyam
 
FAIR principles and metrics for evaluation
FAIR principles and metrics for evaluationFAIR principles and metrics for evaluation
FAIR principles and metrics for evaluationMichel Dumontier
 
Introduction to Data Analytics and data analytics life cycle
Introduction to Data Analytics and data analytics life cycleIntroduction to Data Analytics and data analytics life cycle
Introduction to Data Analytics and data analytics life cycleDr. Radhey Shyam
 
IRJET- A Survey on Big Data Frameworks and Approaches in Health Care Sector
IRJET- A Survey on Big Data Frameworks and Approaches in Health Care SectorIRJET- A Survey on Big Data Frameworks and Approaches in Health Care Sector
IRJET- A Survey on Big Data Frameworks and Approaches in Health Care SectorIRJET Journal
 
Towards metrics to assess and encourage FAIRness
Towards metrics to assess and encourage FAIRnessTowards metrics to assess and encourage FAIRness
Towards metrics to assess and encourage FAIRnessMichel Dumontier
 
Governance of trustworthy AI
Governance of trustworthy AIGovernance of trustworthy AI
Governance of trustworthy AIsamossummit
 
Big data analytics and its impact on internet users
Big data analytics and its impact on internet usersBig data analytics and its impact on internet users
Big data analytics and its impact on internet usersStruggler Ever
 
A Survey on Big Data Analytics
A Survey on Big Data AnalyticsA Survey on Big Data Analytics
A Survey on Big Data AnalyticsBHARATH KUMAR
 
CTO Perspectives: What's Next for Data Management and Healthcare?
CTO Perspectives: What's Next for Data Management and Healthcare?CTO Perspectives: What's Next for Data Management and Healthcare?
CTO Perspectives: What's Next for Data Management and Healthcare?Health Catalyst
 
Joint Pistoia Alliance & PRISME AI in pharma webinar 18 Oct 2018
Joint Pistoia Alliance & PRISME AI in pharma webinar 18 Oct 2018Joint Pistoia Alliance & PRISME AI in pharma webinar 18 Oct 2018
Joint Pistoia Alliance & PRISME AI in pharma webinar 18 Oct 2018Pistoia Alliance
 
IRJET- Big Data Management and Growth Enhancement
IRJET- Big Data Management and Growth EnhancementIRJET- Big Data Management and Growth Enhancement
IRJET- Big Data Management and Growth EnhancementIRJET Journal
 
IRJET- A Survey on Mining of Tweeter Data for Predicting User Behavior
IRJET- A Survey on Mining of Tweeter Data for Predicting User BehaviorIRJET- A Survey on Mining of Tweeter Data for Predicting User Behavior
IRJET- A Survey on Mining of Tweeter Data for Predicting User BehaviorIRJET Journal
 

Similar to The Future of FAIR Data: An international social, legal and technological infrastructure for data discovery and reuse by humans and machines (20)

Accelerating biomedical discovery with an Internet of FAIR data and services ...
Accelerating biomedical discovery with an Internet of FAIR data and services ...Accelerating biomedical discovery with an Internet of FAIR data and services ...
Accelerating biomedical discovery with an Internet of FAIR data and services ...
 
IRJET- Scope of Big Data Analytics in Industrial Domain
IRJET- Scope of Big Data Analytics in Industrial DomainIRJET- Scope of Big Data Analytics in Industrial Domain
IRJET- Scope of Big Data Analytics in Industrial Domain
 
Acclerating biomedical discovery with an internet of FAIR data and services -...
Acclerating biomedical discovery with an internet of FAIR data and services -...Acclerating biomedical discovery with an internet of FAIR data and services -...
Acclerating biomedical discovery with an internet of FAIR data and services -...
 
KIT-601-L-UNIT-1 (Revised) Introduction to Data Analytcs.pdf
KIT-601-L-UNIT-1 (Revised) Introduction to Data Analytcs.pdfKIT-601-L-UNIT-1 (Revised) Introduction to Data Analytcs.pdf
KIT-601-L-UNIT-1 (Revised) Introduction to Data Analytcs.pdf
 
FAIR principles and metrics for evaluation
FAIR principles and metrics for evaluationFAIR principles and metrics for evaluation
FAIR principles and metrics for evaluation
 
Introduction to Data Analytics and data analytics life cycle
Introduction to Data Analytics and data analytics life cycleIntroduction to Data Analytics and data analytics life cycle
Introduction to Data Analytics and data analytics life cycle
 
IRJET- A Survey on Big Data Frameworks and Approaches in Health Care Sector
IRJET- A Survey on Big Data Frameworks and Approaches in Health Care SectorIRJET- A Survey on Big Data Frameworks and Approaches in Health Care Sector
IRJET- A Survey on Big Data Frameworks and Approaches in Health Care Sector
 
Towards metrics to assess and encourage FAIRness
Towards metrics to assess and encourage FAIRnessTowards metrics to assess and encourage FAIRness
Towards metrics to assess and encourage FAIRness
 
Big data
Big dataBig data
Big data
 
Big Data Analytics (1).ppt
Big Data Analytics (1).pptBig Data Analytics (1).ppt
Big Data Analytics (1).ppt
 
Governance of trustworthy AI
Governance of trustworthy AIGovernance of trustworthy AI
Governance of trustworthy AI
 
Big data analytics and its impact on internet users
Big data analytics and its impact on internet usersBig data analytics and its impact on internet users
Big data analytics and its impact on internet users
 
A Survey on Big Data Analytics
A Survey on Big Data AnalyticsA Survey on Big Data Analytics
A Survey on Big Data Analytics
 
CTO Perspectives: What's Next for Data Management and Healthcare?
CTO Perspectives: What's Next for Data Management and Healthcare?CTO Perspectives: What's Next for Data Management and Healthcare?
CTO Perspectives: What's Next for Data Management and Healthcare?
 
Joint Pistoia Alliance & PRISME AI in pharma webinar 18 Oct 2018
Joint Pistoia Alliance & PRISME AI in pharma webinar 18 Oct 2018Joint Pistoia Alliance & PRISME AI in pharma webinar 18 Oct 2018
Joint Pistoia Alliance & PRISME AI in pharma webinar 18 Oct 2018
 
Complete-SRS.doc
Complete-SRS.docComplete-SRS.doc
Complete-SRS.doc
 
IRJET- Big Data Management and Growth Enhancement
IRJET- Big Data Management and Growth EnhancementIRJET- Big Data Management and Growth Enhancement
IRJET- Big Data Management and Growth Enhancement
 
Applications of Big Data
Applications of Big DataApplications of Big Data
Applications of Big Data
 
IRJET- A Survey on Mining of Tweeter Data for Predicting User Behavior
IRJET- A Survey on Mining of Tweeter Data for Predicting User BehaviorIRJET- A Survey on Mining of Tweeter Data for Predicting User Behavior
IRJET- A Survey on Mining of Tweeter Data for Predicting User Behavior
 
ppt1.pptx
ppt1.pptxppt1.pptx
ppt1.pptx
 

More from Michel Dumontier

A metadata standard for Knowledge Graphs
A metadata standard for Knowledge GraphsA metadata standard for Knowledge Graphs
A metadata standard for Knowledge GraphsMichel Dumontier
 
Data-Driven Discovery Science with FAIR Knowledge Graphs
Data-Driven Discovery Science with FAIR Knowledge GraphsData-Driven Discovery Science with FAIR Knowledge Graphs
Data-Driven Discovery Science with FAIR Knowledge GraphsMichel Dumontier
 
Keynote at the 2018 Maastricht University Dinner
Keynote at the 2018 Maastricht University DinnerKeynote at the 2018 Maastricht University Dinner
Keynote at the 2018 Maastricht University DinnerMichel Dumontier
 
The future of science and business - a UM Star Lecture
The future of science and business - a UM Star LectureThe future of science and business - a UM Star Lecture
The future of science and business - a UM Star LectureMichel Dumontier
 
Building a Network of Interoperable and Independently Produced Linked and Ope...
Building a Network of Interoperable and Independently Produced Linked and Ope...Building a Network of Interoperable and Independently Produced Linked and Ope...
Building a Network of Interoperable and Independently Produced Linked and Ope...Michel Dumontier
 
Model Organism Linked Data
Model Organism Linked DataModel Organism Linked Data
Model Organism Linked DataMichel Dumontier
 
2016 ACS Semantic Approaches for Biochemical Knowledge Discovery
2016 ACS Semantic Approaches for Biochemical Knowledge Discovery2016 ACS Semantic Approaches for Biochemical Knowledge Discovery
2016 ACS Semantic Approaches for Biochemical Knowledge DiscoveryMichel Dumontier
 
Making it Easier, Possibly Even Pleasant, to Author Rich Experimental Metadata
Making it Easier, Possibly Even Pleasant, to Author Rich Experimental MetadataMaking it Easier, Possibly Even Pleasant, to Author Rich Experimental Metadata
Making it Easier, Possibly Even Pleasant, to Author Rich Experimental MetadataMichel Dumontier
 
Link Analysis of Life Sciences Linked Data
Link Analysis of Life Sciences Linked DataLink Analysis of Life Sciences Linked Data
Link Analysis of Life Sciences Linked DataMichel Dumontier
 
Making the most of phenotypes in ontology-based biomedical knowledge discovery
Making the most of phenotypes in ontology-based biomedical knowledge discoveryMaking the most of phenotypes in ontology-based biomedical knowledge discovery
Making the most of phenotypes in ontology-based biomedical knowledge discoveryMichel Dumontier
 
W3C HCLS Dataset Description Guidelines
W3C HCLS Dataset Description GuidelinesW3C HCLS Dataset Description Guidelines
W3C HCLS Dataset Description GuidelinesMichel Dumontier
 
Semantic approaches for biomedical knowledge discovery - Discovery Science 20...
Semantic approaches for biomedical knowledge discovery - Discovery Science 20...Semantic approaches for biomedical knowledge discovery - Discovery Science 20...
Semantic approaches for biomedical knowledge discovery - Discovery Science 20...Michel Dumontier
 
1st Network-of-BioThings Hackathon
1st Network-of-BioThings Hackathon1st Network-of-BioThings Hackathon
1st Network-of-BioThings HackathonMichel Dumontier
 
Powering Scientific Discovery with the Semantic Web (VanBUG 2014)
Powering Scientific Discovery with the Semantic Web (VanBUG 2014)Powering Scientific Discovery with the Semantic Web (VanBUG 2014)
Powering Scientific Discovery with the Semantic Web (VanBUG 2014)Michel Dumontier
 

More from Michel Dumontier (17)

A metadata standard for Knowledge Graphs
A metadata standard for Knowledge GraphsA metadata standard for Knowledge Graphs
A metadata standard for Knowledge Graphs
 
Data-Driven Discovery Science with FAIR Knowledge Graphs
Data-Driven Discovery Science with FAIR Knowledge GraphsData-Driven Discovery Science with FAIR Knowledge Graphs
Data-Driven Discovery Science with FAIR Knowledge Graphs
 
Keynote at the 2018 Maastricht University Dinner
Keynote at the 2018 Maastricht University DinnerKeynote at the 2018 Maastricht University Dinner
Keynote at the 2018 Maastricht University Dinner
 
The future of science and business - a UM Star Lecture
The future of science and business - a UM Star LectureThe future of science and business - a UM Star Lecture
The future of science and business - a UM Star Lecture
 
Data Science for the Win
Data Science for the WinData Science for the Win
Data Science for the Win
 
2016 bmdid-mappings
2016 bmdid-mappings2016 bmdid-mappings
2016 bmdid-mappings
 
Ontologies
OntologiesOntologies
Ontologies
 
Building a Network of Interoperable and Independently Produced Linked and Ope...
Building a Network of Interoperable and Independently Produced Linked and Ope...Building a Network of Interoperable and Independently Produced Linked and Ope...
Building a Network of Interoperable and Independently Produced Linked and Ope...
 
Model Organism Linked Data
Model Organism Linked DataModel Organism Linked Data
Model Organism Linked Data
 
2016 ACS Semantic Approaches for Biochemical Knowledge Discovery
2016 ACS Semantic Approaches for Biochemical Knowledge Discovery2016 ACS Semantic Approaches for Biochemical Knowledge Discovery
2016 ACS Semantic Approaches for Biochemical Knowledge Discovery
 
Making it Easier, Possibly Even Pleasant, to Author Rich Experimental Metadata
Making it Easier, Possibly Even Pleasant, to Author Rich Experimental MetadataMaking it Easier, Possibly Even Pleasant, to Author Rich Experimental Metadata
Making it Easier, Possibly Even Pleasant, to Author Rich Experimental Metadata
 
Link Analysis of Life Sciences Linked Data
Link Analysis of Life Sciences Linked DataLink Analysis of Life Sciences Linked Data
Link Analysis of Life Sciences Linked Data
 
Making the most of phenotypes in ontology-based biomedical knowledge discovery
Making the most of phenotypes in ontology-based biomedical knowledge discoveryMaking the most of phenotypes in ontology-based biomedical knowledge discovery
Making the most of phenotypes in ontology-based biomedical knowledge discovery
 
W3C HCLS Dataset Description Guidelines
W3C HCLS Dataset Description GuidelinesW3C HCLS Dataset Description Guidelines
W3C HCLS Dataset Description Guidelines
 
Semantic approaches for biomedical knowledge discovery - Discovery Science 20...
Semantic approaches for biomedical knowledge discovery - Discovery Science 20...Semantic approaches for biomedical knowledge discovery - Discovery Science 20...
Semantic approaches for biomedical knowledge discovery - Discovery Science 20...
 
1st Network-of-BioThings Hackathon
1st Network-of-BioThings Hackathon1st Network-of-BioThings Hackathon
1st Network-of-BioThings Hackathon
 
Powering Scientific Discovery with the Semantic Web (VanBUG 2014)
Powering Scientific Discovery with the Semantic Web (VanBUG 2014)Powering Scientific Discovery with the Semantic Web (VanBUG 2014)
Powering Scientific Discovery with the Semantic Web (VanBUG 2014)
 

Recently uploaded

How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity PlanDatabarracks
 
SAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxSAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxNavinnSomaal
 
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdfHyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdfPrecisely
 
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLScyllaDB
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Mark Simos
 
SALESFORCE EDUCATION CLOUD | FEXLE SERVICES
SALESFORCE EDUCATION CLOUD | FEXLE SERVICESSALESFORCE EDUCATION CLOUD | FEXLE SERVICES
SALESFORCE EDUCATION CLOUD | FEXLE SERVICESmohitsingh558521
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupFlorian Wilhelm
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyAlfredo García Lavilla
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Mattias Andersson
 
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxA Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxLoriGlavin3
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenHervé Boutemy
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsSergiu Bodiu
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Commit University
 
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024BookNet Canada
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationSlibray Presentation
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024Lorenzo Miniero
 
Advanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionAdvanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionDilum Bandara
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brandgvaughan
 

Recently uploaded (20)

How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity Plan
 
SAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxSAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptx
 
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdfHyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
 
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQL
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
 
SALESFORCE EDUCATION CLOUD | FEXLE SERVICES
SALESFORCE EDUCATION CLOUD | FEXLE SERVICESSALESFORCE EDUCATION CLOUD | FEXLE SERVICES
SALESFORCE EDUCATION CLOUD | FEXLE SERVICES
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project Setup
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easy
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?
 
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxA Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
 
DMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special EditionDMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special Edition
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache Maven
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platforms
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!
 
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck Presentation
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024
 
Advanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionAdvanced Computer Architecture – An Introduction
Advanced Computer Architecture – An Introduction
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brand
 

The Future of FAIR Data: An international social, legal and technological infrastructure for data discovery and reuse by humans and machines

  • 1. The Future of FAIR Data: An international social, legal and technological infrastructure for data discovery and reuse by humans and machines @micheldumontier::S.NET:2018-06-271 Michel Dumontier, Ph.D. Distinguished Professor of Data Science Director, Institute of Data Science
  • 2. An increasing number of discoveries are made using other people’s data @micheldumontier::S.NET:2018-06-272
  • 3. 3 A common rejection module (CRM) for acute rejection across multiple organs identifies novel therapeutics for organ transplantation Khatri et al. JEM. 210 (11): 2205 DOI: 10.1084/jem.20122709 @micheldumontier::S.NET:2018-06-27 1. Based on gene expression data that is free to use 2. CRM genes correlated with the extent of graft injury and predicted future injury to a graft 3. Mice treated with drugs against the CRM genes extended graft survival
  • 4. However, significant effort was needed to find the right datasets, make sense of them, and ultimately use them for a new purpose @micheldumontier::S.NET:2018-06-274
  • 5. Poor quality (meta)data hampers research
  • 6. @micheldumontier::S.NET:2018-06-276 Lambin et al. Radiother Oncol. 2013. 109(1):159-64. doi: 10.1016/j.radonc.2013.07.007
  • 7. If we are ever to realize the full potential of content we create then we must find ways to reduce the barrier to publish, find and reuse their content in a responsible manner @micheldumontier::S.NET:2018-06-277
  • 8. Why does this matter? @micheldumontier::S.NET:2018-06-278
  • 9. 9 @micheldumontier::S.NET:2018-06-27 Most published research findings are false. - John Ioannidis, Stanford University Reproducibility of landmark studies is shockingly low: 39% (39/100) in psychological studies1 21% (14/67) in pharmacological studies2 11% (6/53) in cancer studies3 PLoS Med 2005;2(8): e124. 1doi:10.1038/nature.2015.17433 2doi:10.1038/nrd3439-c1 3doi:10.1038/483531a
  • 10. @micheldumontier::S.NET:2018-06-2710 Published online 28 September 2011 | Nature 477, 526-528 (2011) | doi:10.1038/477526a
  • 11. @micheldumontier::S.NET:2018-06-2711 we need new ways to think about discovery science We need to improve our confidence in an result by using more data and by using multiple lines of evidence
  • 12. Grand Challenge: Automatically uncover evidence that supports and disputes a hypothesis using the totality of available data, tools and scientific knowledge @micheldumontier::S.NET:2018-06-2712
  • 13. We must build a social, legal and technological infrastructure that facilitates the discovery and reuse of digital resources for people and machines @micheldumontier::S.NET:2018-06-2713
  • 14. Why machines? • Can make sense of vast amounts of information to make personalized, evidence- based decisions to maximize desired outcomes @micheldumontier::S.NET:2018-06-2714
  • 15. Big Data for Medicine @micheldumontier::S.NET:2018-06-2715 Multiple sources of heterogeneous data, including experimental evidence, bioinformatics databases, lifestyle measurements, electronic health records, environmental influences, and biobank findings, can be incorporated together using machine learning algorithms to identify causal disease networks, stratify patients, and ultimately predict more efficacious therapies.
  • 16. Why machines? • Can make sense of vast amounts of information to make personalized, evidence- based decisions to maximize desired outcomes • Can identify and minimize bias in research and in real world applications in a systematic and robust manner • Will eventually be able to independently contribute to all aspects of human interest @micheldumontier::S.NET:2018-06-2716
  • 18. AI Generated Music @micheldumontier::S.NET:2018-06-27 Performance RNN https://magenta.tensorflow.org/performance-rnn 18
  • 21. Principles to enhance the value of all digital resources data, images, software, web services, repositories,… Developed and endorsed by researchers, publishers, funding agencies, industry partners. @micheldumontier::S.NET:2018-06-2721
  • 23. @micheldumontier::S.NET:2018-06-2723 http://www.nature.com/articles/sdata201618 Many of these citations are communities discussing what this means for them June 2018
  • 24. @micheldumontier::S.NET:2018-06-2724 4 Principles (F,A,I,R) and 15 sub-principles. http://www.nature.com/articles/sdata201618
  • 25. FAIR Principles - summarized Findable • Globally unique, resolvable, and persistent identifiers • Machine-readable descriptions to support structured search Accessible • Clearly defined access and security protocols • Metadata is always accessible beyond the lifetime of the digital resource Interoperable • Extensible machine interpretable formats for data + metadata • Use FAIR vocabularies and link to other resources Reusable • Provide licensing, provenance, and use community-standards @micheldumontier::S.NET:2018-06-2725
  • 26. Improving the FAIRness of digital resources will increase their quality and their potential and ease for reuse. @micheldumontier::S.NET:2018-06-2726
  • 27. What is FAIRness? FAIRness reflects the extent to which a digital resource addresses the FAIR principles as per the expectations defined by a community. @micheldumontier::S.NET:2018-06-2727
  • 28. How it might look at DANS @micheldumontier::S.NET:2018-06-2728
  • 30. Communities, such as yours, must make clear your expectations @micheldumontier::S.NET:2018-06-2730
  • 31. Measuring FAIRness • A metric is a standard of measurement. • It must provide clear definition of what is being measured, why one wants to measure it. • It must describe what a valid result is and how one obtains it, so that it can be reproduced by others. @micheldumontier::S.NET:2018-06-2731
  • 32. Qualities of a Good Metric • Clear: anyone can understand the purpose of the metric • Realistic: compliance should not be unduly complicated • Discriminating: the measure can distinguish between those resources that meet the criteria and those that do not • Measurable: the assessment can be made in an objective, quantitative, machine-interpretable, scalable and reproducible manner • Universal: The metric should be applicable to all digital resources @micheldumontier::S.NET:2018-06-2732
  • 33. • 14 universal metrics covering each of the FAIR sub-principles. The metrics demand evidence from the community, some of which may require specific new actions. • Digital resource providers must provide a web-accessible document with machine-readable metadata (FM-F2, FM-F3), detail identifier management (FM-F1B), metadata longevity (FM-A2), and any additional authorization procedures (FM-A1.2). • They must ensure the public registration of their identifier schemes (FM- F1A), (secure) access protocols (FM-A1.1), knowledge representation languages (FM-I1), licenses (FM-R1.1), provenance specifications (FM- R1.2), and community standards (FM-R1.3). • They must provide evidence of ability to find the digital resource in search results (FM-F4), linking to other resources (FM-I3), FAIRness of linked resources (FM-I2), and meeting community standards (FM-R1.3) @micheldumontier::S.NET:2018-06-2733
  • 35. http://hw-swel.github.io/Validata/ @micheldumontier::S.NET:2018-06-2735 RDF constraint validation tool Configurable to any profile Declarative reusable schema description Shape Expression (ShEx) constraints Open source javascript implementation
  • 36. A first assessment using the metrics • Used a simple form to ask for the information needed as input to the FAIR metrics • Questions either require one or more URL or true/false @micheldumontier::S.NET:2018-06-2736
  • 40. Automated assessments are (currently) unforgiving @micheldumontier::S.NET:2018-06-2740
  • 43. H2020 EG: Turning FAIR Data into Reality - Report and Action Plan Consultation Recommendations include: • Sustainable Funding for FAIR components (#5) • Strategic and evidence-based funding (#6) • Cross-Disciplinary FAIRness (#8) • Encourage and incentivise data reuse (#19) • Facilitate automated processing (#25) @micheldumontier::S.NET:2018-06-2743 Consultation until 5 August Consultation is being conducted on the interim report and Action Plan until 5 August 2018. A commentable version of the report is available on Google Drive. Structured comments on the Action Plan and specific recommendations and actions may be made via a dedicated GitHub repository. · Comment on Report: http://bit.ly/interim_FAIR_report · Comment on Action Plan: https://github.com/FAIR-Data-EG/action-plan
  • 44. Analyzing partitioned FAIR health data responsibly Maastricht Study + MUMC CBS Key Aspects: • Goal: Enable the use of federated machine learning to uncover positive and negative determinants of type 2 diabetes using FAIR data from both the Maastricht Study and Statistics Netherlands. • Establish a broader and scalable governance structure and consent define and underpin the responsible use of Big Data in Health. • Create a new social, legal, and technological infrastructure for discovery science in and across a health and non-health settings, with many future uses @micheldumontier::S.NET:2018-06-2744
  • 45. Maastricht University will become the first ‘FAIR’ university By 2023 • Launch of a one-stop shop that will educate and provide support for FAIR-compliant data management and responsible data science • Establishment of an cross-faculty Community of Data-Driven Insights (CDDI) that have routinely made their research products FAIR, and demonstrated the use of other people’s data and services to further their research • Creation of a new social, legal and technological infrastructure to enable and encourage reuse of contributed digital resources • Coordination of digital research infrastructure, teaching and training, e-science capability, and user support. Key Benefits • Increased researcher productivity in using their and other people’s data and services • Improved communication among stakeholders to deploy the latest research data management practices and data science methodology • Efficient coupling of institutional resources with research funding to maximize return on investments @micheldumontier::S.NET:2018-06-2745
  • 46. Responsible Data Science I see data stewardship as a part of responsible data science It’s critical that we find the right incentives for people to participate • credit, funding, promotion, etc • New insights for their own work Automated systems should be ethical-by-design, but we need research and eventually a common infrastructure for this! @micheldumontier::S.NET:2018-06-2746
  • 47. Next steps • Further development of FAIR infrastructure • Development of training, and support for implementation and adoption • Measuring the impact of FAIR for research and innovation @micheldumontier::S.NET:2018-06-2747
  • 48. Summary • FAIR represents a global initiative to enhance the discovery and reuse of all kinds of digital resources • The FAIR concept is maturing quickly. Make sure that your voice is heard. • Huge benefits to be had, particularly in machine processing, but needs to be coupled with the proper incentives and ethics. • It demands a new social, legal and technological infrastructure that currently doesn’t exist in whole, but has to be built for and tested by various communities! @micheldumontier::S.NET:2018-06-2748
  • 49. Acknowledgements @micheldumontier::SEBD:2018-06-2549 FAIR FAIR metrics Dumontier Lab (Maastricht University, Stanford University, Carleton University) MU: Seun Adekunle, Dorina Claessens, Pedro Hernandez Serrano, Andine Havelange, Lianne Ippel, Alexander Malic, Kody Moodley, Stuti Nayak, Nadine Rouleaux, Claudia van open, Chang Sun, Amrapali Zaveri SU: Sandeep Ayyar, Shima Dastgheib, Remzi Celebi, Maulik Kamdar, David Odgers, Maryam Panahiazar CU: Alison Callahan, Jose Toledo-Cruz, Natalia Villaneuva-Rosales
  • 50. michel.dumontier@maastrichtuniversity.nl Website: http://maastrichtuniversity.nl/ids 50 @micheldumontier::S.NET:2018-06-27 The mission of the Institute of Data Science at Maastricht University is to foster a collaborative environment for multi- disciplinary data science research, interdisciplinary training, and data-driven innovation . We tackle key scientific, technical, social, legal, ethical issues that advance our understanding and strengthen our communities in the face of these developments.

Editor's Notes

  1. Abstract Using meta-analysis of eight independent transplant datasets (236 graft biopsy samples) from four organs, we identified a common rejection module (CRM) consisting of 11 genes that were significantly overexpressed in acute rejection (AR) across all transplanted organs. The CRM genes could diagnose AR with high specificity and sensitivity in three additional independent cohorts (794 samples). In another two independent cohorts (151 renal transplant biopsies), the CRM genes correlated with the extent of graft injury and predicted future injury to a graft using protocol biopsies. Inferred drug mechanisms from the literature suggested that two FDA-approved drugs (atorvastatin and dasatinib), approved for nontransplant indications, could regulate specific CRM genes and reduce the number of graft-infiltrating cells during AR. We treated mice with HLA-mismatched mouse cardiac transplant with atorvastatin and dasatinib and showed reduction of the CRM genes, significant reduction of graft-infiltrating cells, and extended graft survival. We further validated the beneficial effect of atorvastatin on graft survival by retrospective analysis of electronic medical records of a single-center cohort of 2,515 renal transplant patients followed for up to 22 yr. In conclusion, we identified a CRM in transplantation that provides new opportunities for diagnosis, drug repositioning, and rational drug design.
  2. G20: http://europa.eu/rapid/press-release_STATEMENT-16-2967_en.htm EOSC: https://ec.europa.eu/research/openscience/pdf/realising_the_european_open_science_cloud_2016.pdf H2020: https://goo.gl/Strjua
  3. G20: http://europa.eu/rapid/press-release_STATEMENT-16-2967_en.htm EOSC: https://ec.europa.eu/research/openscience/pdf/realising_the_european_open_science_cloud_2016.pdf H2020: https://goo.gl/Strjua