Edge Informatics is an approach to accelerating collaboration in the BioPharma pipeline. By combining technical and social solutions, knowledge can be shared and leveraged across the multiple internal and external silos participating in the drug development process. This is accomplished by making data assets findable, accessible, interoperable and reusable (FAIR). Public consortia and internal efforts embracing FAIR data and Edge Informatics are highlighted in both preclinical and clinical domains.
This talk was presented at the Molecular Medicine Tri-Conference in San Francisco, CA on February 20, 2017.
Edge Informatics and FAIR (Findable, Accessible, Interoperable and Reusable) Data
1. Edge Informatics and FAIR* Data
Tom Plasterer, PhD
Research & Development Information (RDI); US Cross-Science Director
20 February 2017
Integrated Pharma Informatics
* Findable, Accessible, Interoperable and Reusable
2. The right data is there when I need it
Your data and my data are mutually understandable
Our data can be effortlessly combined
I am permitted to use any data I can access
Data can be reshaped for a different purpose
Data sharing is rewarded
‘I’ can be a human or a machine
We Want Data Nirvana!
7. FAIR Data: Overview
To be Findable:
• Globally unique, resolvable and persistent identifiers
• Machine-actionable contextual information supporting discovery
To be Accessible:
• Clearly defined access protocol
• Clearly defined rules for authorization/authentication
To be Interoperable:
• Use shared vocabularies and/or ontologies
• Syntactically and semantically machine-accessible format
To be Reusable:
• Be compliant with the F, A and I Principles
• Contextual information, allowing proper interpretation
• Rich provenance information facilitating accurate citation
Source: Mark Wilkinson, Data Interoperability and FAIRness Through Existing Web Technologies
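The checklist above can be read as a set of machine-testable conditions on a dataset's metadata. The following is a minimal sketch under stated assumptions: the `DatasetRecord` type and all of its field names are hypothetical, chosen for illustration, and are not part of the FAIR principles or of any standard schema.

```python
from dataclasses import dataclass, field


@dataclass
class DatasetRecord:
    """Hypothetical metadata record; every field name is an illustrative assumption."""
    identifier: str = ""        # globally unique, resolvable, persistent identifier
    description: str = ""       # contextual information supporting discovery
    access_protocol: str = ""   # clearly defined access protocol, e.g. "https"
    auth_rules: str = ""        # rules for authorization/authentication
    vocabularies: list = field(default_factory=list)  # shared vocabularies/ontologies
    data_format: str = ""       # machine-accessible format, e.g. "text/turtle"
    provenance: str = ""        # provenance information supporting citation


def fair_report(rec: DatasetRecord) -> dict:
    """Return a pass/fail flag per FAIR letter for one metadata record."""
    findable = rec.identifier.startswith(("http://", "https://")) and bool(rec.description)
    accessible = bool(rec.access_protocol) and bool(rec.auth_rules)
    interoperable = bool(rec.vocabularies) and bool(rec.data_format)
    # Reusable requires compliance with F, A and I, plus provenance.
    reusable = findable and accessible and interoperable and bool(rec.provenance)
    return {"F": findable, "A": accessible, "I": interoperable, "R": reusable}
```

A check like this only tests that the metadata fields are present. Whether an identifier actually resolves or a vocabulary is genuinely shared still needs human or network verification.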
8. FAIR Data: A Brief History
Moving away from Narrative
• Nanopublications
Incubating Standards in Open PHACTS
• VoID, PROV-O
Lorentz Center Workshop
• FORCE 11 FAIR Guiding Principles
• Participants: IMI members, US researchers, content providers, ELIXIR, European Open Science Cloud, Big Data to Knowledge (BD2K)
Current Status:
• FAIR Data Workshops (EU-ELIXIR nodes)
• Inclusion in Horizon 2020, NIH Advocacy
• IMI2 Data FAIR-ification Call
• Vendors getting up to speed
9. FAIR Data: Systems Biology Survey
Molecular Systems Biology
Volume 11, Issue 12, 28 DEC 2015 DOI: 10.15252/msb.20156053
http://onlinelibrary.wiley.com/doi/10.15252/msb.20156053/full#msb156053-fig-0001
10. FAIR Data & Biopharma?
Collaborative & Competitive Intelligence:
• Who do we want to partner with? Are there complementary assets to our portfolio?
• What space is too crowded and not our area of expertise?
• Greenfield situations?
Mergers, Acquisitions, Partnerships:
• How do we efficiently and deeply absorb data generated elsewhere into our systems? How
do we efficiently share?
• Does this make a smaller biotech/start-up a more viable partner?
Improved Patient Care:
• Can we share data and outcomes more efficiently in complicated trial settings (basket trials,
adaptive trials) to better engage opinion leaders and foster dialog?
• Along with Differential Privacy approaches, can we have the broader research community
help mine our data?
Data (Ir)-reproducibility:
• Is preclinical data reproducible?
• Can we utilize data credentialization? (thanks to Dan Crowther @ Sanofi)
11. Differential Privacy (DP): Clinical Data Anonymization
• A quantifiable method for anonymizing data by modifying data fields identified
as those that can aid in the identification of individuals.
• Adopted by large corporations like Apple and Google to protect the privacy of users of their services.
AZ Differential Privacy Efforts:
• Developed, and are publishing, a DP algorithm designed to anonymize clinical data.
• Developing open source software in R (and Mathematica)
DP helps support the FAIR guiding principles for scientific data:
• Findable: DP may facilitate pharma patient data transparency
• Accessible
• Interoperable: Analysis of private and DP data yields the same statistics
• Reusable: Enables reuse inside as well as outside the pharma company firewalls
Enabling FAIR Guiding Principles for Scientific Data
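The slide does not describe the AZ algorithm itself, so the following is not that method. It is a minimal sketch of the textbook Laplace mechanism for a counting query, included only to show what "quantifiable" means in DP: the privacy parameter `epsilon` sets the noise scale. The function names and the toy patient ages in the usage note are illustrative assumptions.

```python
import random


def laplace_noise(scale: float, rng: random.Random) -> float:
    """Draw one Laplace(0, scale) sample.

    The difference of two i.i.d. exponential variables with mean `scale`
    follows a Laplace distribution with location 0 and that scale.
    """
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def dp_count(values, predicate, epsilon: float, rng: random.Random) -> float:
    """Return a noisy count of the items satisfying `predicate`.

    A counting query changes by at most 1 when one individual is added
    or removed (sensitivity 1), so the noise scale is 1 / epsilon.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    true_count = sum(1 for v in values if predicate(v))
    return true_count + laplace_noise(1.0 / epsilon, rng)
```

For example, `dp_count([34, 51, 67, 72], lambda age: age >= 65, epsilon=0.5, rng=random.Random())` returns a value scattered around the true count of 2. Smaller `epsilon` values give stronger privacy and noisier answers.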
14. Translate Questions into Concepts: Team Modeling
Domain Expert Concept Map
“Where are the key clinical studies in NSCLC and who are the principal investigators?”
15. Challenge with Data: Remodel
“Where are the key clinical studies in NSCLC and who are the principal investigators?”
(one example)
Challenge with Linked Data
Source: https://clinicaltrials.gov/ct2/show/NCT02027428
16. Refine the Answer: Configurable Interfaces
Examine with a Faceted Browser
“What are the open trials in metastatic breast cancer and what drugs are being tested?”
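A faceted browser answers a question like the one above by filtering records on selected facet values and then counting the values of the remaining facets. The sketch below illustrates that idea only; the trial records, identifiers and drug names are made-up toy data, and the function names are hypothetical, not taken from CI360.

```python
from collections import Counter

# Toy records invented for illustration; not real trials or drugs.
TRIALS = [
    {"id": "T1", "condition": "metastatic breast cancer", "status": "open", "drug": "drug-A"},
    {"id": "T2", "condition": "metastatic breast cancer", "status": "open", "drug": "drug-B"},
    {"id": "T3", "condition": "metastatic breast cancer", "status": "closed", "drug": "drug-A"},
    {"id": "T4", "condition": "NSCLC", "status": "open", "drug": "drug-C"},
]


def apply_facets(records, **selected):
    """Keep only the records that match every selected facet value."""
    return [r for r in records if all(r.get(k) == v for k, v in selected.items())]


def facet_counts(records, facet):
    """Count how often each value of `facet` occurs in the given records."""
    return Counter(r[facet] for r in records)


open_mbc = apply_facets(TRIALS, condition="metastatic breast cancer", status="open")
drugs_tested = facet_counts(open_mbc, "drug")
```

Here `open_mbc` holds the two open metastatic breast cancer records and `drugs_tested` lists the drugs attached to them, which mirrors the two halves of the question.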
17. Share Insights as a Community: Nanopublish
“Can a biomarker-defined population be added to a trial record?”
Share insights with a Knowledge Base
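A nanopublication packages a single assertion together with its provenance and publication information. The following is a minimal sketch of that three-part structure using plain tuples as triples. The `Nanopublication` class, the `make_nanopub` helper and all `example.org` URIs are hypothetical; only the `prov:` and `dcterms:` property names are existing vocabulary terms.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class Nanopublication:
    uri: str
    assertion: tuple    # triples stating the claim itself
    provenance: tuple   # triples about where the assertion came from
    pubinfo: tuple      # triples about the nanopublication as a whole


def make_nanopub(uri, claim, derived_from, author, created=None):
    """Wrap one (subject, predicate, object) claim as a nanopublication."""
    created = created or datetime.now(timezone.utc)
    assertion_graph = uri + "#assertion"
    return Nanopublication(
        uri=uri,
        assertion=(claim,),
        provenance=((assertion_graph, "prov:wasDerivedFrom", derived_from),),
        pubinfo=(
            (uri, "prov:wasAttributedTo", author),
            (uri, "dcterms:created", created.isoformat()),
        ),
    )


# Hypothetical example: attach a biomarker-defined population to a trial record.
np_example = make_nanopub(
    uri="https://example.org/np/1",
    claim=("https://example.org/trial/123",
           "https://example.org/vocab/hasBiomarkerPopulation",
           "https://example.org/biomarker/X-positive"),
    derived_from="https://example.org/source/curator-note-1",
    author="https://example.org/person/curator",
    created=datetime(2017, 2, 20, tzinfo=timezone.utc),
)
```

Keeping the claim separate from its provenance lets a knowledge base accept community annotations on a trial record while still recording who asserted them and on what basis.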
18. Is CI360 FAIR?
To be Findable:
• Resources named with URIs, with a defined policy
• Dataset descriptions published with VoID on intranet
To be Accessible:
• Data reachable via REST and SPARQL APIs
• Application secured via SSO
To be Interoperable:
• Uses well-described internal and public ontologies
• All data is linked data (RDF)
To be Reusable:
• Daily updates tracked with VoID and PROV-O
• Vocabularies used in CI360 already reused in four other applications
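Because the data is described with VoID and reachable over SPARQL, a client can ask which datasets exist and when they were last updated. The sketch below only builds such a request and does not send it. The endpoint URL is a hypothetical placeholder (CI360 is an internal system whose address is not given), and the query assumes dataset descriptions use `void:Dataset` and `dcterms:modified`, which the slides do not confirm.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Hypothetical placeholder; the real CI360 endpoint is not given in the talk.
ENDPOINT = "https://ci360.example.org/sparql"

# Assumes datasets are typed void:Dataset and carry dcterms:modified.
QUERY = """
PREFIX void: <http://rdfs.org/ns/void#>
PREFIX dcterms: <http://purl.org/dc/terms/>
SELECT ?dataset ?modified WHERE {
  ?dataset a void:Dataset ;
           dcterms:modified ?modified .
}
ORDER BY DESC(?modified)
"""


def build_sparql_request(endpoint: str, query: str) -> Request:
    """Build a SPARQL GET request that passes the query as a URL parameter."""
    url = endpoint + "?" + urlencode({"query": query})
    # Ask for the standard SPARQL JSON results format.
    return Request(url, headers={"Accept": "application/sparql-results+json"})


req = build_sparql_request(ENDPOINT, QUERY)
```

Passing `req` to `urllib.request.urlopen` would execute the query against a live endpoint. Behind SSO, an authentication step would also be needed, which this sketch leaves out.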
19. R&D | RDI
Get your plumbing right
• And your data won’t be stuck in a silo
Use Edge Informatics
• Consider handoffs—you don’t know how your data will be used in the future
Leverage working public solutions
• Don’t reinvent the wheel (OK—Ontology…)
Invest in FAIR Data Stewardship
• Investment to future-proof your efforts
FAIR Data and Edge Informatics: Take-aways
20. Thanks
• Key Influencers in Linked Data Community
• Molecular Medicine Tri-Con 2017 Conference Organizers
• AZ/MedImmune Linked Data Community