Se ha denunciado esta presentación.
Utilizamos tu perfil de LinkedIn y tus datos de actividad para personalizar los anuncios y mostrarte publicidad más relevante. Puedes cambiar tus preferencias de publicidad en cualquier momento.

Toward F.A.I.R. Pharma. PhUSE Linked Data Initiatives Past and Present

502 visualizaciones

Publicado el

In recent years, the PhUSE organization has supported several Linked Data initiatives. The CDISC Foundational Standards as RDF is an early example of one such initiative. The results are available on the CDISC website. Subsequent proof of concept projects enjoyed marginal success at a time when pharma’s familiarity with the technology was still very limited. A recent surge in interest in F.A.I.R. data and Knowledge Graphs has sparked renewed interest in Linked Data within PhUSE and the industry at large. The recently completed “Clinical Trials Data as RDF (CTDasRDF)” spawned a new project, “Going Translational With Linked Data (GoTWLD).” GoTWLD extends the project scope of its predecessor beyond SDTM into the non-clinical domain.
Educational initiatives at PhUSE include an introductory, interactive workshop at the annual European conference (EU-Connect) and at the US Computational Science Symposium (CSS). A side-project of GoTWLD is investigating the potential use of URIs as study identifiers to promote adoption of Linked Data. Challenges remain, including the need for demonstrable return on investment and the development of user-friendly, intuitive interfaces for graph data. These challenges can be overcome if pharmaceutical companies cooperate in the pre-competitive space.
Presented at Semantics@Roche, Basel 2019-04-04

Publicado en: Tecnología
  • Sé el primero en comentar

Toward F.A.I.R. Pharma. PhUSE Linked Data Initiatives Past and Present

  1. 1. Toward F.A.I.R. Pharma PhUSE Linked Data Initiatives Past and Present Semantics @ Roche 2019-04-04 9:30-10:00
  2. 2. Outline I. Introduction - What is PhUSE? II. How FAIR is (late phase) Pharma? III. Toward FAIR Pharma with PhUSE 2
  3. 3. Tim Williams  Statistical Solutions Lead  UCB Biosciences, Raleigh North Carolina PhUSE  Steering Committee, Computational Sciences Symposium (CSS)  Co-lead  Analysis Results Model (RDF Data Cubes) (2016)  Clinical Trials Data as RDF (2018)  Going Translational with Linked Data (present)  Instructor: Linked Data Hands-on Workshop 3
  4. 4. PhUSE Pharmaceutical Users Software Exchange • Membership: >8,700 spanning 30 countries • Annual Conferences: • EUConnect (November, Amsterdam) • USConnect • Single Day Events • Computational Sciences Symposium (CSS) • A working conference 4
  5. 5. PhUSE Pharmaceutical Users Software Exchange Mission • Provide a welcoming, neutral platform for creating and sharing ideas... exploring innovative methodologies, techniques, and technologies. Working Groups Mission •, transparent, and collaborative forum in a non-competitive environment 5
  6. 6. PhUSE Linked Data Projects Recent Work • CDISC Foundational Standards in RDF • CDISC Conformance Checks • Reusing Medical Summaries for Enabling Clinical Research • Analysis Results and Metadata (RDF Data Cube) • Regulatory Guidance in RDF • Clinical Program Design in RDF • CDISC Protocol Representation Model in RDF 6
  7. 7. II. How FAIR is (late phase) Pharma? 7
  8. 8. Magic mirror on the wall… … is Pharma F.A.I.R. at all?? 8
  9. 9. 5 Star Open Data Principles Web, open license, +/- format Structured, machine readable Non-proprietary format URIs Linked to other data @NovasTaylor 9
  10. 10. F.A.I.R Data Principles Findability F1. globally unique, persistent id F2. rich metadata F3. searchable source F4. metadata specify data id Accessibility A1. retrievable by id using standard protocol A1.1 protocol open, free, universal A1.2 protocol allows authentication A2 metadata avail. when data is not Interoperability I1. formal, accessible, shared, broadly applicable language I2. uses FAIR vocabularies I3. qualified references to other data Reusability R1. plurality of accurate and relevant attributes R1.1 clear and accessible usage license R1.2 provenance R1.3 meets domain-relevant standards10
  11. 11. How Does Pharma Fare on FAIR? Findability F1. globally unique, persistent id Human Study Subject "Bob" • PharmaCo • Study 1, Drug A • Study 2, Drug B • DrugCo • Study 3, Drug C Merge Bob’s data from all studies. PhUSE Project: “Study URI” 11
  12. 12. How Does Pharma Fare on FAIR? Accessibility A2. Metadata available when data is not Biostatisticians and Medical Writers do not have easy access to the metadata from data collection and transformation processes. Data changes form during its journey from collection to analysis. 12
  13. 13. How Does Pharma Fare on FAIR? Interoperability I1. shared language for knowledge Lack • Knowledge representation • language Core Challenge : Change of mindset • Currently: Data modeled to industry standards for submission to regulatory authorities • Future: Models of the process and the entities in the data. Linked Data / Knowledge Graph adoption is helping! 13
  14. 14. How Does Pharma Fare on FAIR? Reusability R1.3 data meet domain-relevant community standards Positive: • We have standards! Negative: • We have standards • Historically are row x column structure • New initiatives are not 5 Star Open or FAIR 14
  15. 15. New CDISC “Library” Initiative API behind membership paywall JSON, No RDF download JSON. XML planned JSON Not possible with above restrictions… 15
  16. 16. III. Toward FAIR Pharma with PhUSE 16
  17. 17. Risk-averse Pharma has a skills deficit 17
  18. 18. Database Popularity 18
  19. 19. Skills Deficit Is it more practical to train: A Knowledge Graph expert in Clinical Trials? A Clinical Trials expert in Knowledge Graphs? or 19
  20. 20. CSS Workshop “Let’s Make a Knowledge Graph!” • Build your own RDF Linked Data Knowledge Graph • Sunday Evening, June 9th • Introduction to Linked Data concepts Also at PhUSE EUConnect, Amsterdam (November) 20
  21. 21. Knowledge Graph Workshop 21
  22. 22. Workshop : Merged Studies 22
  23. 23. Enterprise Knowledge Graphs for Pharma 23 How?
  24. 24. The Roofshot / Moonshot Manifesto Concept & Image Attribution: 1. Study URI 2. Clinical Trial Results Domains 3. Open Ontology Development Invent and apply state- of-the-art Roofshot Incremental impacts in production Moonshot • Enterprise and Industry Knowledge Graphs • Across the Pharma Data Life Cycle
  25. 25. Roofshot 1 : Study URI Based on: “Study URI” – K. Forsberg, D. Goude. PhUSE EUConnect 18 • Easy entry point for Pharma • Familiar Concept: NCT Number (, EudraCT number 25
  26. 26. Study URI and across repositories bind them. One URI to rule them all, One URI to find them, One URI to bring them all 26
  27. 27. Study URI • Value for Patients, Researchers, Agencies, Companies • A unique, immutable Study ID • Links across repositories • Returns information • Context dependent! 27
  28. 28. Wikidata as a Study URI Platform? • Existing Infrastructure • Community • Process • 27k studies already in Wikidata 28 Study URI as a WikiProject? • • Setup a 3 day workshop with stakeholders?
  29. 29. Roofshot 2 : Clinical Trial Results Domains 1. Clinical Trials Data (SDTM) as RDF (CTDasRDF) 2. Going Translational with Linked Data (GoTWLD) 29
  30. 30. Clinical Trials Data as RDF Project Co-leads Dr. Armando Oliva Medical Informatics Consulting Semantica LLC Tim Williams Statistical Solutions Lead UCB Biosciences 30
  31. 31. 31
  32. 32. Data Challenges  Submissions non-conformance* o 32% of submissions at least 1 conformance flag o 20% of uploads to JANUS fail  Variability in standards implementation o Multiple interpretations of implementation guides  Version-conversion problems & costs  Lack of intrinsic metadata  Challenges linking data and standards  Limitations in the row-by-column data model * Conservative estimates 32
  33. 33. Machine-readable, Machine-interpretable Data With built-in: • Meaning (semantics) Interpret • Context • Content • Structure • Purpose • Rules + Traceability • Validity Trustworthiness 33
  34. 34. Project Philosophy • Model the concepts represented in SDTM • Map instance data to the graph model • Re-use where possible – Ontologies, Terminologies • Open Source, pre-competitive, cooperative environment – Results available to everyone Resource Description Framework (RDF) 34
  35. 35. Re-use Existing Sources SDTM Terminology study “mini” study Ontology SNOMED Systematized Nomenclature of Med. Indication Condition MedDRA Med. Dictionary for Regulatory Activities Adverse Events UNII Unique Ingredient Identifier Substances ND-FRT National Drug File Ref. Terminology Pharmacologic Class WHO Drug Dictionary World Health Org. Drug Dictionary Medical Products RDF non-RDF 35
  36. 36. Study HumanStudySubject Medical ConditionStudy Activity DrugProduct hasStudyParticipant afflictedBy participatesIn hasProcotol SpecifiedActivity administered Drug participatesInStudy Concept 36
  37. 37. Adverse Event Representation SDTM & BRIDG Our Model Observation Medical Condition temporally associated with an Intervention 37
  38. 38. Adverse EventInterventionFinding Assessment Adverse Event Hypertension input output Assessor Intervention Drug X 10mg qd 2019-09-30 Observation e.g. BP 160/110 2019-10-10 Medical Condition Hypertension SDTM, BRIDG Our Model Observation 38
  39. 39. 39
  40. 40. Controlled Terminology Study Protocol DM SUPPDM EX VS TS AE Clinical Trials Data as RDF SDTM Modeling and Data Conversion 40
  41. 41. Project on GitHub 41
  42. 42. Going Translational with Linked Data 42
  43. 43. Going Translational with Linked Data Project Co-leads Drashtti Vasant R&D IT Business Partner Translational Sciences at Bayer Business Services Dr. Armando Oliva Medical Informatics Consulting Semantica LLC Tim Williams Statistical Solutions Lead UCB Biosciences 43
  44. 44. Project Focus • Ontology Development • Data Conversion to RDF • Define XML • Non-clinical models & data [new] 44
  45. 45. Study Protocol DM SUPPDM EX VS TS AE Clinical Trials Data as RDF Going Translational With Linked Data DM +SEND SUPPDM +SEND EX +SEND VS +SEND TS +SEND AE +SEND Protocol +SEND Study+SEND ? +SEND ? +SEND Controlled Terminology Controlled Terminology +SEND Modeling and Data Conversion 45
  46. 46. Deliverables Objective Timeline Project Initiation February 2019 Supporting Ontologies PhUSE CSS 2020 Instance Data, Define XML PhUSE CSS 2020 Presentations PhUSE CSS 2019, 2020 PhUSE EUConnect 2019 Conclusion PhUSE CSS 2020 46
  47. 47. Additional Initiatives! • SPARQL Endpoint for Project Data o Query data online, from your desktop Related Philosophy: Current : Data Submission Future : Data Sharing • MedDRA Modeling and Conversion Process 47
  48. 48. Roofshot 3: Cooperative Ontology Development • The challenges cannot be solved by any one person, department, company, agency, or organization. • We must cooperate as an industry. 48
  49. 49. Existing Ontologies 49
  50. 50. How do we ‘open source’ ontology development? • Github? • Existing pre-competitive organizations? • PhUSE • TransCelerate • Pistoia Alliance 50
  51. 51. Open Source Ontology Challenges • Gate keeper • Conflict resolution (approach, code) • Company • Participation • Contribution • Volunteers 51
  52. 52. Ontology Availability and Curation Leverage Existing Portals? • Open PHACTS • OBO Foundry • BioPortal • ….others? 52
  53. 53. Ontologies must be open and Accessible Don’t hide my OWL ! 53
  54. 54. Conclusion 54
  55. 55. Linked Data Adoption in Pharma • Companies adopting and adapting CTDasRDF outputs • Knowledge Graph initiatives at: – AstraZeneca – Bayer – Roche – Sanofi – ….and more • External datasets published as RDF – EBI RDF platform – openPhacts – OpenTargets – …and more 55
  56. 56. The Future Knowledge Graphs are here to stay “The application of graph processing and graph DBMSs will grow at 100 percent annually through 2022..." – Gartner “Top 10 Data and Analytics Technology Trends for 2019” 56
  57. 57. Magic mirror on the wall… … Pharma is getting F.A.I.R-er after all. 57
  58. 58. Thank you! Contact Email: LinkedIn: Twitter : @NovasTaylor 58