1. Covid–19 Graph!
Bringing Clinical Data
Together with Neo4j
Dave Iberson-Hurst
Partner, S-cubed
Kirsten Walther Langendorf
Principal Consultant, S-cubed
17th September 2020
Copenhagen
2. 2 | ©2020 S-cubed
S-cubed
• A3 Suite platform
• MDR
• SWB
• Linked Data Services
• CDISC Training and Support
• Regulatory Development Strategy
• Clinical Trial Documentation
• Marketing Authorisation Applications & Licence Maintenance
• EU SME Status, EU OMPD Holder
• QA & GXP Services
• Statistical Consultancy
• SAS Programming
• Data Management
• CDISC services
• Statistical Analysis and Reporting
• Quality Assurance
• Biostatistics
• Clinical Data Management
• Pharmacovigilance
• Medical Monitoring
• Risk Based Monitoring
• Operational Reporting
• Qlik Extensions
Data Analytics (Qlik)
Biometrics
Clinical Standards Management
Regulatory Affairs
3. 3 | ©2020 S-cubed
Our “Study” World
Plan, Collect, Organize, Analyse, Results
7. 7 | ©2020 S-cubed
Electronic Health Records
9. 9 | ©2020 S-cubed
And Some Cypher …
o We use a lot of rectangular structures, but we can recreate these with Cypher queries
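As an illustration of recreating a rectangular structure from a graph, here is a minimal sketch. It assumes the ClinicalTrial, Facility, City and Country labels and the CONDUCTED_AT and LOCATED_IN relationships used in the ClinicalTrials model later in this deck; the column aliases are hypothetical.

```cypher
// Walk the graph from trial to country and return one flat row per trial site
MATCH (ct:ClinicalTrial)-[:CONDUCTED_AT]->(fa:Facility)-[:LOCATED_IN]->(ci:City)-[:LOCATED_IN]->(c:Country)
RETURN ct.NCTId AS nct_id,
       fa.name  AS facility,
       ci.name  AS city,
       c.name   AS country
ORDER BY nct_id, facility
```

Each returned row corresponds to one line of a conventional table, so the graph can still feed tools that expect tabular input.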
10. 10 | ©2020 S-cubed
Change in daily life due to COVID-19
11th March 2020
How can I help?
Letter from authorities
11. 11 | ©2020 S-cubed
Covidgraph.org
My contribution
12. 12 | ©2020 S-cubed
The source data
ClinicalTrials.gov API
Limit
More studies than limit
Study counter
13. 13 | ©2020 S-cubed
Looping with Cypher
Allowing for all studies to be included
Get the total number of studies and divide by 1000 to get the number of loops needed
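The loop count described above can be sketched in plain Cypher. This is not the presenters' query. The total of 4321 studies is a made-up figure standing in for the study counter reported by the API, and the names page_size, first_rank and last_rank are hypothetical.

```cypher
// Hypothetical total; in practice this would come from the API's study counter
WITH 4321 AS total, 1000 AS page_size
// Round up so the final, partial page is included
WITH total, page_size, toInteger(ceil(toFloat(total) / page_size)) AS loops
UNWIND range(0, loops - 1) AS i
WITH total, i * page_size + 1 AS first_rank, (i + 1) * page_size AS page_end
// Clamp the last page to the total number of studies
RETURN first_rank,
       CASE WHEN page_end > total THEN total ELSE page_end END AS last_rank
```

Each row gives the rank window for one request. Rounding up matters here: a plain integer division of 4321 by 1000 would give 4 loops and leave out the last 321 studies.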
14. 14 | ©2020 S-cubed
Looping with Cypher
Allowing for all studies to be included
Get the total number of studies returned by the looping
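One simple way to check the result of the looping, assuming each study was stored as a ClinicalTrial node keyed by NCTId (as in the model later in this deck), is to count the nodes and compare that figure with the API's study counter. The aliases are hypothetical.

```cypher
// Count loaded studies; the two numbers should match if no study was loaded twice
MATCH (ct:ClinicalTrial)
RETURN count(ct) AS studies_in_graph,
       count(DISTINCT ct.NCTId) AS distinct_nct_ids
```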
15. 15 | ©2020 S-cubed
Modelling ClinicalTrials
Converting tabular info to a graph
My pharma experience with Clinical Trials
16. 16 | ©2020 S-cubed
{
  "NCTId": [
    "NCT04366271"
  ],
  "LocationFacility": [
    "Hospital Universitario de Getafe",
    "Hospital Universitario de Cruces",
    "Hospital Universitario de La Princesa",
    "Hospital Infantil Universitario Niño Jesus",
    "Hospital Ramón Y Cajal",
    "Complejo Universitario La Paz"
  ],
  "Rank": 2,
  "LocationCity": [
    "Getafe",
    "Barakaldo",
    "Madrid",
    "Madrid",
    "Madrid",
    "Madrid"
  ],
  "LocationState": [
    "Madrid"
  ],
  "LocationCountry": [
    "Spain",
    "Spain",
    "Spain",
    "Spain",
    "Spain",
    "Spain"
  ]
}
From JSON to nodes and relationships
From lists to graph
// study_metadata is assumed to be bound earlier in the query (one study record)
UNWIND study_metadata.NCTId AS Id
MATCH (ct:ClinicalTrial {NCTId: Id})
// The code treats the facility, city and country lists as parallel and walks them by index
WITH Id, ct, study_metadata, range(0, size(study_metadata.LocationFacility) - 1) AS nfacil
FOREACH (i IN nfacil |
  MERGE (fa:Facility {name: study_metadata.LocationFacility[i]})
  MERGE (ci:City {name: study_metadata.LocationCity[i]})
  MERGE (c:Country {name: study_metadata.LocationCountry[i]})
  MERGE (ct)-[:CONDUCTED_AT]->(fa)
  MERGE (fa)-[:LOCATED_IN]->(ci)
)
// Second pass: link each city to its country
WITH Id, study_metadata, range(0, size(study_metadata.LocationCity) - 1) AS ncity
FOREACH (i IN ncity |
  MERGE (ci:City {name: study_metadata.LocationCity[i]})
  MERGE (c:Country {name: study_metadata.LocationCountry[i]})
  MERGE (ci)-[:LOCATED_IN]->(c)
)
17. 17 | ©2020 S-cubed
Missing values
In JSON input
?
18. 18 | ©2020 S-cubed
Missing values
In JSON input
Split in two
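One possible way to guard against missing values is sketched below. It is not necessarily the split-in-two approach shown on the slide. The $study_metadata parameter is hypothetical and is assumed to hold one study record shaped like the JSON example earlier in this deck. The sketch relies on two Cypher behaviours: an out-of-range list index returns null, and MERGE fails when given a null property value.

```cypher
// Hypothetical parameter: one study record shaped like the JSON example
WITH $study_metadata AS study_metadata
UNWIND study_metadata.NCTId AS Id
MATCH (ct:ClinicalTrial {NCTId: Id})
// coalesce turns a missing list into an empty one, so range() yields no indexes
WITH ct, study_metadata,
     range(0, size(coalesce(study_metadata.LocationFacility, [])) - 1) AS idx
UNWIND idx AS i
WITH ct,
     study_metadata.LocationFacility[i] AS facility,
     study_metadata.LocationCity[i]     AS city
// Skip positions where either value is missing, since MERGE rejects null properties
WHERE facility IS NOT NULL AND city IS NOT NULL
MERGE (fa:Facility {name: facility})
MERGE (ci:City {name: city})
MERGE (ct)-[:CONDUCTED_AT]->(fa)
MERGE (fa)-[:LOCATED_IN]->(ci)
```

The JSON example shows why this matters: LocationState holds a single entry while the other location lists hold six, so indexing it alongside them would produce null for five of the six positions.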
19. 19 | ©2020 S-cubed
o Neo4j Professional certification in 2019
• The task with Covidgraph was a great opportunity to get some Cypher and graph experience
o Modelling can be improved
• Data consistency can be a challenge when creating nodes and relationships
• Could consider some more advanced techniques to ‘clean’ the data (ML)
• Need to be aware of missing values in JSON when creating queries
• Next steps: adding trial results
o Great to be part of a helpful team
• Even though I am not a healthcare professional I can help to provide a better understanding of COVID-19
Lessons Learned
Virtual collaboration