Daedalus develops technology to extract meaning and structure from all types of multimedia content. In the field of Healthcare or e-Health, Daedalus' semantic technology makes it possible to automatically exploit the information contained in the Electronic Health Record (EHR).
This presentation covers Daedalus' experience in:
• Online health content monitoring
• Semantic enrichment (tagging) of medical records
• Anonymization of medical records
• Multimedia search in medical records
• Detection of interactions between drugs
• Text analytics and data analytics in the health sector
2. Daedalus Technology in the Health sector
! Daedalus develops technology to extract meaning and structure from all types of multimedia content. Our customers can monetize their content automatically.
! In the field of Healthcare, Daedalus' semantic technology makes it possible to automatically exploit the information contained in the Electronic Health Record (EHR).
! Multilingual environment: English, Spanish, Portuguese, Catalan
DAEDALUS in Healthcare
3. Daedalus Technology in the Health sector
CONTENTS
! Presentation
! Projects and Experiences
• Online health content monitoring
• Pilot experience in the detection of interactions between drugs
• Semantic enrichment of medical records
• Anonymization of medical records
• Multimedia search in medical records
• Pilot experience tagging medical reports
! Product Features
! Who we are
4. Daedalus Technology in the Health sector
Operations
• How much structured data from the Electronic Health Record is processed? What happens to the unstructured data?
• Applications:
• Support for coding with ICD-9/10, SNOMED CT, CIMA…
• Support systems for human operators: coding processes (e.g. diagnoses recorded in emergency room reports)
[Figure: Structured vs. Unstructured data]
5. Daedalus Technology in the Health sector
Monitoring
! In the U.S.A.
• Internet, information about healthcare: 75%
• Social networks, information about healthcare: 42%
7. Daedalus Technology in the Health sector
Monitoring
! What to monitor?
• Drugs
• Diseases
• Reactions to drugs
8. Daedalus Technology in the Health sector
Monitoring
! Who’s interested?
• Drug companies
• Health centers (hospitals, private clinics)
• Administrators of blogs, forums…
9. Daedalus Technology in the Health sector
! Problems that DAEDALUS can help solve
! Is the information of the Electronic Health Record (EHR) all structured?
! Is it well codified?
! Are there fields in which users can type any text, without restrictions?
! How much manual work is required to enter information into the EHR?
! Can that information be reused?
! Are the archetypes enough to define a semantic interpretation?
! Locating information in different formats
! Considering the amount of information generated, both in EHRs and in scientific literature, tools that ease searching are necessary
! Interaction by means of natural language, including voice
! Analysis processes for Big Data tasks on health data
10. Daedalus Technology in the Health sector
PROJECTS AND EXPERIENCES
11. Daedalus Technology in the Health sector
! Online Health Content Monitoring
! Detection of interactions between drugs mentioned in biomedical literature
! Semantic enrichment of Medical Records
! Anonymization of Medical Records
! Multimedia search on Medical Records
! Pilot experience tagging medical reports
13. Daedalus Technology in the Health sector
Online Health Content Monitoring
Health Dashboard
! Reputation in Pharma
• Drug and disease mentions
• Identification of adverse drug reactions
• Trend detection
14. Daedalus Technology in the Health sector
PILOT EXPERIENCE IN THE DETECTION OF INTERACTIONS BETWEEN DRUGS
15. Daedalus Technology in the Health sector
Pilot experience in the detection of interactions between drugs
Objective:
! Application of Textalytics eHealth to the detection of interactions between drugs mentioned in biomedical literature.
! Within the framework of the DDIExtraction 2013 challenge, organized as part of SemEval, experiments are performed on the identification of interactions between drugs in medical texts similar to the abstracts available in MedLine.
! Hybrid analysis model that combines Natural Language Processing techniques (based on Textalytics eHealth) with machine learning techniques.
16. Daedalus Technology in the Health sector
Pilot experience in the detection of interactions between drugs
! Syntactic information obtained by means of: Textalytics eHealth
17. Daedalus Technology in the Health sector
Process for detecting interactions between drugs
[Figure: pipeline diagram. Recoverable labels: Train Documents (SemEval); Test Documents; Sentence Simplification (appositions, coordinates, clause splitting, negations); POS-syntactic analysis (eHealth); jSRE x 5 models (Detection, Effect, Mechanism, Int, Advise); Cross-Validation jSRE; DDI Detection (jSRE); DDI Classification; Drugs; Relations; Evaluation]
18. Daedalus Technology in the Health sector
Pilot experience in the detection of interactions between drugs
Evaluation
! The quality of the recognition process is measured in terms of:
! Precision: proportion of the extracted drugs and relationships that are correct
! Recall: proportion of the drugs and relationships present in the test texts that are extracted
! F-Score: a combined measure that weights precision and recall.
! In the task, system capabilities are measured in:
! DEC: detection of interactions between drugs
! CLA: classification of the type of drug (a drug, a brand, a chemical, or a pharmacological relation among a group of drugs and chemical agents that affect living organisms)
! MEC, EFF, ADV, INT: results by type of interaction: mechanism (MEC), effect (EFF), advice (ADV) and interaction (INT)
! MAVG: average F-Score over the 4 types of interaction
19. Daedalus Technology in the Health sector
Pilot experience in the detection of interactions between drugs
Evaluation
! The F-Score indicates how good an information extraction system is by taking into account both precision (how many of the extracted elements are correct) and recall (how many of the wanted elements were extracted rather than missed).
! Results:
! In the detection of interactions between drugs in text, F-Score values greater than 70% are obtained (67% precision, 77% recall)
! In the classification by type of drug being referenced (a medical product, a brand or a compound), F-Score values around 50% are obtained (51% precision, 57% recall)
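The F-Score figures above can be checked from the reported precision and recall. The slides only describe the F-Score as a weighting of the two, so the sketch below assumes the balanced F1 (harmonic mean); the function name `f_score` and the `beta` parameter are illustrative and do not come from the presentation.

```python
def f_score(precision: float, recall: float, beta: float = 1.0) -> float:
    """Weighted harmonic mean of precision and recall (F1 when beta == 1)."""
    if precision == 0.0 and recall == 0.0:
        return 0.0  # avoid dividing by zero when nothing was extracted correctly
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Precision/recall pairs reported on the slide.
detection = f_score(0.67, 0.77)
classification = f_score(0.51, 0.57)
print(f"detection F1 = {detection:.3f}")            # 0.717
print(f"classification F1 = {classification:.3f}")  # 0.538
```

Under that assumption the values agree with the slide: above 70% for detection and around 50% for classification.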
20. Daedalus Technology in the Health sector
Pilot experience in the detection of interactions between drugs
! Corpora employed in the evaluation:
• GENIA: 2,000 abstracts, 400,000 words and 100,000 annotations of biological terms
• Cincinnati: 600,000 words of anonymized clinical data
• MedLine: 200 annotated MedLine abstracts
• BioText: 3,500 sentences in which diseases, treatments and the semantic relations among them have been tagged
• EBI diseases: 600 sentences in which diseases and symptoms have been tagged (around 350 UMLS terms)
• EDGAR: 100 MedLine abstracts in which more than 400 genes and more than 350 drugs have been tagged
• DDI: 2,800 sentences in which more than 11,000 drugs and 2,400 interactions between them have been tagged
22. Daedalus Technology in the Health sector
! Objective: Semantic interoperability
! Elements:
• Vocabularies: UMLS, SNOMED CT, ICD-9, ICD-10, CIE-9, CIE-10, LOINC
• Archetypes: reusable clinical models, openEHR
• Templates: views of the archetypes, HL7
• Reference models: specification for the definition of the archetype, ISO13606
! Automatic linguistic treatment helps to structure the Medical Record providing:
• Automatic tagging according to vocabularies
• Links between medical reports with templates
• Multilingual treatment based on Daedalus’ technology
Semantic enrichment of Medical Records
23. Daedalus Technology in the Health sector
Semantic enrichment of Medical Records
Use case: automatic classification of Medical Records
! Example of application: automatic assignment of ICD codes to radiology reports.
• ICD (International Statistical Classification of Diseases and Related Health Problems), a standard of the World Health Organization
! Objective:
• Analysis of the justification of medical tests for insurance companies
! Case data:
• Data from radiology reports
• Period: 1 year
• 978 documents and 45 ICD-9-CM codes in 94 combinations
• Provided by the Department of Radiology at Cincinnati Children's Hospital Medical Center
24. Daedalus Technology in the Health sector
Semantic enrichment of Medical Records
! Analysis process (from input text to result):
1. Morphological analysis
• Pre-processing
• Part-of-Speech (POS) tagging
2. Identification of medical concepts
• Semantic tagging (domain dictionaries)
• Treatment of acronyms
• Specific vocabularies
3. Evaluation
• Measurement of the quality of the resulting tagging
26. Daedalus Technology in the Health sector
! Why?
! To fully exploit the information already collected on multiple dimensions.
Information to:
! Improve control panels
! Ease the development of clinical tests
! Big Data environment
! Presents the 3 main characteristics of this type of problem:
! Volume: large amounts of data
! Velocity: very dynamic
! Variety: very different types
! Privacy
! It is necessary to ensure that the privacy of patient data is not violated.
Anonymization of Medical Records
27. Daedalus Technology in the Health sector
! Objective: to ease the analysis and exploitation of the information contained in
Medical Records.
! Linguistic processing technology detects personal names, addresses and phone numbers in order to hide the identity of patients in medical transactions.
Anonymization of Medical Records
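The idea of detecting identifying items and hiding them can be illustrated with a deliberately simplified sketch. It is not the linguistic processing described in the slides: it swaps in two regular expressions (phone-like digit runs and names preceded by a title), and the patterns, placeholders and the `anonymize` function are assumptions made for illustration only.

```python
import re

# Assumed pattern: an optional "+" followed by at least nine digits,
# optionally separated by spaces, dots or hyphens.
PHONE = re.compile(r"(?<!\w)\+?\d[\d .\-]{7,}\d(?!\w)")

# Assumed pattern: a title followed by one or more capitalized words.
TITLE_NAME = re.compile(r"\b(?:Mr|Mrs|Ms|Dr)\.?\s+[A-Z][a-z]+(?:\s+[A-Z][a-z]+)*")


def anonymize(text: str) -> str:
    """Replace phone-like numbers and titled names with placeholders."""
    text = PHONE.sub("[PHONE]", text)
    text = TITLE_NAME.sub("[NAME]", text)
    return text


print(anonymize("Dr. Jane Smith called on +34 913 32 43 01 about the scan."))
# [NAME] called on [PHONE] about the scan.
```

Patterns like these miss names without a title and can mask long numeric values that are not phone numbers, which is one reason to rely on linguistic analysis rather than pattern matching alone.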
29. Daedalus Technology in the Health sector
Information search by voice
! Voice access to the information:
• Voice recognition applied to systems of data search in medical records and
documentation in general:
o Diagnosis indication by voice
o Treatment indication by voice
o Immediate access to the EHR of the patient by voice
Multimedia Search in Medical Records
30. Daedalus Technology in the Health sector
Medical Record search by voice
! Voice interaction: transcription, archive, search
Multimedia Search in Medical Records
31. Daedalus Technology in the Health sector
Search on audio or video content
! Example of application:
! Multimedia search: search of videos
Multimedia Search in Medical Records
32. Daedalus Technology in the Health sector
Search on Medical Records from text
! Location of information:
• Offers alternative search options when a query returns no results.
• Builds alternatives that correct common spelling mistakes by calculating the similarity between the search terms and the indexed ones, and offers the user options to choose from (e.g. “Did you mean...?”)
• Semantic search using domain ontologies such as UMLS.
Multimedia Search in Medical Records
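A "Did you mean...?" suggestion based on similarity between the search term and the indexed terms can be sketched with Python's standard `difflib` module. The slides do not say which similarity measure is used, so the measure, the `cutoff` value, the sample index and the `did_you_mean` function are all assumptions.

```python
import difflib

# Hypothetical index of terms; a real system would read these from the search index.
INDEXED_TERMS = ["pneumonia", "pneumothorax", "nephritis", "hepatitis", "arthritis"]


def did_you_mean(query, indexed_terms=INDEXED_TERMS, max_suggestions=3, cutoff=0.75):
    """Return indexed terms similar to a query that is not itself indexed."""
    q = query.lower()
    if q in indexed_terms:
        return []  # exact hit: no correction needed
    # get_close_matches returns the best matches first, ranked by similarity ratio.
    return difflib.get_close_matches(q, indexed_terms, n=max_suggestions, cutoff=cutoff)


suggestions = did_you_mean("pnemonia")
if suggestions:
    print(f"Did you mean: {suggestions[0]}?")
```

For the misspelled query "pnemonia", the top suggestion is "pneumonia"; raising `cutoff` makes the suggestions stricter, lowering it makes them more permissive.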
33. Daedalus Technology in the Health sector
Use case: search on medical records and images
! Searches over a collection of medical cases consisting of:
• Images (50,000 approx.)
• Textual descriptions of the cases (in English and French)
! To search, only images are used (X-rays, scans...) and, occasionally, text
! Context of work: experiments at the European forum CLEF (Cross Language Evaluation Forum) on information retrieval
Multimedia Search in Medical Records
34. Daedalus Technology in the Health sector
Use case: search on medical records and images
! Experiments in ImageCLEFMed (CLEF European Forum)
Multimedia Search in Medical Records
35. Daedalus Technology in the Health sector
Multimedia Search in Medical Records
Use case: search on medical records and images
! Examples of multilingual information search on ImageCLEFMed experiments
(European Forum CLEF)
37. Daedalus Technology in the Health sector
Pilot experience in tagging medical reports
Steps:
! Obtaining resources in the appropriate format for the Textalytics infrastructure, based on UMLS.
! Building a tagger able to analyze the input text, extract noun phrases and assign the corresponding ICD9 code according to their similarity to the entries in the resources.
! Actual reports provided by a hospital have been transcribed by combining OCR techniques and manual processes. Their codes have been annotated and used to evaluate the tagging prototype.
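The code-assignment step, in which a noun phrase receives the ICD9 code of the most similar resource entry, could look roughly like the sketch below. It assumes the noun phrases have already been extracted, uses `difflib.SequenceMatcher` as a stand-in similarity measure, and works on a three-entry English dictionary invented for illustration; the names `RESOURCE` and `best_code` and the threshold are hypothetical and not part of Textalytics.

```python
from difflib import SequenceMatcher

# Tiny illustrative resource: term -> ICD-9-CM code.
RESOURCE = {
    "essential hypertension": "401.9",
    "pneumonia": "486",
    "diabetes mellitus": "250.00",
}


def best_code(noun_phrase, resource=RESOURCE, threshold=0.8):
    """Return (code, matched term, score) for the most similar entry, or None."""
    phrase = noun_phrase.lower().strip()
    best_term, best_score = None, 0.0
    for term in resource:
        score = SequenceMatcher(None, phrase, term).ratio()
        if score > best_score:
            best_term, best_score = term, score
    if best_term is None or best_score < threshold:
        return None  # nothing similar enough: leave the phrase untagged
    return resource[best_term], best_term, round(best_score, 2)


print(best_code("Diabetes mellitus"))
print(best_code("fracture of femur"))
```

The first call matches its entry exactly and returns code 250.00; the second finds no entry above the threshold and returns `None`.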
38. Daedalus Technology in the Health sector
Pilot experience in tagging medical reports
Linguistic resources
UMLS
• Terms in Spanish
• Combination of SNOMED in Spanish and SNOMED in English
• Use of semantic relationships (same_as) referring to concepts
ICD9 ES Dict.
39. Daedalus Technology in the Health sector
Pilot experience in tagging medical reports
Linguistic resources
! Filtering of UMLS to obtain terms in Spanish and their respective ICD9 codes.
! Filtering of the resulting thesaurus, which consists of more than 45,000 terms.
! Many of these are common polysemous words that lead to over-tagging.
! The frequency of appearance in the thesaurus is used to filter out words with poor semantic content.
! An additional dictionary of acronyms and abbreviations from the medical domain has been included.
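One way to read the frequency-based filtering is that words occurring in a large share of thesaurus entries carry little meaning on their own. The sketch below follows that reading; the direction of the filter, the `max_share` threshold, the sample terms and the `low_content_words` function are assumptions, since the slides give no further detail.

```python
from collections import Counter


def low_content_words(terms, max_share=0.5):
    """Return words that appear in more than max_share of the thesaurus terms."""
    entry_freq = Counter()
    for term in terms:
        # Count each word once per term, however often it repeats inside it.
        entry_freq.update(set(term.lower().split()))
    limit = max_share * len(terms)
    return {word for word, n in entry_freq.items() if n > limit}


# Hypothetical thesaurus entries.
thesaurus = ["fracture of femur", "fracture of tibia", "disorder of kidney", "pneumonia"]
print(low_content_words(thesaurus))  # {'of'}
```

Words returned by such a filter would then be ignored when matching noun phrases against the thesaurus.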
40. Daedalus Technology in the Health sector
Pilot experience in tagging medical reports
Architecture of the solution
! Some elements:
1. Preprocessing:
Linguistic analysis of the input text by means of Textalytics to identify noun phrases.
2. Rules:
Inference to identify ICD9 codes by characterizations.
Example: if a phrase contains the structure “number” + “measurement unit”, at least the name of one drug and the word ‘treatment’, then its code will be V58.69
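The example rule can be written as a small predicate. This is only a sketch of the rule as stated on the slide: the drug list, the set of measurement units and the function name `rule_v58_69` are hypothetical, and a real system would take drug names from its dictionaries rather than from a hard-coded set.

```python
import re

# Hypothetical drug lexicon; in practice this would come from the drug dictionaries.
DRUG_NAMES = {"ibuprofen", "metformin", "warfarin"}

# "number" + "measurement unit", with an assumed, non-exhaustive list of units.
DOSE = re.compile(r"\b\d+(?:[.,]\d+)?\s*(?:mg|g|ml|mcg|iu)\b", re.IGNORECASE)


def rule_v58_69(phrase):
    """Return "V58.69" when the phrase has a dose, a drug name and 'treatment'."""
    words = set(re.findall(r"[a-z]+", phrase.lower()))
    has_dose = DOSE.search(phrase) is not None
    has_drug = bool(words & DRUG_NAMES)
    has_treatment = "treatment" in words
    return "V58.69" if has_dose and has_drug and has_treatment else None


print(rule_v58_69("Treatment with metformin 850 mg daily"))  # V58.69
print(rule_v58_69("metformin 850 mg"))                        # None
```

The second call returns `None` because the word "treatment" is missing, so only two of the three conditions hold.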
42. Daedalus Technology in the Health sector
! Daedalus technology for semantic enrichment in Healthcare: Textalytics eHealth
! Functionality:
! Semantic tagging according to ontologies in the domain of healthcare (UMLS):
diseases, procedures, drugs, symptoms... relations between elements
! Treatment of linguistic variants: gender and number, acronyms and abbreviations,
aliases
! Multilingual environment: English, Spanish, Portuguese, Catalan
Textalytics eHealth
43. Daedalus Technology in the Health sector
Textalytics eHealth
! Specific multilingual dictionaries for the domain of healthcare based on UMLS:
Dictionary                   Coverage (terms)
Diseases                     81,119
Symptoms                     5,505
Organisms                    23,941
Organs/Body parts            73,863
Functional concepts          3,885
Treatments and procedures    134,782
Drugs/Chemicals              264,709
Proteins                     42,117
Genes                        58,300
TOTAL                        688,221
44. Daedalus Technology in the Health sector
Textalytics eHealth
! Daedalus technology for semantic enrichment: Textalytics eHealth
45. Daedalus Technology in the Health sector
Textalytics eHealth
Use case: integration in other linguistic processing platforms: GATE
! GATE: General Architecture for Text Engineering
! GATE is a tool aimed at non-technical staff for analyzing large collections of texts by combining different linguistic processes.
47. Daedalus Technology in the Health sector
Who we are
! Since 1998 we have offered solutions and services for the information society.
! Private limited company.
! Our main line of activity focuses on extracting meaning from multimedia content in order to maximize the monetization of the content managed by our customers.
! Clients: big companies in all sectors: media, defense, telecommunication, energy, public
administration, etc.
! Vocation: innovation, with active participation in national and European R&D projects.
48. Daedalus Technology in the Health sector
DAEDALUS, S.A.
Head Office:
López de Hoyos 15
28006 Madrid
Technical Department:
Edificio Vallausa II
Albufera 321
28031 Madrid
Tel: +34 913.32.43.01
info@daedalus.es
http://www.daedalus.es