This document proposes a framework to map clinician-specified form terms to standardized SNOMED CT concepts by leveraging the semantic structure of clinical forms. It presents a hybrid approach that uses both linguistic matching and structural context to address challenges from term diversity and context. An empirical study on 26 real-world forms shows the hybrid method improves mapping precision by up to 18% and recall by up to 30% compared to baselines. The work demonstrates how semantic form structures can help address context and improve mapping between clinical terms and standardized concepts.
Exploiting Semantic Structure for Mapping Clinician-specified Form Terms to SNOMED CT Concepts
Ritu Khare (1,3), Yuan An (3), Jiexun Li (3), Il-Yeol Song (3), Xiaohua Hu (3), Michele Follen (1,2)
College of Medicine: Center for Women’s Health Research (1) and Obstetrics and Gynecology (2); College of Information Science and Technology (3)
Motivation, Problem, and Challenges

The elements of clinical databases are usually named after the clinical terms used in various design artifacts. These terms are supplied instinctively by the users, and hence different users often use different terms to describe the same clinical concept. This term diversity makes future database integration and analysis a huge challenge.

[Diagram: Terms (in Clinical Forms) → Mapping/Standardization → SNOMED CT Concepts]

Diversity Challenge (Well Addressed): Different clinicians specify different form terms for the same clinical concept, e.g., MRN or Med.Rec.#; VitalSigns, Constitutional, or Physical status.

Context Challenge (Less Explored): The same form term, when used in different contexts, may map to different SNOMED CT concepts, e.g., the term Respiratory in Fig. 1 and Fig. 2.

How can we leverage the semantic structure of clinical forms to map the form terms into standard SNOMED CT concepts?

[Fig. 1. A Sample Clinician Designed Form: a Patient History Form with a PATIENT section (Name; Gender: M, F; DOB; MRN) and a HISTORY section (Chief Complaints; Review of Systems: Eyes, ENMT, Respiratory).]

Structure-based SNOMED CT Mapping Framework

Key Ideas
Exploit the local semantic structure of the form tree to determine the term context and the candidate SNOMED CT semantic categories.
Select a winner semantic category, and map the term to the linguistically matching concept within the determined semantic category.

[Fig. 3. Overall Mapping Framework. Components shown: Form Tree, Semantic Information Extraction, Training Data, Form Term Structure Analyzer, Semantic Category Classification Model, Semantic Category Picker (configurable), and Structure-based SNOMED CT Specific Concept Mapping (API).]
(1) The form tree structure is analyzed to derive the form context.
(2) The classification model (Naïve Bayes) ranks the SNOMED CT semantic categories suitable for the form context.
(3) A category is picked.
(4) The most linguistically matching concept in this category is selected as the winner concept.

Results and Contributions: Empirical Study with Clinician-designed Forms

About the Data
The data includes 26 forms collected from 5 healthcare institutions. The forms contain over 1500 terms, out of which 954 (63%) are mappable to SNOMED CT concepts.

About the Methods
BASELINE: Linguistic comparison
HYBRID: Linguistic as well as structural (contextual) comparison (see Fig. 3)
HYBRID++: Linguistic as well as advanced structural comparison
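As a rough illustration of the HYBRID method (the four steps of Fig. 3), the following Python sketch ranks semantic categories with a small hand-written Naive Bayes model and then picks the closest concept name within the winning category. It is not the authors' implementation: the training examples, the context labels Diagnosis, Onset, Anatomy, and Laterality, the feature encoding, the top-1 picker, and the use of `difflib` string similarity as the linguistic matcher are all assumptions made for illustration. Only the three concept rows come from the Eyes search results on this poster.

```python
import math
from collections import Counter, defaultdict
from difflib import SequenceMatcher

# The three rows from the poster's "Eyes" search: (concept id, name, category).
# A real system would query SNOMED CT instead of a hard-coded list.
CONCEPTS = [
    ("63342001", "Sunsetting eyes", "Finding"),
    ("371110006", "Immature eyes", "Disorder"),
    ("362508001", "Both eyes, entire", "Body Structure"),
]

# Hypothetical training data: context features of a form term -> category.
TRAINING = [
    (["parent=review of systems", "sibling=respiratory"], "Finding"),
    (["parent=review of systems", "sibling=enmt"], "Finding"),
    (["parent=diagnosis", "sibling=onset"], "Disorder"),
    (["parent=anatomy", "sibling=laterality"], "Body Structure"),
]


def context_features(parent, siblings):
    """Step 1: describe a term's context by its parent and sibling labels."""
    return [f"parent={parent.lower()}"] + [f"sibling={s.lower()}" for s in siblings]


def rank_categories(features):
    """Step 2: rank categories with multinomial Naive Bayes (add-one smoothing)."""
    prior = Counter(cat for _, cat in TRAINING)
    counts = defaultdict(Counter)
    for feats, cat in TRAINING:
        counts[cat].update(feats)
    vocab = {f for feats, _ in TRAINING for f in feats}
    scores = {}
    for cat in prior:
        total = sum(counts[cat].values())
        score = math.log(prior[cat] / len(TRAINING))
        for f in features:
            # The extra +1 in the denominator reserves mass for unseen features.
            score += math.log((counts[cat][f] + 1) / (total + len(vocab) + 1))
        scores[cat] = score
    return sorted(scores, key=scores.get, reverse=True)


def map_term(term, parent, siblings):
    """Steps 3-4: pick the top-ranked category, then the closest name in it."""
    ranked = rank_categories(context_features(parent, siblings))
    winner_category = ranked[0]  # assumption: the configurable picker takes top-1
    candidates = [c for c in CONCEPTS if c[2] == winner_category]
    best = max(
        candidates,
        key=lambda c: SequenceMatcher(None, term.lower(), c[1].lower()).ratio(),
    )
    return winner_category, best


print(map_term("Eyes", "Review of Systems", ["ENMT", "Respiratory"]))
print(map_term("Eyes", "Anatomy", ["Laterality"]))
```

With this toy data, the same term Eyes maps to a Finding under Review of Systems and to a Body Structure under the hypothetical Anatomy section, which is the context challenge in miniature.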
Preliminaries: SNOMED CT and Semantic Form Trees

The Systematized Nomenclature of Medicine Clinical Terms (SNOMED CT) is a widely used medical terminology. It comprises 360,000 clinical CONCEPTS belonging to various SEMANTIC CATEGORIES. Each concept is represented using a CONCEPT ID and a FULLY SPECIFIED NAME. A simple search for the term Eyes across the UMLS SNOMED CT browser leads to the following top results:

Concept Id    Fully-specified Name    Semantic Category
63342001      Sunsetting eyes         Finding
371110006     Immature eyes           Disorder
362508001     Both eyes, entire       Body Structure

[Fig. 2. A clinical form and its equivalent Semantic Form Tree. Each node in the tree is tagged with SNOMED CT semantic categories. The form is a Patient Examination Form with a PATIENT section (Name; Gender: M, F) and an EXAMINATION section (Respiratory: symm. expan., nl perc.). The tree nodes (root, PATIENT, Name, Gender, M, F, EXAMINATION, Respiratory, Symmetric chest expansion, Normal Percussion) carry tags such as Person, Procedure, Observable Entity, Qualifier Value, and Finding.]

Findings

[Fig. 4. Mapping Results for 3 Methods: mapping precision and mapping recall of Baseline, Hybrid, and Hybrid++ on Set1 through Set5.]

Improvement due to Structural Knowledge (R = recall, P = precision)
Hybrid over Baseline: 18% (P); 2% (R)
Hybrid++ (linguistic as well as advanced structural comparison) over Hybrid: 16% (P); 23% (R)

Improvement due to Linguistic Techniques
2-3% (P); >30% (R)

[Fig. 5. Change in results with term processing, an advanced linguistic technique: precision and recall, with and without term processing, for Baseline, Hybrid, and Hybrid++.]

Implications
Structure (Fig. 4) has the ability to address the context challenge and improve the overall mapping performance.
Linguistics (Fig. 5) can improve the recall and address the diversity challenge to a large extent.

Conclusion
It is desirable to develop hybrid approaches that can address both challenges and lead to superior performance.

Future Work
Leverage other relationships of SNOMED CT and test with other vocabularies from the UMLS.
Test within larger frameworks of health information systems.
Apply other classification techniques and employ sophisticated linguistic techniques.
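The precision (P) and recall (R) figures reported in the Findings are not defined on the poster. A minimal sketch, assuming the usual definitions (precision is the share of produced mappings that are correct; recall is the share of mappable terms that are mapped correctly), might look as follows. The function name, the dictionaries, and the placeholder identifiers such as `id-mrn` are hypothetical.

```python
def precision_recall(predicted, gold):
    """Compute mapping precision and recall.

    predicted: form term -> concept id, only for terms the method chose to map.
    gold:      form term -> concept id, for every mappable term.
    """
    correct = sum(1 for term, cid in predicted.items() if gold.get(term) == cid)
    precision = correct / len(predicted) if predicted else 0.0
    recall = correct / len(gold) if gold else 0.0
    return precision, recall


# Hypothetical gold standard with four mappable terms.
gold = {
    "Eyes": "362508001",
    "MRN": "id-mrn",
    "Gender": "id-gender",
    "Respiratory": "id-resp",
}
# Hypothetical method output: three terms mapped, two of them correctly.
predicted = {
    "Eyes": "362508001",
    "MRN": "id-mrn",
    "Respiratory": "id-other",
}

p, r = precision_recall(predicted, gold)
print(f"precision={p:.2f} recall={r:.2f}")
```

Under these assumptions a method that declines to map uncertain terms can raise precision while lowering recall, which is why the Findings report the two measures separately.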
Acknowledgements
National Cancer Institute (National Biomedical Imaging Branch): Grant #P01-CA-82710-09
National Science Foundation Grants: NSF CCF 0905291, NSF CCF 1049864, and NSFC 90920005