Learn how to use Pathway Studio to explore biomarkers and brain regions. With the addition of highly sophisticated visualization tools, users can interactively explore the vast number of connections created to help unravel disease biology. In addition, an innovative new taxonomy based on brain region identifications will be presented. Together, these innovations can be applied to rapidly increase the knowledge of diseases based on published findings.
2. | 2
Outline of this discussion
• Introduction to Pathway Studio®
• CellEffect™ Module
• Biomarker Identification
• Literature Quality Metrics
• Brain Regions
4. | 4
The volume of life science literature is exploding
Rapidly approaching 1M new citations/year in Medline: HOW TO KEEP UP?
One way is to automatically extract relevant information from scientific publications on a massive scale, using Elsevier’s proprietary MedScan® NLP technology.
5. | 5
How does it all work?
Natural language processing (NLP)
• syntactic and semantic analysis of text
• synthesize a structured representation.
Essential facts are extracted
• predefined fact types
• information triplets (subject–verb–object).
Domain ontologies identify types, properties, and interrelationships of relevant entities in the biomedical literature.
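To make the triplet idea concrete, here is a toy sketch. It is not MedScan: the mini-ontology, the verb list, the `Triplet` class, and `extract_triplets` are all hypothetical, and a plain regular expression stands in for real syntactic and semantic analysis. The sample sentence is made up for illustration.

```python
import re
from dataclasses import dataclass

# Hypothetical mini-ontology: entity name -> entity type (illustration only).
ONTOLOGY = {
    "BDNF": "Protein",
    "FKBP5": "Protein",
    "schizophrenia": "Disease",
    "neurogenesis": "CellProcess",
}
# Hypothetical verbs that signal a relation between two entities.
VERBS = ("regulates", "inhibits", "promotes", "is associated with")

# Lower-cased lookup so matching can ignore case.
_CANON = {name.lower(): name for name in ONTOLOGY}
_ENTITY = "|".join(re.escape(name) for name in ONTOLOGY)
_VERB = "|".join(re.escape(verb) for verb in VERBS)
_PATTERN = re.compile(
    rf"\b({_ENTITY})\s+({_VERB})\s+({_ENTITY})\b", re.IGNORECASE
)


@dataclass(frozen=True)
class Triplet:
    subject: str
    subject_type: str
    verb: str
    obj: str
    obj_type: str


def extract_triplets(sentence: str) -> list[Triplet]:
    """Return subject-verb-object triplets whose entities are in ONTOLOGY."""
    triplets = []
    for match in _PATTERN.finditer(sentence):
        subject = _CANON[match.group(1).lower()]
        obj = _CANON[match.group(3).lower()]
        triplets.append(
            Triplet(subject, ONTOLOGY[subject], match.group(2).lower(),
                    obj, ONTOLOGY[obj])
        )
    return triplets


if __name__ == "__main__":
    # Made-up example sentence.
    for triplet in extract_triplets("FKBP5 is associated with schizophrenia."):
        print(triplet)
```

The ontology lookup is what turns a bare string match into a typed fact: each extracted entity carries its type, so downstream filters can ask for, say, only Protein-to-Disease relations.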
6. | 6
Where does it all come from?
25M+ abstracts from Medline® and 10,000 journal titles covered
4M+ full-text journal articles from Elsevier and other leading publishers
6.1M+ unique relations (biological facts) supported by 35M+ references (articles)
Big data, updated weekly
8. | 8
Information Extraction (IE)
• What is done by IE?
Take a natural-language text from a document source and extract essential facts about one or more predefined fact types.
Represent each fact with a template whose slots are filled on the basis of what is found in the text.
15. | 15
Have you seen this cell?
Labels: full name, nickname, aka, formerly known as, scars and marks, for short
16. | 16
Defining cell types: from inconsistent names to standard names
Name components: Epitope, “Attribute”, Basic cell
Example: CD4+ CD25+ regulatory T cell
Variants found in text:
• T-lymphocyte, T-cell
• leukocyte, leucocyte
• hemopoetic, hemopoietic, haemopoetic, haemopoietic, hematopoetic, hematopoietic, haematopoetic, haematopoietic
• regulatory, immunoregulatory
• CD4+CD25+, CD25+FOXP3+, CD4+ CD25+ FOXP3+
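One way to picture the step from inconsistent names to a standard name is a variant lookup applied to the raw string. The sketch below is illustrative only: `SYNONYMS`, `HEMATOPOIETIC`, and `normalize_cell_name` are hypothetical names, the choice of which spelling counts as "standard" is an assumption, and none of this reflects how Pathway Studio actually performs the mapping.

```python
import re

# Hypothetical variant -> standard term table, drawn from the variants on this slide.
# Which form is treated as "standard" is an assumption made for this sketch.
SYNONYMS = {
    "t-lymphocyte": "T cell",
    "t-cell": "T cell",
    "leucocyte": "leukocyte",
    "immunoregulatory": "regulatory",
}

# One pattern covers all eight spellings listed on the slide
# (hemopoetic, haemopoietic, hematopoetic, ...).
HEMATOPOIETIC = re.compile(r"\bha?em(?:at)?opoi?etic\b", re.IGNORECASE)


def normalize_cell_name(name: str) -> str:
    """Map a free-text cell name onto a single standard spelling."""
    # Separate epitope markers that are glued together: CD4+CD25+ -> CD4+ CD25+
    text = re.sub(r"\+(?=\S)", "+ ", name.strip())
    text = HEMATOPOIETIC.sub("hematopoietic", text)
    for variant, standard in SYNONYMS.items():
        # (?<!\w) / (?!\w) keep a variant from matching inside a longer word.
        text = re.sub(
            rf"(?<!\w){re.escape(variant)}(?!\w)", standard, text,
            flags=re.IGNORECASE,
        )
    # Collapse any runs of whitespace left over from the substitutions.
    return re.sub(r"\s+", " ", text)


if __name__ == "__main__":
    print(normalize_cell_name("CD4+CD25+ immunoregulatory T-lymphocyte"))
    print(normalize_cell_name("haemopoetic stem cell"))
```

A lookup like this only handles variants it has already seen; a production system would presumably rely on a curated ontology rather than a hand-written table.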
17. | 17
Adding cell processes to the mixture
[Figure: process terms (proliferation of, death of, migration of, polarization, cytotoxicity, quantity) and an organism qualifier (human) are attached to the standard cell name]
Allows Pathway Studio to:
• Have more specific information about cells and
associated cell processes in the database
• Assign specific cell processes to rare cell types
18. | 18
Recognizing cell processes in text
• Information about more specific cell types
• Doubles the number of cell processes compared to Gene Ontology + EmTree
21. | 21
The common denominator: biomarker candidates for behavior are present across multiple psychiatric disorders
22. | 22
Not all relations are equal!
[Figure: Majority of biomarker relations are supported by a single reference. Axes: Reference # and Relation #. Series: Anxiety Disorders; Depressive Disorder, Major; Bipolar Disorder; Schizophrenia.]
[Figure: Behavioral gene relations: reference support. Axes: Relation Type (Biomarker, GeneticChange, QuantitativeChange, StateChange) and Reference # (ticks at 1, 10, 100, 1000).]
A large percentage (83.3±9.3%) of the relations reported in Pathway Studio have only one reference, and most of the relations (94.7±5.7%) have 1 to 3 references.
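Percentages like the ones above can be computed from per-relation reference counts. The following is a minimal sketch with made-up counts; `share_with_refs` and the sample data are hypothetical, and reporting mean ± sample standard deviation across disorders is an assumption about how the slide's figures were summarized.

```python
from statistics import mean, stdev


def share_with_refs(ref_counts, lo=1, hi=1):
    """Percentage of relations whose reference count lies in [lo, hi]."""
    hits = sum(1 for count in ref_counts if lo <= count <= hi)
    return 100.0 * hits / len(ref_counts)


# Made-up reference counts per relation, one list per disorder.
ref_counts_by_disorder = {
    "Schizophrenia": [1, 1, 1, 2, 5, 1, 1, 3, 1, 12],
    "Bipolar Disorder": [1, 1, 2, 1, 1, 1, 4, 1],
}

if __name__ == "__main__":
    single = [share_with_refs(c) for c in ref_counts_by_disorder.values()]
    up_to_three = [
        share_with_refs(c, lo=1, hi=3) for c in ref_counts_by_disorder.values()
    ]
    # Summarize across disorders (assumed: mean and sample standard deviation).
    print(f"single reference: {mean(single):.1f} ± {stdev(single):.1f} %")
    print(f"1 to 3 references: {mean(up_to_three):.1f} ± {stdev(up_to_three):.1f} %")
```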
23. | 23
Three weights and three scores
Three weights for each article:
Quality weight (QW), evaluated by publication age
Citation weight (CW), evaluated by citation number
Novelty weight (NW), specifies the novelty
Note: All weights use [0,1] scale
Three scores for each relation:
Quality Score (QScore), related to #reference, CW, and QW
Novelty Score (NScore), related to #reference, CW, and NW
Citation Score (CScore), related to #reference and #citations
Note: The scores of a relation are defined by the weights of the
associated articles
When relations are supported by only 1-3 references, how can we evaluate (filter) for the most reliable observations without losing valuable information such as novelty?
25. | 25
Novelty score of a relation
Balances both the number of references and how often they have been cited, restricted to those relations whose references all appeared within the last (n) years.
Table 1. Top novel biomarker candidates for Schizophrenia by NScore (NoveltyAge = 2)
NScore: the novelty of a literature-search-based relation
CScore: the total citation number of the supporting references
\[ NScore = \sum_{i=1}^{n} (CW_i + NW_i) \cdot \prod_{i=1}^{n} NW_i \]
\[ CScore = \sum_{i=1}^{n} N_{cite,i} \]
Entity          Entity Type                  NScore  Reference #  citNum  CScore  PubYear     PMID
MIR212          Disease -> Protein           3.811   2            12, 5   17      2014, 2015  24694668;25487174
CAMKK2          Disease -> Protein           3.697   2            8, 3    11      2014, 2015  23958956;25497042
STXBP1          Disease -> Protein           2.743   2            0, 2    2       2014, 2015  25069615;25662103
MIAT            Disease -> Protein           1.973   1            38      38      2014        23628989;23628989
MIR181B1        Disease -> Protein           1.914   1            12      12      2014        24694668
HIVEP2          Disease -> Protein           1.870   1            8       8       2014        24525328
MIR26A1         Disease -> Protein           1.793   1            5       5       2014        24416161
FKBP5           Disease -> Protein           1.743   1            2       2       2015        25459892
Oligodendroglia Disease -> Cell              1.667   1            3       3       2014        25173695
r_LOC102547241  Disease -> Protein           1.541   1            1       1       2015        25667193
RAB3A           Disease -> Functional Class  1.325   1            1       1       2014        25063582
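The two scores can be sketched in a few lines. CScore is simply the sum of citation counts over the supporting references, which matches the table (MIR212: 12 + 5 = 17). For NScore, this sketch reads the slide's formula as the sum of (CW_i + NW_i) multiplied by the product of the NW_i; that reading is an assumption, as is the idea that a reference older than the novelty window gets NW = 0. The slides do not say how CW and NW are derived, so the weight values below are invented, and `Reference`, `cscore`, and `nscore` are hypothetical names.

```python
from dataclasses import dataclass
from math import prod


@dataclass(frozen=True)
class Reference:
    citations: int  # N_cite: how often this article has been cited
    cw: float       # citation weight in [0, 1] (derivation not given in the slides)
    nw: float       # novelty weight in [0, 1] (derivation not given in the slides)


def cscore(refs):
    """Total citation count of the supporting references."""
    return sum(ref.citations for ref in refs)


def nscore(refs):
    """Assumed reading: sum of (CW_i + NW_i) times the product of NW_i.

    Under this reading a single reference with NW = 0 zeroes the whole score,
    which would restrict NScore to relations with recent references only.
    """
    return sum(ref.cw + ref.nw for ref in refs) * prod(ref.nw for ref in refs)


if __name__ == "__main__":
    # Citation counts from the MIR212 row of Table 1; the weights are invented.
    mir212 = [Reference(12, cw=0.9, nw=1.0), Reference(5, cw=0.5, nw=1.0)]
    print("CScore:", cscore(mir212))
    print("NScore:", round(nscore(mir212), 3))
```

With invented weights the NScore printed here will not reproduce the table's values; only the CScore can be checked against the table directly.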
28. | 28
Brain regions are identified using Automated Anatomical Labeling (AAL), which assigns neuroanatomical labels to locations in 3D space.
AAL is a software package and digital atlas of the human brain, typically used in functional neuroimaging-based research.
Amygdala                            Pallidum
Angular gyrus                       Paracentral lobule
Anterior cingulate cortex           Parahippocampal gyrus
Calcarine fissure                   Postcentral gyrus
Caudate nucleus                     Posterior cingulate cortex
Cerebellum                          Precentral gyrus
Cuneus                              Precuneus
Fusiform gyrus                      Putamen
Gyrus rectus                        Rolandic operculum
Heschl gyrus                        Superior frontal gyrus, dorsolateral
Hippocampus                         Superior frontal gyrus, medial
Inferior frontal gyrus, opercular   Superior frontal gyrus, medial orbital
Inferior frontal gyrus, orbital     Superior frontal gyrus, orbital
Inferior frontal gyrus, triangular  Superior occipital gyrus
Inferior occipital gyrus            Superior parietal lobule
Inferior parietal lobule            Superior temporal gyrus
Inferior temporal gyrus             Supplementary motor area
Lingual gyrus                       Supramarginal gyrus
Median cingulate cortex             Temporal pole: middle
Middle frontal gyrus                Temporal pole: superior
Middle frontal gyrus, orbital       Thalamus
Middle occipital gyrus              Vermis
Middle temporal gyrus               Insular cortex
Olfactory cortex
29. | 29
Network connections between ‘smoking’ and the brain region ‘Precentral gyrus’.
Hundreds of individual literature references underlie the many relations presented here.
31. | 31
Enabling neuroscientists to better access high quality data related
to brain regions and imaging biomarkers
An estimated 150,000 new relations covering 350-400 unique brain regions will be added to the full Pathway Studio mammalian database, alongside ChemEffect, DiseaseFx, and CellFx, on top of the 6M+ relations already there.
• Will be available as a customized, web-based Enterprise solution hosted on the Amazon cloud
• Will be pilot tested at a discounted rate in the neuroimaging community (starting with the NIMH)
• Will serve as the starting point for a new dedicated Neuroscience PS module including, for example, taxonomies for the recently defined NIMH RDoC biotypes (biomarker-based categories).
32. | 32
Discussion Summary
• Massive amounts of literature-based, biologically relevant information available in Pathway Studio
• Over 1800 manually curated pathways
• Highly interactive, dynamic user interface for de novo pathway construction
• Manually curated domain ontologies identify defined categories of information (e.g. Biomarkers)
• Coming soon: Literature Quality Metrics, Brain Region Database
33. Thank you for your attention!
c.cheadle@elsevier.com
elsevier.com/solutions/pathway-studio-biological-research