This document discusses using network and semantic analysis to map disciplinary structures in cognitive neuroscience. It gives examples of contemporary meta-analysis tools, such as Neurosynth and the Cognitive Atlas, that synthesize knowledge in the field using semantic terminology and brain locations. The document outlines how network analysis techniques such as text network analysis can represent relations between anatomy and concept terms found in the cognitive neuroscience literature. It describes generating networks from a corpus of cognitive neuroscience articles and analyzing the conceptual, anatomical, and functional network structures that emerge. Limitations and future directions are also discussed.
Mapping the Structure of Cognitive Neuroscience
1. Mapping Disciplinary Structures
Using Network & Semantic Analysis
Greg Appelbaum
&
Elizabeth Beam
Duke University Library
Digital Scholarship Series
November 1, 2012
3. Cognitive Neuroscience Timeline
Timeline axis: 1925 to 2000
• EEG is developed (Berger 1929)
• Action potentials discovered, Hodgkin & Huxley (1938)
• “Cognitive Revolution”, 1950s - Broadbent, Chomsky, Miller…
• MRI developed, Lauterbur (1973)
• Gazzaniga and Miller coined the name “cognitive neuroscience” over martinis at the Rockefeller Faculty Club (1976)
• PET, Reivich (1979)
• TMS, Barker et al (1985)
• fMRI: BOLD response first measured, Ogawa et al (1990)
Adapted from: The Student's Guide to Cognitive Neuroscience by Jamie Ward, 2006
4. Information Curve
[Figure panel: number of studies with fMRI or functional MRI in their title/abstract]
[Figure panel: published entries by indexing service per year. Blowup shows new influx of sources (Wikipedia)]
Literature review is now somewhat of an intractable problem…
There is a profound need for tools that can synthesize.
7. Scientometrics
• The science of measuring and analyzing science.
• A robust field that incorporates numerous statistical and mathematical
methods to reveal quantitative features of science.
• The lion’s share of these involve citation-based methods and clustering
algorithms to arrive at ‘knowledge-bearing units’.
8. Scientometrics
2007 “UCSD Map of Science”, Boyack and Klavans
• The largest maps of science to date (ISI and Scopus databases).
• 7.2 million papers published in more than 16,000 journals between 2001-2005.
9. Content-Based Mapping
• As a result of the informatics revolution (particularly digital archiving) it is possible
to scrutinize the content of science on large scales with relative ease.
• In turn new meta-analytic techniques can provide powerful tools for the critical
scrutiny of what is known…. Metaknowledge!
Evans and Foster (2011) Metaknowledge, Science.
10. Network Analysis
• A network consists of nodes connected by links.
• Formal network analysis was developed by sociologists, who have
studied links between friends, terrorist cells, and disease carriers.
• You may be familiar with other popular applications of network concepts.
11. Why Apply Network Analysis
to Cognitive Neuroscience
• Cognitive neuroscience aims to understand relations between brain
anatomy and behavioral function.
• These relations are linked through experimental findings to create a
network consisting of the core empirical support of the discipline.
• The rhetoric used to express these relations should be a suitable target
for meta-analysis.
12. Network Text Analysis
In a source memory study, we used two novel approaches to data
analysis that allowed item memory strength and source memory
strength to be assessed independently. First, we identified regions in both
hippocampus and perirhinal_cortex in which activity varied as a
function of subsequent item memory strength while source memory
strength was held constant at chance levels. Second, we identified regions
in prefrontal_cortex in which activity varied as a function of
subsequent source memory strength while item memory strength
was held constant. These findings suggest that activity in the
medial_temporal_lobe is predictive of subsequent memory
strength, whereas activity in prefrontal_cortex is predictive of
subsequent recollection.
• Network text analysis takes words as nodes.
• Two nodes are linked if they co-occur in the same window of
adjacent words in the text.
• To represent the field of cognitive neuroscience, we visualized
links between anatomy and concept terms.
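The co-occurrence rule in the bullets above can be sketched in a few lines of Python. This is a minimal illustration, not the software used for the study: the function name `cooccurrence_links`, the window size of 6, the left-to-right link direction, and the small term set are all assumptions made for the example.

```python
from collections import Counter

def cooccurrence_links(tokens, terms, window=6):
    """Count directed links between terms that share a window of adjacent words."""
    links = Counter()
    for i, first in enumerate(tokens):
        if first not in terms:
            continue
        # The window holds the current word plus the next (window - 1) words.
        for second in tokens[i + 1 : i + window]:
            if second in terms and second != first:
                links[(first, second)] += 1
    return links

# Part of the example abstract from this slide, already preprocessed.
text = ("activity in the medial_temporal_lobe is predictive of subsequent "
        "memory strength whereas activity in prefrontal_cortex is predictive "
        "of subsequent recollection")
# Hypothetical term list for illustration only.
terms = {"medial_temporal_lobe", "prefrontal_cortex", "memory", "recollection"}

links = cooccurrence_links(text.split(), terms)
for (source, target), weight in sorted(links.items()):
    print(source, "->", target, weight)
```

Note that the sketch also links "memory" to "prefrontal_cortex" across the "whereas" clause, which illustrates why co-occurrence alone cannot tell what kind of association a sentence asserts.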
13. The Corpus
• Source:
• Nature Neuroscience, Neuron, Neuroimage, Journal of Neuroscience, Journal
of Cognitive Neuroscience
• January 1, 2008 through June 30, 2010
• 7,675 articles
• Criteria for inclusion:
• Use of functional magnetic resonance imaging (fMRI) for primary data collection.
• Stated goals of understanding links between the human brain and some function.
• A report of empirical data collected for the current article.
• Final Corpus: Abstracts and Titles from 1,127 fMRI studies
• Versus: EEG (346), PET (120), and TMS (109)
14. Term Classification
• Two semantic categories were defined for Anatomy and
Concept terms.
– Anatomy terms referred to one of the following:
1. A brain structure (e.g., “hippocampus”)
2. A functionally defined region (e.g., “fusiform face area”)
– Concept terms belonged to one of the following categories:
1. A domain of cognitive neuroscience (e.g., “memory”)
2. A process within a domain (e.g., “working memory”)
3. A stimulus property (e.g., “face” or “risk”)
15. Text Preprocessing
• Normalized for grammatical variants of terms
• Because standard thesauri do not cover neuroanatomical
terms, nor the jargon of cognitive neuroscience, we authored
custom thesauri.
– BIGRAM THESAURUS:
• prefrontal cortex prefrontal_cortex
– GENERALIZATION THESAURUS:
• pfc, prefrontal, prefrontal cortices prefrontal_cortex
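As a rough sketch of how the two thesauri might be applied, the Python below first collapses multi-word phrases with underscores and then maps variants onto one canonical form. The dictionary contents, the `normalize` function, and the punctuation handling are hypothetical; only the example entries come from the slide.

```python
import re

# Hypothetical entries modeled on the slide's examples.
BIGRAM_THESAURUS = {
    "prefrontal cortex": "prefrontal_cortex",
    "prefrontal cortices": "prefrontal_cortices",
}
GENERALIZATION_THESAURUS = {
    "pfc": "prefrontal_cortex",
    "prefrontal": "prefrontal_cortex",
    "prefrontal_cortices": "prefrontal_cortex",
}

def normalize(text):
    """Lowercase, collapse known phrases, then map variants to canonical terms."""
    text = re.sub(r"[^\w\s]", " ", text.lower())  # drop punctuation, keep underscores
    # Replace longer phrases first so they are not split by shorter ones.
    for phrase in sorted(BIGRAM_THESAURUS, key=len, reverse=True):
        pattern = r"\b" + re.escape(phrase) + r"\b"
        text = re.sub(pattern, BIGRAM_THESAURUS[phrase], text)
    return [GENERALIZATION_THESAURUS.get(token, token) for token in text.split()]

tokens = normalize("The PFC and prefrontal cortices; prefrontal cortex.")
print(tokens)
```

Applying the bigram step before the generalization step matters here: otherwise the bare word "prefrontal" inside "prefrontal cortex" would be rewritten on its own and leave a stray "cortex" token.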
16. The Top Terms
Concept terms        Frequency    Anatomy terms              Frequency
1. Vision            637          1. PFC                     356
2. Memory            556          2. Amygdala                329
3. Behavior          497          3. ACC                     272
4. Information       490          4. Hippocampus             269
5. Attention         488          5. Parietal Cortex         227
6. Representation    450          6. Visual Cortex           177
7. Control           449          7. Intraparietal Sulcus    152
8. Object            442          8. mPFC                    138
9. Perception        439          9. Insula                  131
10. Cognition        434          10. Cerebellum             129
17. Network Generation
6 word sliding window
Example text: “These results support the hypothesis that specific subregions in the MTL are associated with item memory and memory for context.”
~50 highest weighted links
18. 3 Networks
Conceptual Structure:
-- concept x concept
Anatomical Structure:
-- anatomy x anatomy
Functional Structure:
-- (anatomy x concept + concept x anatomy)
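One way to picture the three structures is as a partition of a single set of weighted links by the class of the two terms involved. The sketch below assumes links are stored as a dictionary keyed by (source, target) pairs; the function name, the toy weights, and the term sets are hypothetical.

```python
def split_networks(links, anatomy, concepts):
    """Partition weighted links into conceptual, anatomical, and functional sets."""
    networks = {"conceptual": {}, "anatomical": {}, "functional": {}}
    for (source, target), weight in links.items():
        if source in concepts and target in concepts:
            key = "conceptual"
        elif source in anatomy and target in anatomy:
            key = "anatomical"
        elif (source in anatomy and target in concepts) or (
            source in concepts and target in anatomy
        ):
            key = "functional"  # anatomy x concept plus concept x anatomy
        else:
            continue  # at least one word is not a classified term
        networks[key][(source, target)] = weight
    return networks

# Hypothetical link weights for illustration only.
links = {
    ("memory", "retrieval"): 12,
    ("hippocampus", "amygdala"): 4,
    ("hippocampus", "memory"): 9,
    ("memory", "hippocampus"): 7,
    ("memory", "study"): 3,
}
networks = split_networks(
    links, anatomy={"hippocampus", "amygdala"}, concepts={"memory", "retrieval"}
)
print({name: len(edges) for name, edges in networks.items()})
```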
21. Centrality Measures
• DEGREE: The number of direct connections that a node has.
– Nodes with high degree are “connectors” or “hubs.”
– Degree is highly correlated (r ≈ .8) with frequency.
• BETWEENNESS: The number of shortest paths between all pairs of
other nodes that pass through that node.
– Nodes with high betweenness are “brokers.”
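To make the two measures concrete, here is a small pure-Python sketch. It assumes an undirected, unweighted network stored as a dictionary of neighbor sets, and it uses Brandes' algorithm for betweenness, where each pair of nodes contributes the fraction of its shortest paths that pass through a node. The toy network is hypothetical, and the sketch ignores link weights and direction, so it is not a reproduction of the measures reported in the talk.

```python
from collections import deque

def degree(adj):
    """Number of direct connections for each node."""
    return {node: len(neighbors) for node, neighbors in adj.items()}

def betweenness(adj):
    """Brandes' algorithm for an undirected, unweighted graph."""
    score = {node: 0.0 for node in adj}
    for source in adj:
        order = []                          # nodes in order of discovery
        preds = {node: [] for node in adj}  # predecessors on shortest paths
        paths = {node: 0 for node in adj}   # number of shortest paths from source
        paths[source] = 1
        dist = {source: 0}
        queue = deque([source])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    paths[w] += paths[v]
                    preds[w].append(v)
        # Accumulate dependencies from the farthest nodes back to the source.
        delta = {node: 0.0 for node in adj}
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += paths[v] / paths[w] * (1 + delta[w])
            if w != source:
                score[w] += delta[w]
    # Each undirected pair was counted once from each endpoint.
    return {node: value / 2 for node, value in score.items()}

# Hypothetical toy network for illustration only.
edges = [("encoding", "memory"), ("retrieval", "memory"), ("memory", "control"),
         ("control", "attention"), ("attention", "vision")]
adj = {}
for a, b in edges:
    adj.setdefault(a, set()).add(b)
    adj.setdefault(b, set()).add(a)

print(degree(adj))
print(betweenness(adj))
```

In this toy network "memory" is both the hub (degree 3) and the main broker, while "control" has a lower degree but still brokers every path between the memory terms and the attention terms.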
22. Conceptual Structure
[Figure: Betweenness vs. Frequency. Scatter plot of concept terms, with Frequency on the x-axis (0 to 600) and Betweenness Centrality on the y-axis (0 to 0.06).]
28. Structural Synonymity
Conceptual Network Anatomical Network
• Second-level Positional Analyses
– Second-order projections link terms that occupy similar positions in
the network and therefore represent semantic synonyms.
– Computed as the correlation coefficient between each row in the
adjacency matrix of link weights
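The row-correlation step can be sketched as follows. The matrix values, term names, and function names are hypothetical, and treating a constant row as having zero correlation is an assumption of this sketch rather than something stated in the slides.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length rows."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # correlation undefined for a constant row; assumed 0 here
    return cov / sqrt(var_x * var_y)

def second_order(terms, matrix):
    """Correlate every pair of rows in the adjacency matrix of link weights."""
    return {
        (terms[i], terms[j]): pearson(matrix[i], matrix[j])
        for i in range(len(terms))
        for j in range(i + 1, len(terms))
    }

# Hypothetical link weights; rows and columns follow the order of `terms`.
terms = ["anticipation", "future", "face", "object", "reward"]
matrix = [
    [0, 0, 1, 0, 6],  # anticipation
    [0, 0, 1, 0, 6],  # future: same link profile as anticipation
    [1, 1, 0, 7, 0],  # face
    [1, 1, 7, 0, 0],  # object
    [6, 6, 0, 0, 0],  # reward
]
similarity = second_order(terms, matrix)
print(round(similarity[("anticipation", "future")], 3))
```

Two terms with identical link profiles correlate at 1 even if they are never directly linked to each other, which is what lets the projection surface candidate synonyms.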
29. Observations
• Positive structure. Whereas concept terms organize
around hubs for perception/attention, representation, and
control, a few highly central anatomy terms lead into
branches representing processing streams.
• Negative structure. Islands appear as collections of
isolated terms on the networks, while gaps are revealed
by network measures as terms with high betweenness
centrality relative to frequency.
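A simple way to operationalize "high betweenness relative to frequency" is to fit a least-squares line of betweenness on frequency and rank terms by their residuals. The sketch below does that with placeholder terms and made-up values; the use of ordinary least squares is an assumption here, not a description of the analysis behind the slides.

```python
def regression_residuals(frequency, betweenness):
    """Residual of each term's betweenness from a least-squares fit on frequency."""
    terms = sorted(frequency)
    n = len(terms)
    mean_f = sum(frequency[t] for t in terms) / n
    mean_b = sum(betweenness[t] for t in terms) / n
    sxx = sum((frequency[t] - mean_f) ** 2 for t in terms)
    sxy = sum((frequency[t] - mean_f) * (betweenness[t] - mean_b) for t in terms)
    slope = sxy / sxx
    intercept = mean_b - slope * mean_f
    return {t: betweenness[t] - (intercept + slope * frequency[t]) for t in terms}

# Placeholder terms and values, invented for illustration only.
frequency = {"term_a": 100, "term_b": 200, "term_c": 300, "term_d": 400}
betweenness = {"term_a": 0.03, "term_b": 0.01, "term_c": 0.02, "term_d": 0.04}

residuals = regression_residuals(frequency, betweenness)
# Terms with the largest positive residuals are candidate "gaps".
for term in sorted(residuals, key=residuals.get, reverse=True):
    print(term, round(residuals[term], 4))
```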
30. Limitations #1
• A not B problem
– The co-occurrence of two terms in text could reflect a positive association, a negative
association, or even the speculation about an unknown association.
– Without further (difficult) coding, we cannot resolve this problem.
– See however, typical Google search queries…
• Curation
– How to move this to an autonomous process?
32. Future Directions
• Larger longitudinal corpus to
map relationships over time.
• Extend these tools beyond
cognitive neuroscience.
• Develop web-based tools
…that interface with other existing
meta-analysis tools.
Contour density maps show a bird’s-eye
view of the landscape.
33. Conclusions
• Using this approach we are able to endogenously map the
knowledge space of cognitive neuroscience.
• We are able to identify terms that are understudied
compared to their importance.
• These results can provide prescriptive recommendations
for topics whose further study will most efficiently build new
links between structure and function.
36. An Integrated Approach to the Collection and Analysis of Network Data*
Kathleen M. Carley, Jana Diesner, Jeffrey Reminga, Maksim Tsvetovat
37. Cognitive Neuroscience:
Journal Citation and Topic Maps
• All articles from authors who contributed to the Summer Institute Cognitive
Neuroscience. (5 year intervals)
Journal Citation Map
1988 2007
CNS Topic Maps
Bruer (2004) Mapping Cognitive Neuroscience: Two‐dimensional perspectives on twenty years of cognitive neuroscience research
38. Prospective Trace of a Sub-Discipline (Bruer, 2010)
• The author set consists of 28
authors highly active in attentional
research in the mid-1980s.
• “By 1990 a distinct cognitive
neuroscience specialty cluster
emerges, dominated by authors
engaged in brain imaging research”
Bruer, J.T. (2010) Can we talk? How the cognitive neuroscience of attention emerged from
neurobiology and psychology, 1980–2005. Scientometrics.
39. Neuroeconomics
the neural basis of decision making
Levallois et al (under review) Translating Upwards: Linking the Neural and Social
Sciences via Neuroeconomics,
41. Scientometric Interpretations of
Network Structure
Future Directions
① Intrinsic structure
– Connectivity and position of nodes
– Clusters within the network
② Global structure
– Network density and topology
– Changes in networks over time
The local neighborhood of our 5 journals (around here)
Contour density maps show a bird’s-eye view of the landscape.
42. Scientometrics
2002 Map of Science: Boyack and Klavans
Bibliographic Coupling of SCI & SSCI to arrive at ‘Neighborhoods’ and ‘Disciplines’.
– 730,000 papers, 7,300 journals, 671 “disciplines”
43. References
Carley, K. M. and Reminga, J. (2004) ORA: Organization Risk Analyzer. CASOS Technical
Report CMU-ISRI-04-106, 1-45.
Carley, K. M. (2006) A dynamic network approach to the assessment of terrorist
groups and the impact of alternative courses of action. Visualizing Network
Information KN1, 1-10.
Moody, J. and Light, R. (2006) A view from above: the evolving sociological landscape.
The American Sociologist 37.2, 67-86.
Moody, J. (2011) Introduction to Social Network Analysis. Social Science Research
Institute Workshop.
Moreno, J. L. (1953) Who shall survive? New York: Beacon House.
Newman, M. E. J. (2004) Co-authorship networks and patterns of collaboration.
Proceedings of the National Academy of Sciences 101.s1, 5200-5205.
I show people stuff, record their brain activity, and try to figure out how the stuff in between works…The techniques measure changes in electrical voltage potentials or regional blood oxygenation when the brain is subjected to various stimuli
And things have really taken off at this point for CNS… this plot just goes to 2007 but the growth is exponential. What is more, there is evidence that historically fMRI studies are cited ~3 times as often as other CNS methods (Fellows et al JOCN 2005). These information curves are reflective of the larger literature. Need meta-analysis tools.
http://neurosynth.org/
Activation coordinates are extracted from published neuroimaging articles using an automated parser. The full text of all articles is parsed, and each article is 'tagged' with a set of terms that occur at a high frequency in that article. A list of several thousand terms that occur at high frequency in 20 or more studies is generated. For each term of interest (e.g., 'emotion', 'language', etc.), the entire database of coordinates is divided into two sets: those that occur in articles containing the term, and those that don't. A giant meta-analysis is performed comparing the coordinates reported for studies with and without the term of interest. In addition to producing statistical inference maps (i.e., z and p value maps), we also compute posterior probability maps, which display the likelihood of a given term being used in a study if activation is observed at a particular voxel.
Cognitiveatlas.org
The Cognitive Atlas is a collaborative knowledge building project that aims to develop a knowledge base (or ontology) that characterizes the state of current thought in cognitive science. The project is led by Russell Poldrack, Professor of Psychology and Neurobiology at the University of Texas at Austin in collaboration with the UCLA Center for Computational Biology (A. Toga, PI) and UCLA Consortium for Neuropsychiatric Phenomics (R. Bilder, PI). It is supported by grant RO1MH082795 from the National Institute of Mental Health.
Meta data about institutions, countries of origin, patents, etc… can be included
Can be used to map institutional strategies… e.g. NSF and NIMH. Are used to determine which fields are most closely connected, which produce the most patents, and which are the most intellectually vital. Connectedness coefficients between fields are calculated year-by-year to measure change. Authors found that connectivity is going up in ‘distant’ fields. This indicates that science is in a state of change.
A network consists of nodes connected by edges, in the mathematical lingo, or what we call “links.” When you’re clicking through pages on Facebook, you can imagine that you are a node linked to your friend John, another node, and that John is linked to a few of your friends in addition to people that you aren’t friends with. This sort of social network is a complex, hierarchical structure, in which you can be indirectly connected to millions of other people (750 million users) around the world through your few hundred friends. The idea of social networks is not new at all to sociologists, who in the 1930’s first used what they referred to as “sociograms” to describe interpersonal relations. More recently, social network analysis has been applied to study networks of terrorist cells and scientific journal co-authors. Sociologists have also developed mathematical tools to measure network properties, as well as computer programs to visualize large networks. The concept of networks is also important for neuroscientists. We all know that behaviors are not controlled by discrete brain areas working independently of each other, but rather by complex anatomical networks of a number of distally connected regions. In learning and memory research, a more sophisticated neural network model has been proposed for an associational organization of memories.
The sources of bias: (1) Across subfields, research on a given topic may be advanced to varying degrees. The boundaries between those subfields may be more or less permeable. (2) Imprecision in terminology leads to unnecessary distinctions and unwanted conflations. What one calls “working memory” may be referred to as “cognitive control” by another. (3) Once a brain region is linked to a function, knowledge of that link can shape the direction of future research. This may lead to reification of concepts as new researchers apply old labels to their findings.
We’ve applied the idea of a social network to study the semantic structure of cognitive neuroscience. Instead of people or neurons, our nodes are words from the abstracts of fMRI articles. The idea is that, typically speaking, authors use anatomy and concept words together when they are describing a relationship between a brain area and a behavior or cognitive process. By searching for word co-occurrences, we extract this relational information. Then, we integrate these relationships across the literature into a network structure.
Functional MRI was selected largely because of its popularity: it was the most widely used human neuroimaging technique in the unfiltered pool. Additionally, by restricting our analysis to studies that employed a common neuroimaging method, we minimized differences in terminology and rhetoric. Because it was by far the most common human neuroimaging method, fMRI was a good PROXY for the complete literature. The curation of the corpus involved parsing through those 7,675 articles in the selected journals and time range to identify those that used fMRI to study human behavior or cognition. This meant excluding articles that referred to fMRI for atlas generation, in studies of fMRI methods or the hemodynamic response, etc.
Next, we generated a list of all words in the corpus of abstracts. This included over 15,000 unique words. The list was sorted by frequency, and the 100 most frequent Anatomy and Concept terms were identified.
First, a bigram thesaurus was created to collapse word pairs to single words by replacing spaces with underscores. This involved generating a list of words that co-occurred, and identifying those that fit one of our semantic categories. The process was then iterated for longer phrases: for example, “primary somatosensory cortex” was first converted to “primary_somatosensory cortex.” A new co-occurrence list was generated, and we found that “primary_somatosensory” appeared with “cortex.” Second, a generalization thesaurus was created to normalize for plurals, acronyms, and hyphenated compounds. To avoid manually searching the entire list, we only normalized for variants that were more common than the 100th most frequent word on the list. Finally, these terms were converted into presentable titles for the visualization.
The terms used to generate the networks were the 100 most frequent word forms to appear in the text after preprocessing. The final judgment of term appropriateness for the two lists was made by two expert raters (authors LGA and SAH) who evaluated every candidate term. It is interesting to note that concept words are, on the whole, used more frequently. Though not shown, even the minimum frequencies for words to make the top-50 lists were higher for concepts than anatomy (50 vs. 15). This suggests that a larger portion of each abstract is devoted to discussing the behavioral components of the experiment, results, and implications. We’ll see that these frequency shifts bear implications for the networks, resulting in a higher network density for concept vs. anatomy terms.
Automap software was used to generate a meta-network comprised of links within and between Anatomy and Concept node classes. A link was identified as the co-occurrence of two terms within a moving window of six adjacent words that appeared in the same sentence. Links were directed from the first to the second term, as read from left to right across the text within the window. Link weights were calculated from the sum of term co-occurrences throughout the corpus and were used to construct the three networks: Conceptual (Concepts to Concepts), Anatomical (Anatomy to Anatomy), and Functional (Anatomy to Concepts and Concepts to Anatomy). In order to create an interpretable visualization, we applied a filter to link weights so that only the top fifty nodes with the most highly weighted links are shown.
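The visualization filter can be approximated by keeping only the k most heavily weighted links and the nodes they touch. This is a sketch under that assumption; the exact rule applied by the visualization software is not specified here, and the function name and toy weights are hypothetical.

```python
def top_links(links, k):
    """Keep the k highest-weighted links, the nodes they touch, and the cutoff weight."""
    # Sort by descending weight; break ties alphabetically for reproducibility.
    ranked = sorted(links.items(), key=lambda item: (-item[1], item[0]))
    kept = dict(ranked[:k])
    nodes = {term for pair in kept for term in pair}
    threshold = min(kept.values()) if kept else None
    return kept, nodes, threshold

# Hypothetical link weights for illustration only.
links = {
    ("attention", "control"): 5,
    ("control", "selection"): 3,
    ("memory", "retrieval"): 9,
    ("retrieval", "encoding"): 1,
}
kept, nodes, threshold = top_links(links, k=2)
print(sorted(kept), sorted(nodes), threshold)
```

The returned threshold plays the role of the minimum weight cutoff: a denser network reaches its k-th link at a higher weight than a sparse one.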
The Conceptual, Anatomical, and Functional networks are substructures within the larger meta-network.
Represents the “cognitive” or psychological underpinnings of the cognitive neuroscience field. Intuitive arrangement… “Attention,” for example, is connected to “control,” “top-down,” “selection,” “spatial,” and “vision.” There appears to be a central hub of words including “control,” “attention,” “vision,” “object,” “representation,” and “motor” that relate to different domains in cognitive neuroscience. It is notable that “memory,” the second most frequent term, is disconnected from the central hub, appearing at the center of its own cluster off to the side. This is the densest network of the three. The minimum weight threshold is 51, which is the highest threshold of all the networks. Several disconnected islands, e.g. neuroeconomics.
Before going further, it is useful to consider the connections within the island and between the island and the mainland had we A) picked a lower threshold and B) picked a higher threshold. In either case the same set of nodes would have been largely isolated from the rest of the network. Bootstrapping empirical approaches will be useful in quantifying this further, but here is anecdotal evidence.
There are a few measures that can be used to quantify the relative position of a node in the network. Degree centrality is the raw number of a node’s connections, and this is highly correlated (r~.8) with frequency in our network. Note that we compute betweenness centrality that does incorporate link weight.
The plot of frequency shows that betweenness and frequency are loosely correlated, as the more frequent terms tend to serve linking roles in the network. Nonetheless there are a number of interesting nodes which are more or less between than predicted by frequency. There is a cluster of highly frequent terms in the bottom right corner below the regression line, indicating that they have lower betweenness than expected by their frequency. This suggests that words like “vision,” “memory,” and “reward,” which refer to distinct domains, tend to be used in the context of only their own domains. At the top of the plot, words like “selection,” “emotion,” and “control” refer to processes that span those domains. While not as frequent, these terms create bridges throughout the network.
The network of anatomy terms is considerably less dense than the concepts network, and it lacks a central hub or self-organized clusters. Instead, it appears to be dominated by three terms: “PFC,” “amygdala,” and “ACC.” From these and other common terms, branches emerge relating to processing pathways. For example, “insula” gives rise to a sensorimotor pathway linking cortical and subcortical regions. Finally, there are several groups of terms that are entirely disconnected from the network. These correspond to visual regions and prefrontal regions.
The plot of betweenness vs. frequency shows a cluster of terms in the top left which are considerably more between than expected by frequency. Insula, for example, has the second highest betweenness despite coming in only 9th on the frequency list. This is likely due to its important role in connecting the somatosensory branch to the main network. The most notable outlier is “thalamus,” which is the most highly between term despite its relatively low position at 15th on the frequency list. Its high betweenness is likely due to its direct connections to both “amygdala” and “ACC,” two of the most frequent and central nodes in the network.
Drilling down on “thalamus” shows that it is connected to a great many more nodes than shown on the thresholded network. Its low-level connections to “hippocampus,” “mPFC,” and “parietal cortex” likely contribute to its high betweenness. This is an important reminder that although the network visualization only displays nodes with the strongest connections, the measures are based on the connectivity of the entire network.
The most direct representation of cognitive neuroscience. The network structure appears to be driven by anatomy terms, which have a higher relative betweenness centrality. Conversely, high-frequency concept terms show up on the margins of the network. Several are pendants with only one other connection, such as “emotion” and “observation.” There are also quite a few dyads and triads of words disconnected from the network and with only one or two above-threshold connections. The disconnectedness and low density of this network suggest that there is room to strengthen the links between anatomy and function in the field of cognitive neuroscience.
Clearly, the anatomy and concepts terms fall along different regression lines. The anatomy terms have higher betweenness, reflecting their central positions on the network. Because the anatomical terms are less frequent, the network structure depends on how concepts are arranged around the anatomical terms.
Second-order projection networks were computed by correlating across each row in the adjacency matrix of link weights. They show links between terms that occupy similar positions in the network, and we thus call them “semantic synonyms.” Linked terms may:
• Have similar meaning (e.g. Anticipation and Future)
• Come from the same circumscribed area of the literature (risk and reward)
• Suggest aspects of the literature that deserve further refinement
Retrodictive forecasting
Produces an unbiased synthesis. Our approach characterizes how cognitive neuroscience presents itself to the larger scientific community, through the summaries of individual articles within their titles and abstracts. Additional studies of their function would have the greatest effects on the overall character of the network, so we identify them as particularly important targets for future research.
Bruer himself characterizes these as “toy maps” that may not be representative of the whole discipline. 1988 and 2007 journal citation map. Asymmetric between-journal co-citation (not author-wise or directional). Hub‐authority journals are black nodes, authority journals dark grey nodes, hub journals light grey nodes. Nodes are proportional to hub scores, authority scores, and hub + authority score for the black nodes. 1988 & 2007 cognitive neuroscience topic map. Topic maps are generated from “stop-listed title words” of all the articles in the citation maps. Co-occurrence matrices are computed for words that occur at least 11 times in the article titles that year, and the spatial arrangement is ‘spring loaded’ using the Kamada-Kawai algorithm. Node size proportional to log of word occurrences.
From Bruer (2004): “On the small scale employed in this study, one might hope to see how cognitive neuroscience emerged from its progenitor disciplines (systems neuroscience, cognitive psychology, neuropsychology) by noting changes in co‐citation patterns among the progenitor discipline journals and possibly through the appearance of new cognitive neuroscience journals. These maps also allow us to visualize the citation flow among journals and to assess how results and ideas flowed among them. For an interdisciplinary field emerging from a multi‐disciplinary foundation, like cognitive neuroscience, one might be able to see how, for example, ideas and results from neuroscience fed into psychology, from psychology into neuroscience, or both.”
The global structure includes all of the nodes within the defined boundaries of the network. It can be described by density, or the degree of interconnectedness of the nodes. It can also be described by topology, defined as “the shape, or form, of a network,” i.e., which nodes are connected to which other nodes (Moody). The global structure can be expected to change in datasets collected from different time periods. The intrinsic structure of a network arises from the position, connectivity, and centrality of individual nodes. Connectivity refers to the direct relations a node has to other nodes. Its position within the network is dependent on the connectivity of all nodes, and its centrality is the degree to which the node is located at an important position within the network.
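Density, as described above, is commonly computed as the share of possible links that are actually present. The short sketch below uses that standard definition for a network without self-links; the function name and the toy node and link lists are hypothetical.

```python
def density(nodes, links, directed=True):
    """Fraction of possible links that are present (no self-links)."""
    n = len(nodes)
    if n < 2:
        return 0.0
    possible = n * (n - 1) if directed else n * (n - 1) // 2
    return len(links) / possible

# Hypothetical toy network for illustration only.
nodes = {"pfc", "amygdala", "acc", "insula"}
links = {("pfc", "amygdala"), ("amygdala", "acc"), ("acc", "insula")}

print(density(nodes, links))                  # directed: 3 of 12 possible links
print(density(nodes, links, directed=False))  # undirected: 3 of 6 possible links
```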
Uses Bibliographic Coupling of SCI & SSCI to arrive at ‘Neighborhoods’ and ‘Disciplines’. Cog Neuro lives out here. Will contact Klavans about “science Locating it”