Machine learning is geared towards prediction. However, aside from diagnosis or prognosis in the clinic, cognitive neuroimaging strives to uncover insights from the data rather than to minimize prediction error. I review various inferences on brain function that have been drawn using pattern recognition techniques, focusing on decoding. In particular, I discuss using generalization as a test for information, multivariate analysis to interpret overlapping activation patterns, and decoding for principled reverse inference. For each, I give a statistical view and a cognitive imaging view.
Machine learning and cognitive neuroimaging: new tools can answer new questions
1. Machine learning and cognitive neuroimaging:
new tools can answer new questions
Gaël Varoquaux
How machine learning is shaping cognitive neuroimaging
[Varoquaux and Thirion 2014]
2. Cognitive neuroscience: linking psychology and
neuroscience (neural implementations)
Vision: A computational investigation into the human representation
and processing of visual information [Marr 1982]
G Varoquaux 2
4. Machine learning: computational statistics for prediction (out-of-sample properties)
Paradigm shift: the dimensionality of data grows, enables richer models
Open-ended questions ⇒ large # features
From parameter inference to prediction
Understanding, not predicting
Danger of solving the wrong problem
Lost in formalization
10. Statistics vs machine learning
Statistics | Statistical machine learning | Machine learning
Hypothesis testing | Prediction
T-test | Tests on prediction | Cross-validation
In sample | Out of sample
Parametric | Non-parametric
Non-parametric tests | Probabilistic modeling
Few parameters | Many parameters
Univariate | Multivariate
GLM = correlations | Naive Bayes
Univariate selection
Differences mostly cultural: it’s a continuum
21. 1 Uncovering neural coding: richer models
Insights on breaking down cognitive functions into
atomic steps
[Hubel and Wiesel 1962]
Neurons receptive to
Gabors (edges)
[Logothetis... 1995]
Shapes in inferior
temporal cortex
Machine learning:
computer-vision models mapped to brain activity
[Yamins... 2014]
23. Machine learning for encoding models
Richer models of encoding
capture fine descriptions of behavior / stimuli
Requires forgoing the contrast methodology
Is this a good or a bad thing?
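A minimal sketch of what such an encoding model can look like, assuming simulated data and a closed-form ridge regression from stimulus features to voxel activity. The names `fit_ridge_encoding`, `column_correlations`, `features`, and `voxels` are illustrative and do not come from the talk:

```python
import numpy as np


def fit_ridge_encoding(features, voxels, alpha=1.0):
    """Closed-form ridge regression: one weight vector per voxel."""
    n_features = features.shape[1]
    gram = features.T @ features + alpha * np.eye(n_features)
    return np.linalg.solve(gram, features.T @ voxels)


def column_correlations(a, b):
    """Pearson correlation between matching columns of a and b."""
    a = a - a.mean(axis=0)
    b = b - b.mean(axis=0)
    return (a * b).sum(axis=0) / np.sqrt((a ** 2).sum(axis=0) * (b ** 2).sum(axis=0))


rng = np.random.default_rng(0)
n_train, n_test, n_features, n_voxels = 200, 50, 10, 30

# Simulated data (assumption): voxel responses are a noisy linear
# function of the stimulus features.
true_weights = rng.normal(size=(n_features, n_voxels))
features = rng.normal(size=(n_train + n_test, n_features))
noise = 0.5 * rng.normal(size=(n_train + n_test, n_voxels))
voxels = features @ true_weights + noise

# Fit on the training stimuli, predict activity for held-out stimuli.
weights = fit_ridge_encoding(features[:n_train], voxels[:n_train], alpha=1.0)
predicted = features[n_train:] @ weights

# Score each voxel by how well its held-out activity is predicted.
scores = column_correlations(predicted, voxels[n_train:])
print(scores.shape, float(scores.mean()))
```

The model is scored per voxel on held-out stimuli, so no contrast between conditions is needed: the stimulus description itself carries the hypothesis.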
24. 1 Models of the visual system
Image → V1 cortex → V2 cortex → Inferior temporal cortex → Fusiform face area → “Jack?”
Is there a “face” region? A “foot” region? A “left big toe” region?
28. 1 Uncovering neural coding: cognitive oppositions
Is there a “face” region? A “foot” region? A “left big toe” region?
[Figure: two stimuli, contrasted (vs) and subtracted (−)]
Mapping relies on cognitive subtraction
Bound to mental process decomposition
30. 1 Decomposing visual stimuli
Low-level visual cortex is tuned
to natural image statistics
[Olshausen et al. 1996]
What drives high-level representations?
Convolutional Net
34. 2 Increased sensitivity
An omnibus test
“Given the goal of detecting the presence of a particular
mental representation in the brain, the primary advantage
of MVPA methods over individual-voxel-based methods is
increased sensitivity.” [Norman... 2006]
Is there “information” about a
stimulus in a given region?
“However, these maps are not guaranteed to include all
the voxels that are involved in representing the categories
of interest.” [Norman... 2006]
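One way to turn "is there information in this region?" into a test is to compare cross-validated decoding accuracy with its distribution under shuffled labels. The sketch below is only an illustration under stated assumptions: the data are simulated, the nearest-centroid classifier is a stand-in for whatever decoder is actually used, and `nearest_centroid_cv_accuracy` is a hypothetical helper name:

```python
import numpy as np


def nearest_centroid_cv_accuracy(X, y, folds):
    """Cross-validated accuracy of a nearest-centroid classifier."""
    correct = 0
    for fold in np.unique(folds):
        test = folds == fold
        train = ~test
        classes = np.unique(y[train])
        centroids = np.stack([X[train & (y == c)].mean(axis=0) for c in classes])
        # Squared distance of every test sample to every class centroid.
        dists = ((X[test][:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        correct += int((classes[dists.argmin(axis=1)] == y[test]).sum())
    return correct / len(y)


rng = np.random.default_rng(0)
n_samples, n_voxels = 40, 5

# Simulated region (assumption): two conditions, four sessions used as folds,
# and a mean shift between conditions in every voxel.
y = np.tile([0, 1], n_samples // 2)
folds = np.repeat(np.arange(4), n_samples // 4)
X = rng.normal(size=(n_samples, n_voxels)) + 2.0 * y[:, None]

observed = nearest_centroid_cv_accuracy(X, y, folds)

# Null distribution: the same pipeline run on shuffled labels.
n_permutations = 200
null = np.array([
    nearest_centroid_cv_accuracy(X, rng.permutation(y), folds)
    for _ in range(n_permutations)
])
p_value = (1 + np.sum(null >= observed)) / (1 + n_permutations)
print(observed, p_value)
```

A small p-value says that the region carries some information about the condition; as the second quote above warns, it does not say which voxels carry it, nor that all informative voxels were found.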
39. 2 Generalization as a test: cross-validation
High-dimensional models
⇒ Important to test on independent data,
to control for model complexity
[Figure: comparison of cross-validation strategies; axis from −40% to +40%; two intervals per strategy; legend: intra subject, inter subject]
Leave one sample out: −22% +19%; +3% +43%
Leave one subject/session: −10% +10%; −21% +17%
20% left out, 3 splits: −11% +11%; −24% +16%
20% left out, 10 splits: −9% +9%; −24% +14%
20% left out, 50 splits: −9% +8%; −23% +13%
No silver bullet (Poster 3829, Oral Th 12:45)
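The strategies named on this slide can be sketched as index generators. This is a plain NumPy illustration, not the code behind the figure; the function names are hypothetical, and leaving out 20% of the *sessions* (rather than 20% of the samples) is an assumption made here to keep test data independent of training data:

```python
import numpy as np


def leave_one_sample_out(n_samples):
    """Each sample is the test set once; its neighbours stay in training."""
    for i in range(n_samples):
        yield np.delete(np.arange(n_samples), i), np.array([i])


def leave_one_group_out(groups):
    """Each subject/session is left out as a whole."""
    groups = np.asarray(groups)
    for g in np.unique(groups):
        yield np.flatnonzero(groups != g), np.flatnonzero(groups == g)


def random_group_splits(groups, test_fraction=0.2, n_splits=10, seed=0):
    """Repeatedly leave out a random fraction of the groups."""
    rng = np.random.default_rng(seed)
    groups = np.asarray(groups)
    unique = np.unique(groups)
    n_test = max(1, int(round(test_fraction * len(unique))))
    for _ in range(n_splits):
        test_groups = rng.choice(unique, size=n_test, replace=False)
        test_mask = np.isin(groups, test_groups)
        yield np.flatnonzero(~test_mask), np.flatnonzero(test_mask)


# Hypothetical design: 10 sessions of 6 samples each.
sessions = np.repeat(np.arange(10), 6)
splits = list(random_group_splits(sessions, test_fraction=0.2, n_splits=10))
for train, test in splits[:3]:
    print(len(train), len(test), sorted(set(sessions[test].tolist())))
```

With leave-one-sample-out, samples from the same session sit on both sides of the split, so the test data are not independent of the training data; the group-wise generators avoid this by construction.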
41. 2 Behavioral predictions as a test
Increase “cognitive resolution”
One voxel’s information is not enough to distinguish
many cognitive states
⇒ analysis combining info across voxels
Interpreting overlapping activations
Psychology is interested not in where a task creates activation,
but in whether two tasks create activations in the same areas
44. 2 Inference in cognitive neuroimaging
[Poldrack 2006, Henson 2006]
What is the neural support of a function?
What is the function of a given brain module?
Reverse inference
Brain mapping = task-evoked activity
+ crafting “contrasts” to isolate effects
45. 2 Inference in cognitive neuroimaging
[Kanwisher... 1997, Gauthier... 2000, Hanson and Halchenko 2008]
What is the neural support of a function?
What is the function of a given brain module?
Reverse inference
Is there a face area?
46. 2 Inference in cognitive neuroimaging
[Poldrack... 2009, Schwartz... 2013]
What is the neural support of a function?
What is the function of a given brain module?
Reverse inference
Decoding: Find regions that
predict observed cognition
48. 2 Decoding for reverse inference
[Poldrack... 2009, Schwartz... 2013]
Prediction = proxy for implication
Need large cognitive coverage
Interpretation of the “grandmother neuron”
“more than a neuron responds to one concept and [...] neurons do not necessarily respond to only one concept are given by the data itself”
[Quian Quiroga and Kreiman 2010]
49. 2 Brain decoding with linear models
Design matrix × Coefficients = Target
Coefficients are brain maps
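A small sketch of why the coefficients are brain maps: the design matrix has one column per voxel, so the fitted weight vector has one entry per voxel and folds back into the volume. Everything here is simulated, and the volume shape, `alpha`, and the ridge estimator are assumptions chosen for illustration only:

```python
import numpy as np

rng = np.random.default_rng(0)
volume_shape = (4, 5, 3)  # hypothetical tiny "brain" volume
n_samples, n_voxels = 200, int(np.prod(volume_shape))

# Design matrix: one row per brain image, one column per voxel.
X = rng.normal(size=(n_samples, n_voxels))

# Simulated ground truth (assumption): only a few voxels carry signal.
w_true = np.zeros(n_voxels)
w_true[:6] = 1.0
y = X @ w_true + 0.1 * rng.normal(size=n_samples)

# Ridge estimate of the coefficients: (X'X + alpha I)^-1 X'y
alpha = 1.0
w_hat = np.linalg.solve(X.T @ X + alpha * np.eye(n_voxels), X.T @ y)

# One coefficient per voxel, so the weights fold back into a brain map.
coef_map = w_hat.reshape(volume_shape)
print(coef_map.shape)
```

In this toy setting there are more samples than voxels, so the weights are well determined; real fMRI decoding is usually in the opposite regime.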
50. 2 Brain decoding to recover predictive regions?
Face vs house visual recognition [Haxby... 2001]
SVM, error: 26%
Sparse model, error: 19%
Ridge, error: 15%
Best predictor outlines the worst regions
Best maps predict worst
54. 2 Decoders as estimators [Gramfort... 2013]
Inverse problem
Minimize the error term:
$\hat{w} = \operatorname{argmin}_{w} \, l(y - X w)$
Ill-posed:
Many different w will give the same prediction error
Choice driven by (implicit) priors of the decoder
SVM, sparse, ridge, TV-ℓ1
Inferences rely, explicitly or implicitly,
on the regions estimated by the decoder
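The ill-posedness can be illustrated numerically. In the sketch below (simulated data, names chosen for illustration), there are fewer samples than voxels, so adding any null-space direction of X to a solution changes the map without changing the predictions on the observed data:

```python
import numpy as np

rng = np.random.default_rng(0)
n_samples, n_voxels = 20, 100  # fewer samples than voxels
X = rng.normal(size=(n_samples, n_voxels))
y = rng.normal(size=n_samples)

# Minimum-norm solution of X w = y.
w_min_norm = np.linalg.pinv(X) @ y

# The last right-singular vectors of X span its null space.
_, _, vt = np.linalg.svd(X, full_matrices=True)
null_direction = vt[-1]  # X @ null_direction is numerically zero

# A very different weight map with the same predictions on X.
w_other = w_min_norm + 5.0 * null_direction

same_predictions = np.allclose(X @ w_min_norm, X @ w_other)
different_maps = not np.allclose(w_min_norm, w_other)
print(same_predictions, different_maps)
```

The data alone cannot choose between `w_min_norm` and `w_other`; the penalty of the decoder (sparsity, ridge, TV-ℓ1, and so on) is what selects one map among the many that fit equally well.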
60. @GaelVaroquaux
Machine learning for cognitive neuroimaging
The description of cognition is hard ⇒ Encoding
  Rich models depend less on paradigms
Decoding as an omnibus test
  For rich encoding models
  To interpret overlapping activation
  Cross-validation error bars
Decoding for reverse inference
  Requires large cognitive coverage
Estimation of predictive regions is difficult
  Infinite number of maps predict as well
Software: nilearn
  In Python
  http://nilearn.github.io
[Varoquaux and Thirion 2014]
How machine learning is
shaping cognitive neuroimaging
61. References I
I. Gauthier, M. J. Tarr, J. Moylan, P. Skudlarski, J. C. Gore, and
A. W. Anderson. The fusiform “face area” is part of a network
that processes faces at the individual level. J cognitive
neuroscience, 12:495, 2000.
A. Gramfort, B. Thirion, and G. Varoquaux. Identifying predictive
regions from fMRI with TV-L1 prior. In PRNI, page 17, 2013.
U. Güçlü and M. A. van Gerven. Deep neural networks reveal a
gradient in the complexity of neural representations across the
ventral stream. The Journal of Neuroscience, 35(27):
10005–10014, 2015.
S. J. Hanson and Y. O. Halchenko. Brain reading using full brain
support vector machines for object recognition: there is no
“face” identification area. Neural Computation, 20:486, 2008.
B. Harvey, B. Klein, N. Petridou, and S. Dumoulin. Topographic
representation of numerosity in the human parietal cortex.
Science, 341(6150):1123–1126, 2013.
62. References II
J. V. Haxby, M. I. Gobbini, M. L. Furey, ... Distributed and
overlapping representations of faces and objects in ventral
temporal cortex. Science, 293:2425, 2001.
R. Henson. Forward inference using functional neuroimaging:
Dissociations versus associations. Trends in cognitive sciences,
10:64, 2006.
D. H. Hubel and T. N. Wiesel. Receptive fields, binocular
interaction and functional architecture in the cat’s visual cortex.
The Journal of physiology, 160:106, 1962.
N. Kanwisher, J. McDermott, and M. M. Chun. The fusiform face
area: a module in human extrastriate cortex specialized for face
perception. J Neuroscience, 17:4302, 1997.
K. N. Kay, T. Naselaris, R. J. Prenger, and J. L. Gallant.
Identifying natural images from human brain activity. Nature,
452:352, 2008.
63. References III
S.-M. Khaligh-Razavi and N. Kriegeskorte. Deep supervised, but
not unsupervised, models may explain IT cortical representation.
PLoS Comput Biol, 10(11):e1003915, 2014.
N. K. Logothetis, J. Pauls, and T. Poggio. Shape representation in
the inferior temporal cortex of monkeys. Current Biology, 5:552,
1995.
D. Marr. Vision: A computational investigation into the human
representation and processing of visual information. The MIT
press, Cambridge, 1982.
T. M. Mitchell, S. V. Shinkareva, A. Carlson, K.-M. Chang, V. L.
Malave, R. A. Mason, and M. A. Just. Predicting human brain
activity associated with the meanings of nouns. Science, 320:
1191, 2008.
T. Naselaris, K. N. Kay, S. Nishimoto, and J. L. Gallant. Encoding
and decoding in fMRI. Neuroimage, 56:400, 2011.
64. References IV
K. A. Norman, S. M. Polyn, G. J. Detre, and J. V. Haxby. Beyond
mind-reading: multi-voxel pattern analysis of fMRI data. Trends
in cognitive sciences, 10:424, 2006.
J. P. O’Doherty, A. Hampton, and H. Kim. Model-based fMRI and
its application to reward learning and decision making. Annals of
the New York Academy of Sciences, 1104:35, 2007.
B. Olshausen ... Emergence of simple-cell receptive field
properties by learning a sparse code for natural images. Nature,
381:607, 1996.
R. Poldrack. Can cognitive processes be inferred from
neuroimaging data? Trends in cognitive sciences, 10:59, 2006.
R. A. Poldrack, Y. O. Halchenko, and S. J. Hanson. Decoding the
large-scale structure of brain function by classifying mental
states across individuals. Psychological Science, 20:1364, 2009.
65. References V
R. Quian Quiroga and G. Kreiman. Postscript: About grandmother
cells and Jennifer Aniston neurons. Psychological Review, 117:
297, 2010.
Y. Schwartz, B. Thirion, and G. Varoquaux. Mapping cognitive
ontologies to and from the brain. In NIPS, 2013.
G. Varoquaux and B. Thirion. How machine learning is shaping
cognitive neuroimaging. GigaScience, 3:28, 2014.
D. L. Yamins, H. Hong, C. F. Cadieu, E. A. Solomon, D. Seibert,
and J. J. DiCarlo. Performance-optimized hierarchical models
predict neural responses in higher visual cortex. Proc Natl Acad
Sci, page 201403112, 2014.