If you liked it you should’ve put a
p-value on it
… or not.
Chris Gorgolewski
Max Planck Institute for Human Cognitive and Brain Sciences
SIGNAL DETECTION THEORY
Signal and noise
False positive and false negative errors
Power
Signal detection theory
Types of errors
Vocabulary
• Type I error – false positive
• Type II error – false negative
• False positive rate
• False negative rate
• Statistical power = 1 – false negative rate
• Sensitivity = Power
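The vocabulary above can be tied together with a small sketch. The counts are made-up numbers for illustration, and `error_rates` is a hypothetical helper, not something from the talk.

```python
def error_rates(true_pos, false_pos, true_neg, false_neg):
    """Return false positive rate, false negative rate and power from raw counts."""
    fpr = false_pos / (false_pos + true_neg)  # Type I errors among cases with no signal
    fnr = false_neg / (false_neg + true_pos)  # Type II errors among cases with signal
    power = 1.0 - fnr                         # statistical power, i.e. sensitivity
    return {"fpr": fpr, "fnr": fnr, "power": power}


# Hypothetical counts: 100 locations with signal, 100 without.
rates = error_rates(true_pos=80, false_pos=5, true_neg=95, false_neg=20)
print(rates)
```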
Inference = thresholding
Signal to Noise ratio
Looking in the wrong places
Lower SNR = we miss more stuff
Lower SNR = higher FDR threshold
VOXELWISE TESTS
P-maps
Multiple comparison
FWE correction: Bonferroni, permutations
FDR correction: B-H, Local FDR
Hypothesis testing
• Distinguish between two hypotheses
1. H0 – there is no difference between groups
2. H1 – there is a difference between groups
• Or…
1. H0 – there is no relation between two variables
2. H1 – there is some relation between the two
variables
From statistical values to p-values
• Various procedures give us statistical values
– T-tests (one sample, two sample, paired etc.)
– F-Tests
– Correlation tests (r values)
• What is a p value?
P value
• p(z) = the probability that, if we repeated our
experiment (with all the analyses) and there were
no effect, we would get this or a greater statistical
value.
t, z, F to p
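As a rough illustration of the conversion, here is a sketch for the z case only, using the standard normal distribution from the Python standard library. The helper name `z_to_p` is an assumption; t and F values need their own distributions and degrees of freedom, which the standard library does not provide.

```python
from statistics import NormalDist


def z_to_p(z, two_sided=True):
    """Probability of a value at least this extreme under a standard normal null."""
    # Upper-tail area beyond z (or beyond |z| for the two-sided case).
    tail = 1.0 - NormalDist().cdf(abs(z) if two_sided else z)
    return 2.0 * tail if two_sided else tail


p_196 = z_to_p(1.96)                       # two-sided, close to 0.05
p_one_sided = z_to_p(1.96, two_sided=False)
print(round(p_196, 4), round(p_one_sided, 4))
```

The two-sided value for z = 1.96 is the familiar 1-in-20 threshold; the one-sided value matches the "this or greater" wording of the definition.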
OK back to neuroimaging
• Assuming that we are doing a massive
univariate analysis (we look at each voxel
independently) we have a t-map
• Now using a theoretical distribution (given the
degrees of freedom) we can turn it into a p-
map
Inference!
• We take our p-map and discard all voxels with
values > 0.05
– “The value for which P=0.05, or 1 in 20, is 1.96 or
nearly 2; it is convenient to take this point as a
limit in judging whether a deviation ought to be
considered significant or not. Deviations
exceeding twice the standard deviation are thus
formally regarded as significant.”
• We are done – right?
Not quite done yet…
• Let me generate two vectors of values and test
using a t-test if they are different
• What is the probability that p(t) < 0.05?
– Well… 0.05
• Let me generate another set of values… and
another… 100 pairs of vectors
• What is the probability that at least one of the
tests gives p < 0.05?
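The question above can be answered with a short calculation. This is a sketch that assumes the 100 tests are independent and that there is no true effect, so every p-value is uniform on [0, 1]; the function names are hypothetical.

```python
import random


def familywise_error_rate(alpha, n_tests):
    """Chance of at least one p < alpha among n independent tests with no effect."""
    return 1.0 - (1.0 - alpha) ** n_tests


def simulate_fwer(alpha, n_tests, n_experiments=2000, seed=0):
    """Monte Carlo check: draw null p-values and count experiments with any hit."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_experiments):
        # Under H0 a p-value is uniformly distributed on [0, 1].
        if any(rng.random() < alpha for _ in range(n_tests)):
            hits += 1
    return hits / n_experiments


analytic = familywise_error_rate(0.05, 100)
simulated = simulate_fwer(0.05, 100)
print(round(analytic, 4), round(simulated, 4))
```

With 100 tests at a threshold of 0.05, at least one false positive is almost guaranteed.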
The Salmon of Doubt
Correcting for multiple comparisons
• Bonferroni correction (based on Boole’s
inequality)
– Divide your p-threshold by the number of tests
you have performed
– Or multiply your p-values by the number of tests
you have performed
Bonferroni is a Family Wise Error
correction
It guarantees that the chance of getting at least
one false positive across all the tests is no greater
than your p-threshold
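Both forms of the correction can be sketched in a few lines. The p-values are made-up numbers and `bonferroni` is a hypothetical helper name.

```python
def bonferroni(p_values, alpha=0.05):
    """Return Bonferroni-adjusted p-values and reject/keep decisions."""
    m = len(p_values)
    # Form 1: multiply every p-value by the number of tests (capped at 1).
    adjusted = [min(1.0, p * m) for p in p_values]
    # Form 2: compare every raw p-value with the divided threshold.
    rejected = [p <= alpha / m for p in p_values]
    return adjusted, rejected


adjusted, rejected = bonferroni([0.001, 0.02, 0.04, 0.30])
print(adjusted, rejected)
```

Only the first test survives here: three of the four raw p-values are below 0.05, but only 0.001 is below 0.05 / 4.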
Permutation based FWE correction
• The assumptions behind the theoretical
distributions are often not met
• There are many dependencies between voxels
– The tests are not independent, so Bonferroni
correction can be overly conservative
• We can however establish an empirical
distribution
Permutation based FWE correction
1. Break the relation: shuffle the participants
between the groups
2. Perform the test
3. Save the maximum statistical value across
voxels
4. Repeat
Permutation based FWE correction
Our FWE corrected p value is the percentage of
permutations whose maximum statistical value was
higher than the original (unshuffled) one
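The shuffle, test, save-the-maximum loop can be sketched on toy data. Several things here are assumptions made for illustration: the statistic is an absolute difference of group means standing in for a t-test, the data are simulated noise with one planted effect, ties count towards the p-value, and all function names are hypothetical.

```python
import random


def mean_diff_map(group_a, group_b):
    """Absolute difference of group means at every voxel (stand-in for a t-map)."""
    n_voxels = len(group_a[0])
    return [
        abs(sum(s[v] for s in group_a) / len(group_a)
            - sum(s[v] for s in group_b) / len(group_b))
        for v in range(n_voxels)
    ]


def max_stat_permutation(group_a, group_b, n_perm=500, seed=0):
    """FWE corrected p-value per voxel from the permutation maximum distribution."""
    rng = random.Random(seed)
    observed = mean_diff_map(group_a, group_b)
    pooled, n_a = group_a + group_b, len(group_a)
    max_null = []
    for _ in range(n_perm):
        shuffled = pooled[:]
        rng.shuffle(shuffled)                                     # 1. break the relation
        perm_map = mean_diff_map(shuffled[:n_a], shuffled[n_a:])  # 2. perform the test
        max_null.append(max(perm_map))                            # 3. save the maximum
    # Share of permutations whose maximum reaches the observed value.
    return [sum(m >= obs for m in max_null) / n_perm for obs in observed]


# Toy data: 10 subjects per group, 20 voxels of noise, effect planted at voxel 0.
rng = random.Random(1)
group_a = [[rng.gauss(0, 1) for _ in range(20)] for _ in range(10)]
group_b = [[rng.gauss(0, 1) for _ in range(20)] for _ in range(10)]
for subject in group_a:
    subject[0] += 4.0
p_fwe = max_stat_permutation(group_a, group_b)
print(p_fwe[0])
```

Because every voxel is compared against the distribution of the maximum across the whole map, the dependencies between voxels are handled without any assumption about their form.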
False Discovery Rate
• Even conceptually FWE correction seems
conservative
– At least one test out of 60 000?
• Is there a more intuitive way of looking at
this?
False Discovery Rate
I present a number of voxels that I think show a
strong effect, but I admit that a certain
percentage of them might be false positives.
False Discovery Rate
Percentage of false positive voxels among all
significant voxels.
FDR procedures
• Benjamini-Hochberg procedure
– With its variant for dependent variables
• Efron’s local FDR procedure
– Explicit modeling of the signal distribution
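A minimal sketch of the Benjamini-Hochberg step-up rule, assuming independent (or positively dependent) tests. The p-values are made-up numbers and the function name is hypothetical; local FDR is not covered here.

```python
def benjamini_hochberg(p_values, q=0.05):
    """Return a reject/keep flag per test, controlling the FDR at level q."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k such that p_(k) <= (k / m) * q.
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * q:
            cutoff_rank = rank
    # Reject every test ranked at or below the cutoff.
    rejected = [False] * m
    for idx in order[:cutoff_rank]:
        rejected[idx] = True
    return rejected


p_values = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
significant = benjamini_hochberg(p_values, q=0.05)
print(significant)
```

For these ten values the procedure keeps the two smallest p-values, whereas a Bonferroni threshold of 0.05 / 10 would keep only the first.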
Interim Summary
• FWE corrections
– Bonferroni – simple but struggles with
dependencies (overly conservative)
– Permutations – less dependent on assumptions,
but time consuming
• FDR corrections
– B-H – simple but also struggles with dependencies
– Local FDR – data driven, but can fail in case of low
SNR
CLUSTER EXTENT TESTS
Test how big the blobs are
Random field theory
Smoothness estimation
Permutation test
The problem of cluster forming threshold
Fun fact: FWE with RFT
Intuition
If we are interested in continuous regions of
activation, why are we looking at voxels and not
blobs?
Aww, patterns!
No wait… it’s just smooth noise…
What contributes to expected cluster
size?
How likely is it to get a cluster of this size from pure
noise?
It depends… on:
1. cluster forming threshold
2. smoothness of the map
3. size of the map
Where do we get those parameters?
1. cluster forming threshold
– Arbitrary decision
2. smoothness of the map
– Estimated from the residuals of the GLM
3. size of the map
– Calculated from the mask
Permutation based cluster extent
probability
1. Break the relation: shuffle the participants
between the groups
2. Perform the test
3. Threshold the map to get clusters
4. Save the sizes of all clusters
5. Repeat
Permutation based cluster extent
probability
Our cluster extent p value is the percentage of
permutations that yielded cluster sizes bigger
than the original (unshuffled one)
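The cluster extent procedure can be sketched on a one-dimensional toy map. This is a simplified illustration under stated assumptions: the statistic is a difference of group means rather than a t-test, clusters are runs of neighbouring supra-threshold values, and only the largest cluster of each permutation is kept (one common way to obtain an FWE corrected cluster p-value). All names and data are hypothetical.

```python
import random


def cluster_sizes(stat_map, threshold):
    """Sizes of runs of neighbouring supra-threshold values in a 1-D map."""
    sizes, run = [], 0
    for value in stat_map:
        if value > threshold:
            run += 1
        elif run:
            sizes.append(run)
            run = 0
    if run:
        sizes.append(run)
    return sizes


def mean_diff_map(group_a, group_b):
    """Difference of group means at every position (stand-in for a t-map)."""
    n = len(group_a[0])
    return [sum(s[v] for s in group_a) / len(group_a)
            - sum(s[v] for s in group_b) / len(group_b) for v in range(n)]


def cluster_extent_p(group_a, group_b, threshold, n_perm=500, seed=0):
    """Return (size, p) for each observed cluster."""
    rng = random.Random(seed)
    observed = cluster_sizes(mean_diff_map(group_a, group_b), threshold)
    pooled, n_a = group_a + group_b, len(group_a)
    null_max = []
    for _ in range(n_perm):
        shuffled = pooled[:]
        rng.shuffle(shuffled)                                     # break the relation
        perm_map = mean_diff_map(shuffled[:n_a], shuffled[n_a:])  # perform the test
        sizes = cluster_sizes(perm_map, threshold)                # threshold the map
        null_max.append(max(sizes, default=0))                    # keep largest cluster
    # Share of permutations with a cluster at least as large as the observed one.
    return [(size, sum(m >= size for m in null_max) / n_perm) for size in observed]


# Toy data: 10 subjects per group, 30 positions, effect planted at positions 10 to 17.
rng = random.Random(2)
group_a = [[rng.gauss(0, 1) for _ in range(30)] for _ in range(10)]
group_b = [[rng.gauss(0, 1) for _ in range(30)] for _ in range(10)]
for subject in group_a:
    for v in range(10, 18):
        subject[v] += 3.0
results = cluster_extent_p(group_a, group_b, threshold=1.0)
print(results)
```

Changing `threshold` changes both the observed clusters and the null distribution, which is the cluster forming threshold problem in miniature.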
Cluster forming threshold conundrum
HONORABLE MENTIONS
TFCE
Mixture models
Threshold Free Cluster Enhancement
Spatially Regularized Mixture Models
IMPLEMENTATIONS
SPM
FSL
AFNI
SPM
• RFT based voxelwise FWE correction
• Smoothness estimation
• Cluster extent p-values
• Peak height p-values
• Permutation tests through SnPM toolbox
FSL
• RFT based voxelwise FWE correction
• Smoothness estimation
• Cluster extent p-values
• FDR
• Permutation tests through randomise
– Including TFCE
AFNI
• Cluster extent p-values (3dClustSim)
– Simulations are not permutations
• Smoothness estimation (3dFWHMx)
Interim summary
Clusterwise methods allow us to find surprising
patterns in terms of spatially consistent clusters
instead of individual voxels.
LIMITATIONS OF P-VALUES
P-VALUES ARE MEANINGLESS
FORGET ALL I SAID SO FAR
WE ARE ALL DOOMED
P-value paradox
• There are no two entities or groups that are
truly identical
• There are no two variables that are completely
unrelated
• We just fail to obtain enough samples to see it
– Or our tools are not sensitive enough
More samples more “significance”
• The more subjects you have in your study,
the more likely it is that you will find
something significant
• The same applies to scan length, and field
strength
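A small calculation makes the point concrete. This sketch assumes a one-sample z-test with unit variance and a sample mean that lands exactly on a tiny, hypothetical true effect; the function name is also an assumption.

```python
from math import sqrt
from statistics import NormalDist


def expected_p(effect_size, n_subjects):
    """Two-sided p of a one-sample z-test when the sample mean equals the true effect."""
    z = effect_size * sqrt(n_subjects)  # the test statistic grows with sqrt(n)
    return 2.0 * (1.0 - NormalDist().cdf(z))


tiny_effect = 0.05  # hypothetical standardized effect, far too small to matter
p_by_n = {n: expected_p(tiny_effect, n) for n in (20, 200, 2000, 20000)}
for n, p in p_by_n.items():
    print(n, p)
```

The effect size never changes, yet the p-value drops below 0.05 once the sample is large enough.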
H0 is never true
we just fail to show that
P-value failure
• P-values do not tell us much about actual size
of the effect
• Nor do they tell us about the predictive power of
the relation found
The interesting question
Is PCC involved in autism?
vs.
Given the cortical thickness of a subject’s PCC, how
well am I able to predict his or her diagnosis?
Why does this matter
• More subjects, longer scans, stronger scanners –
everything is significant
– We are getting there
• Lack of faith in science from the public
– Poor reproducibility
What needs to be done
We need more replications
We need to start reporting null results
What you can do
• Report effect sizes and their confidence
intervals
– For all tests/voxels – not just the significant ones
• Share the unthresholded statistical maps
– It only takes 5 minutes on neurovault.org
• Report all the tests you have performed – not
just the significant ones
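Reporting an effect size with a confidence interval could look like the sketch below. It assumes two independent groups, uses Cohen's d as the standardized effect, and builds the interval from a normal approximation (small samples would call for a t quantile instead). The data and the function name are made up for illustration.

```python
from math import sqrt
from statistics import NormalDist, mean, variance


def effect_size_report(group_a, group_b, confidence=0.95):
    """Mean difference, Cohen's d and an approximate confidence interval."""
    diff = mean(group_a) - mean(group_b)
    n_a, n_b = len(group_a), len(group_b)
    # Pooled standard deviation from the two sample variances.
    pooled_sd = sqrt(((n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b))
                     / (n_a + n_b - 2))
    cohens_d = diff / pooled_sd
    se = pooled_sd * sqrt(1 / n_a + 1 / n_b)        # standard error of the difference
    z = NormalDist().inv_cdf(0.5 + confidence / 2)  # about 1.96 for 95 %
    return {"difference": diff, "cohens_d": cohens_d,
            "ci": (diff - z * se, diff + z * se)}


# Hypothetical measurements for two groups of five subjects.
report = effect_size_report([2.1, 2.5, 1.9, 2.8, 2.4], [1.8, 2.0, 1.7, 2.2, 1.9])
print(report)
```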
http://dx.doi.org/10.1016/j.neuron.2012.05.001
If you liked it you should’ve
convinced a skeptical researcher
to try to replicate your results.
