Using Principal Component Analysis to Remove Correlated Signal from Astronomical Images
Kim Scott
National Radio Astronomy Observatory
Data Science Meet-up
February 18, 2014
Galaxy Evolution in One Slide...
Galaxy Surveys – What Are We Missing?

• Optical surveys miss ~50% of star formation in galaxies
• Optical surveys are biased
• Dust re-emits stellar radiation at infrared to millimeter wavelengths (λ ~ 20–2000 μm)
Galaxy Surveys at (Sub)mm Wavelengths
• Atmospheric emission is 1000× stronger than the signal from galaxies
[Figure: extragalactic emission, transmitted versus absorbed]
Removing the Atmosphere by
Modulating the Signal in Time
[Figure: detector array and galaxy at time samples i = 1, 2, 3]

• $x_{ij}$: power measured for time sample $i$ on detector $j$
Surveys at λ=1.1mm with AzTEC
[Figure panels: ASTE Telescope; AzTEC Dewar; AzTEC Array (117 detectors)]
Raw Time-stream Data

• Sample rate = 1∕(15.625 ms) = 64 Hz
• 20 s = 1280 samples
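The sample-rate arithmetic on this slide can be checked in a couple of lines. This is only a worked check of the numbers quoted above; the variable names are illustrative.

```python
# Sampling interval quoted on the slide, in seconds.
sample_interval_s = 15.625e-3

# Sample rate is the reciprocal of the interval.
sample_rate_hz = 1.0 / sample_interval_s

# Number of samples each detector records in a 20 s stretch of data.
samples_in_20_s = round(20 * sample_rate_hz)

print(sample_rate_hz, samples_in_20_s)
```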
Principal Component Analysis (PCA)

[Often used ahead of supervised learning to compress data: fit to a smaller number of features]
• $x_{ij}$: power measured for time sample $i$ on detector $j$
• $n$ = number of detectors; $m$ = number of time samples
• $X = [\,x_1\ x_2\ \dots\ x_m\,]$ → $n \times m$ matrix (column $x_i$ holds the $n$ detector readings at time sample $i$)

*Only input needed for PCA*
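A minimal sketch of what such a data matrix looks like, using simulated data rather than real AzTEC time streams. The signal model (one drifting "atmosphere" seen by every detector with a slightly different gain, plus independent noise) and all variable names are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 117, 1280          # detectors, time samples (20 s at 64 Hz)

# Hypothetical signal model, not taken from the talk:
# a slowly drifting common mode plus independent detector noise.
atmosphere = np.cumsum(rng.normal(size=m))       # random walk, shape (m,)
gains = 1.0 + 0.1 * rng.normal(size=n)           # per-detector gain, shape (n,)
noise = 0.05 * rng.normal(size=(n, m))

# X[j, i] is x_ij: time sample i on detector j.
# Column i is the array snapshot x_i; row j is detector j's time stream.
X = gains[:, None] * atmosphere[None, :] + noise

# Detector-to-detector correlation matrix (n x n). With a dominant
# common mode, every pair of detectors is strongly correlated.
corr = np.corrcoef(X)

print(X.shape, corr.shape, corr.min())
```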
Step 1: Mean normalization (and feature scaling)
• Compute $\mu_j = \frac{1}{m} \sum_{i=1}^{m} x_{ij}$ for each detector
• Compute $\sigma_j^2 = \frac{1}{m-1} \sum_{i=1}^{m} (x_{ij} - \mu_j)^2$ for each detector
• Set $x_{ij} \leftarrow (x_{ij} - \mu_j) / \sigma_j$
• $X = [\,x_1\ x_2\ \dots\ x_m\,]$ → $n \times m$ matrix

[Figure: detector time streams, scale bar 1 mV]

*PCA can identify lower-level correlations among subsets of the detectors*
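Step 1 could be sketched in NumPy as follows. The function name `standardize` and the toy input are hypothetical; the only thing taken from the slide is the formula, including the $1/(m-1)$ variance normalization.

```python
import numpy as np

def standardize(X):
    """Subtract each detector's mean and divide by its standard deviation.

    X has shape (n detectors, m time samples), so statistics are taken
    along axis 1 (over time) separately for each detector.
    """
    mu = X.mean(axis=1, keepdims=True)
    # ddof=1 selects the 1/(m-1) normalization used on the slide.
    sigma = X.std(axis=1, ddof=1, keepdims=True)
    return (X - mu) / sigma

# Toy input: 4 detectors with an offset of 3 and a spread of 2.
rng = np.random.default_rng(1)
X = 3.0 + 2.0 * rng.normal(size=(4, 1000))
Xs = standardize(X)

print(Xs.mean(axis=1))
print(Xs.std(axis=1, ddof=1))
```

After this step every detector row has zero mean and unit variance, so no single detector dominates the covariance matrix simply because of its gain.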
Step 2: Calculate covariance matrix
• $C = \frac{1}{m} X X^{T}$ (recall $m$ = number of time samples)
• $C$ → $n \times n$ symmetric matrix (recall $n$ = 117 detectors)
Step 3: Eigendecomposition
• $C = Q \Lambda Q^{-1}$ (*solve using SVD*); since $C$ is symmetric, $Q$ can be chosen orthogonal, so $Q^{-1} = Q^{T}$
• $Q = [\,q_1\ q_2\ \dots\ q_n\,]$ → $n \times n$ matrix containing eigenvectors $q_i$
• $\Lambda$ → $n \times n$ diagonal matrix containing eigenvalues $\lambda_i = \Lambda_{ii}$
• Principal components = uncorrelated variables
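Steps 2 and 3 might look like this in NumPy. This is a sketch on simulated data, not the speaker's code: it uses `numpy.linalg.eigh` for the symmetric eigenproblem, and separately shows the SVD route mentioned on the slide, where the singular values $s_i$ of $X$ give $\lambda_i = s_i^2 / m$. Sorting the eigenvalues in decreasing order is an assumption about the convention used on the slides.

```python
import numpy as np

# Simulated, standardized data: 6 detectors sharing one common mode.
rng = np.random.default_rng(2)
n, m = 6, 500
common = np.cumsum(rng.normal(size=m))
X = np.outer(np.ones(n), common) + rng.normal(size=(n, m))
X = X - X.mean(axis=1, keepdims=True)
X = X / X.std(axis=1, ddof=1, keepdims=True)

# Step 2: covariance matrix, n x n and symmetric.
C = (X @ X.T) / m

# Step 3: eigendecomposition. eigh returns eigenvalues in ascending
# order, so reorder to put the largest-variance component first.
eigvals, Q = np.linalg.eigh(C)
order = np.argsort(eigvals)[::-1]
eigvals, Q = eigvals[order], Q[:, order]

# Same eigenvalues via the SVD of X (singular values are already sorted).
U, s, Vt = np.linalg.svd(X, full_matrices=False)
eigvals_svd = s**2 / m

# Projecting onto the eigenvectors gives uncorrelated variables:
# their covariance matrix is diagonal.
P = Q.T @ X
cov_P = (P @ P.T) / m

print(eigvals)
```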
Step 4: Choose number of components to remove
• Goal: choose the smallest number of components ($k$) that REMOVES most of the observed variance in the data
• $Q_R = [\,q_{k+1}\ q_{k+2}\ \dots\ q_n\,]$ → $n \times (n-k)$ matrix, $k < n$
• $Z = [\,z_1\ z_2\ \dots\ z_m\,] = Q_R^{T} X$ → $(n-k) \times m$ matrix
• To derive model of galaxy intensities on sky, use $Z$ instead of $X$ (but...)
Choosing $k$:
$\dfrac{\text{Variance after PCA (given } k\text{)}}{\text{Variance with average subtraction only}} < 0.05$
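One possible way to apply the variance-ratio criterion is sketched below on simulated data. Several things here are assumptions, not taken from the talk: "average subtraction" is interpreted as subtracting the array-wide mean at each time sample, the eigenvectors are ordered by decreasing eigenvalue, and the test data contain a second mode seen by only half of the detectors. The function names `clean` and `choose_k` are hypothetical.

```python
import numpy as np

def clean(X, Q, k):
    """Drop the k leading eigenvectors (columns of Q, sorted by decreasing
    eigenvalue) and return the reconstructed n x m data."""
    Q_R = Q[:, k:]                    # n x (n - k)
    return Q_R @ (Q_R.T @ X)

def choose_k(X, threshold=0.05):
    """Smallest k whose cleaned variance is below threshold times the
    variance left by average subtraction alone (assumed definition)."""
    n, m = X.shape
    eigvals, Q = np.linalg.eigh(X @ X.T / m)
    Q = Q[:, np.argsort(eigvals)[::-1]]
    # Baseline: subtract the mean over detectors at every time sample.
    baseline_var = np.var(X - X.mean(axis=0, keepdims=True))
    for k in range(1, n):
        ratio = np.var(clean(X, Q, k)) / baseline_var
        if ratio < threshold:
            return k, ratio
    return None, None

# Simulated data: a common atmosphere, plus a mode shared by a subset
# of detectors that average subtraction cannot remove.
rng = np.random.default_rng(3)
n, m = 20, 500
t = np.arange(m)
atmosphere = np.cumsum(rng.normal(size=m))
subset_mode = 5.0 * np.sin(2 * np.pi * t / 50.0)
in_subset = (np.arange(n) < n // 2).astype(float)
X = (np.outer(1.0 + 0.1 * rng.normal(size=n), atmosphere)
     + np.outer(in_subset, subset_mode)
     + 0.05 * rng.normal(size=(n, m)))
X = X - X.mean(axis=1, keepdims=True)
X = X / X.std(axis=1, ddof=1, keepdims=True)

k, ratio = choose_k(X)
print(k, ratio)
```

The loop stops at the first $k$ that meets the threshold, which matches the stated goal of removing as few components as possible.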
Step 5: Reconstruct data without correlated signal
• Know RA/Dec for each detector: need to reconstruct approximation for data to make image
• $X_R = Q_R Z$ → $n \times m$ matrix with correlated signal removed!

[Figure: detector time streams, shown first with a 1 mV scale bar and then with a 20 μV scale bar]

*Variance reduced by factor of 50*
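The reconstruction in Step 5 can be sketched as below, again on simulated data with illustrative names. It also checks an equivalent view of the same operation: keeping components $k+1 \dots n$ gives the same result as subtracting the projection onto the first $k$ components, $X_R = X - Q_k Q_k^{T} X$, where $Q_k$ (a name introduced here) holds the removed eigenvectors.

```python
import numpy as np

# Simulated data: one common mode with varying gains, plus noise.
rng = np.random.default_rng(4)
n, m, k = 8, 400, 1
atmosphere = np.cumsum(rng.normal(size=m))
X = (np.outer(1.0 + 0.1 * rng.normal(size=n), atmosphere)
     + 0.05 * rng.normal(size=(n, m)))
X = X - X.mean(axis=1, keepdims=True)

# Eigenvectors of the covariance matrix, largest eigenvalue first.
eigvals, Q = np.linalg.eigh(X @ X.T / m)
Q = Q[:, np.argsort(eigvals)[::-1]]

Q_R = Q[:, k:]        # retained eigenvectors, n x (n - k)
Z = Q_R.T @ X         # data in the retained basis, (n - k) x m
X_R = Q_R @ Z         # back in detector space, n x m

# Equivalent form: subtract the projection onto the removed components.
Q_k = Q[:, :k]
X_alt = X - Q_k @ (Q_k.T @ X)

print(X_R.shape, np.var(X) / np.var(X_R))
```

Working in detector space matters because the sky position is known per detector, not per principal component.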
Image of PKS J1127-1857
Make the map:
• Use information on sky position for each detector at each time sample ($\mathrm{RA}_{ij}$, $\mathrm{Dec}_{ij}$) and bin data onto image grid
• Set the intensity of each image pixel to the average of the $X_R$ values that fall into that bin
• Smooth image by telescope point-spread response function (Gaussian with FWHM = 30″)

[Figure panels: Average Subtraction; PCA Cleaned]

• raw data = 30 MB
• $t_{\mathrm{tot}}$ = 4 min
• 16640 samples/detector
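The map-making recipe above could be sketched like this. Everything specific is an assumption for illustration: the 6″ pixel size, the flat offset coordinates in arcseconds, the simulated point source, the zero-filling of empty pixels, and the function names. The Gaussian width follows from the standard relation $\sigma = \mathrm{FWHM} / (2\sqrt{2 \ln 2})$.

```python
import numpy as np

def make_map(ra, dec, values, bins, extent):
    """Average `values` in each (ra, dec) pixel; empty pixels become NaN.
    Axis 0 of the result is RA, axis 1 is Dec."""
    total, _, _ = np.histogram2d(ra, dec, bins=bins, range=extent, weights=values)
    hits, _, _ = np.histogram2d(ra, dec, bins=bins, range=extent)
    with np.errstate(invalid="ignore", divide="ignore"):
        return total / hits

def gaussian_kernel(fwhm_pix):
    """Normalized 1-D Gaussian with the given FWHM in pixels."""
    sigma = fwhm_pix / (2.0 * np.sqrt(2.0 * np.log(2.0)))
    half = int(np.ceil(4 * sigma))
    x = np.arange(-half, half + 1)
    kernel = np.exp(-0.5 * (x / sigma) ** 2)
    return kernel / kernel.sum()

def smooth(image, fwhm_pix):
    """Separable Gaussian smoothing; empty pixels are treated as zero,
    which is a simplification."""
    kernel = gaussian_kernel(fwhm_pix)
    filled = np.nan_to_num(image)
    rows = np.apply_along_axis(np.convolve, 1, filled, kernel, mode="same")
    return np.apply_along_axis(np.convolve, 0, rows, kernel, mode="same")

# Simulated cleaned samples: a compact source at the field centre plus noise.
rng = np.random.default_rng(5)
n_samples = 20000
ra = rng.uniform(0.0, 240.0, n_samples)     # arcsec offsets (hypothetical)
dec = rng.uniform(0.0, 240.0, n_samples)
r2 = (ra - 120.0) ** 2 + (dec - 120.0) ** 2
values = np.exp(-0.5 * r2 / 5.0**2) + 0.1 * rng.normal(size=n_samples)

# 40 x 40 grid of 6 arcsec pixels, then smooth with FWHM = 30 arcsec.
image = make_map(ra, dec, values, bins=40, extent=[[0.0, 240.0], [0.0, 240.0]])
smoothed = smooth(image, fwhm_pix=30.0 / 6.0)

print(np.unravel_index(np.argmax(smoothed), smoothed.shape))
```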
An Extragalactic Survey at λ=1.1 mm
• Most galaxies are 100× fainter than PKS J1127-1857
• raw data ~ 25 GB
• $t_{\mathrm{tot}}$ ~ 80 hrs
• ~ $2 \times 10^{7}$ samples/detector
• AzTEC/COSMOS survey
• 0.7 deg$^2$
• 500× area of HUDF
• 160 hrs versus 270 hrs (about 11 days) for HUDF
• 130 mm-bright galaxies

Aretxaga et al. 2011
• AzTEC-3
• Observed 1 Gyr after Big Bang
• Starburst galaxy (SFR ~ 1000 $M_\odot$/yr)

Capak et al. 2011
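The per-detector sample counts quoted for the two observations are consistent with the 64 Hz sample rate, as this short check shows. It uses only numbers from the slides; the variable names are illustrative.

```python
SAMPLE_RATE_HZ = 64.0            # 1 / (15.625 ms)

# PKS J1127-1857 observation: 16640 samples per detector.
pks_samples = 16640
pks_minutes = pks_samples / SAMPLE_RATE_HZ / 60.0

# Survey data set: about 80 hours of integration.
survey_samples = 80 * 3600 * SAMPLE_RATE_HZ

print(pks_minutes, survey_samples)
```

The first value comes out a little over 4 minutes and the second is close to $2 \times 10^{7}$, matching the figures on the slides.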
