Natural Image Statistics

Siwei Lyu
SUNY Albany

CIFAR NCAP Summer School 2009, August 7, 2009

Thanks to Eero Simoncelli for sharing some of the slides.
big numbers
• seconds since big bang: ~10^17
• atoms in the universe: ~10^80
• 65×65 8-bit gray-scale images: ~10^10000
... ...

"The distribution of natural images is complicated. Perhaps it is something like beer foam, which is mostly empty but contains a thin mesh-work of fluid which fills the space and occupies almost no volume. The fluid region represents those images which are natural in character."
[Ruderman 1996]
natural image statistics
• natural images are rare in image space
• they are distinguished by nonrandom structures
• the common statistical properties of natural images are the focus of natural image statistics
why are we interested in natural image statistics?

[diagram: the visual pathway, Retina → LGN → Visual cortex]
[figure: (a) = (b) + noise (c)?]
computer vision applications
• image restoration
  - denoising, deblurring, demosaicing, super-resolution, and inpainting
• image compression
• texture synthesis
• image segmentation
• features for object detection and classification (SIFT, gist, "primal sketch", saliency, etc.)
• many others
[diagram: natural image statistics at the intersection of perception, neuroscience, biology, math, optimization, machine learning, statistics, computer science, signal processing, image processing, and computer vision]
scope of this tutorial
• important developments following a general theme
• focusing on concepts
  - light on math and specific applications
• gray-scale intensity images only; not covered:
  - color
  - time (video)
  - multi-image information (stereo)
main components

[diagram: representation]
representation

[diagram: Transform → Representation]
why representation matters
• example (from David Marr)
• representations for numbers
  • Arabic: 123
  • Roman: CXXIII
  • binary: 1111011
  • English: one hundred and twenty-three
why representation matters
• example (from David Marr)
• representations for numbers
  • Arabic: 123 × 10
  • Roman: CXXIII × X
  • binary: 1111011 × 1010
  • English: one hundred and twenty-three × ten
why representation matters
• example (from David Marr)
• representations for numbers
  • Arabic: 123 × 4
  • Roman: CXXIII × IV
  • binary: 1111011 × 100
  • English: one hundred and twenty-three × four
linear representations
                   • pixel



                   • Fourier



                   • wavelet
                     - localized
                     - oriented

main components

[diagram: observations; representation]
image data
• calibrated, linearized responses
• relatively large number of images

[van Hateren & van der Schaaf, 1998]
observations
• second-order pixel correlations
• 1/f power law of the energy in the frequency domain
• importance of phases
• heavy-tailed non-Gaussian marginals in the wavelet domain
• near-elliptical shape of joint densities in the wavelet domain
• decay of dependency in the wavelet domain
• …
main components

[diagram: observations; representation; model]
models
• physical imaging process (e.g., occlusion)
• nonlinear manifold of natural images
• non-parametric implicit model based on a large set of images
• matching statistics of natural images with density models ← our focus

[figure: natural images occupy a tiny fraction of the space of all possible images]
main components

[diagram: observations; representation; model; applications; Bayesian framework]
Bayesian framework

image (x) → corruption → obs. (y) → ?? → est. (x')

    min_{x'} ∫ L(x, x'(y)) p(x|y) dx  ∝  min_{x'} ∫ L(x, x'(y)) p(y|x) p(x) dx

• x'(y): estimator
• L(x, x'(y)): loss functional
• p(x): prior model for natural images
• p(y|x): likelihood, from the corruption process
application: Bayesian denoising
• additive Gaussian noise
    y = x + w
    p(y|x) ∝ exp[−(y − x)^2 / (2σ_w^2)]
• maximum a posteriori (MAP)
    x_MAP = argmax_x p(x|y) = argmax_x p(y|x) p(x)
• minimum mean squared error (MMSE)
    x_MMSE = argmin_{x'} ∫ ||x − x'||^2 p(x|y) dx
           = ∫ x p(y|x) p(x) dx / ∫ p(y|x) p(x) dx = E(x|y)
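As a sanity check on these definitions, here is a minimal numerical sketch (not from the talk) in Python/numpy contrasting the MAP and MMSE estimators for a scalar observation; the Laplacian prior and all parameter values are illustrative assumptions.

```python
import numpy as np

sigma_w, b, y = 1.0, 1.0, 1.5        # noise std, prior scale, observed value (illustrative)
x = np.linspace(-10, 10, 20001)      # dense grid over the clean value x

# log posterior up to a constant: log p(y|x) + log p(x), Laplacian prior p(x) ~ exp(-|x|/b)
log_post = -(y - x) ** 2 / (2 * sigma_w**2) - np.abs(x) / b
post = np.exp(log_post - log_post.max())
post /= post.sum()                   # normalize on the grid

x_map = x[np.argmax(post)]           # posterior mode
x_mmse = np.sum(x * post)            # posterior mean, E(x|y)
print(f"x_MAP = {x_map:.3f}, x_MMSE = {x_mmse:.3f}")  # the two differ for non-Gaussian priors
```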
main components

[diagram: observations; representation; model; applications; efficient coding]
representation

[diagram: Transform → Representation]

• unsupervised learning
  - specify desired properties of the transform outputs
• what are such properties?
what makes a good representation?
• intuitively, the transformed signal should be "simpler"
  - reduced dimensionality

[illustration: rotating a three-dimensional data cloud to an appropriate viewpoint reveals its underlying structure, so that most of the variability in the data is represented in the first few dimensions]
what makes a good representation?
• intuitively, the transformed signal should be "simpler"
  - reduced dependency

    x → r,  where r has less dependency than x

  - optimum: the components of r are independent, p(r) = ∏_{i=1}^d p(r_i)
• reducing dependency is a general approach to relieving the curse of dimensionality
• are there dependencies in natural images?
redundancy in natural images
• structure = predictability = redundancy

[Kersten, 1987]
measure of statistical dependency

multi-information (MI):

    I(x) = D_KL( p(x) || ∏_k p(x_k) )
         = ∫ p(x) log [ p(x) / ∏_k p(x_k) ] dx
         = ∑_{k=1}^d H(x_k) − H(x)

[Studeny and Vejnarova, 1998]
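For two variables, the multi-information reduces to the ordinary mutual information and can be estimated from histograms as ∑_k H(x_k) − H(x). A small sketch, assuming numpy, with synthetic correlated Gaussians standing in for nearby pixel values:

```python
import numpy as np

def entropy(counts):
    # plug-in entropy estimate (bits) from histogram counts
    p = counts[counts > 0] / counts.sum()
    return -np.sum(p * np.log2(p))

rng = np.random.default_rng(0)
z = rng.standard_normal((100000, 2))
x1 = z[:, 0]                                     # toy stand-in for pixel I(x, y)
x2 = 0.9 * z[:, 0] + np.sqrt(1 - 0.81) * z[:, 1] # correlated stand-in for I(x+1, y)

edges = np.linspace(-4, 4, 65)
h12, _, _ = np.histogram2d(x1, x2, bins=[edges, edges])
h1, _ = np.histogram(x1, bins=edges)
h2, _ = np.histogram(x2, bins=edges)
mi = entropy(h1) + entropy(h2) - entropy(h12.ravel())
print(f"I(x1; x2) ~ {mi:.3f} bits")              # > 0 for dependent variables
```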
efficient coding
[Attneave '54; Barlow '61; Laughlin '81; Atick '90; Bialek et al. '91]

    x → r,    I(r, x) = H(r) − H(r|x)

• maximize the mutual information between stimulus and response, subject to constraints (e.g., metabolic)
• noiseless case ⇒ redundancy reduction:
    H(r|x) = 0 ⇒ I(r, x) = H(r) = ∑_{i=1}^d H(r_i) − I(r)
  - independent components
  - efficient (maxEnt) marginals
main components

[diagram: observations; representation; model; applications]
closed loop

[diagram: observations, representation, model, and applications form a closed loop]
pixel domain




[figure: scatter plots of neighboring pixel intensities, I(x,y) vs I(x+1,y), I(x+2,y), and I(x+4,y), together with the correlation as a function of spatial separation (pixels), decaying from 1 toward 0 over separations of 10 to 40 pixels]
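A sketch of how such a correlation-vs-separation curve can be computed; the surrogate image (cumulative sums of white noise) is only a stand-in for a real natural image, which one would load instead:

```python
import numpy as np

def pixel_correlation(img, max_sep=40):
    """Correlation of I(x, y) with I(x+d, y) as a function of separation d."""
    img = img - img.mean()
    out = []
    for d in range(1, max_sep + 1):
        a, b = img[:, :-d].ravel(), img[:, d:].ravel()
        out.append(np.corrcoef(a, b)[0, 1])
    return np.array(out)

rng = np.random.default_rng(1)
img = rng.standard_normal((256, 256))
img = np.cumsum(np.cumsum(img, 0), 1)   # crude long-range-correlated surrogate image
print(pixel_correlation(img, 8))        # slowly decaying correlations
```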
model
• maximum entropy density [Jaynes 54]
  - assume zero mean
  - Σ = E(xx^T): consistent with second-order statistics
  - find p(x) with maximum entropy
  - solution:
      p(x) ∝ exp(−½ x^T Σ^(−1) x)
Gaussian model for Bayesian denoising
• additive Gaussian noise
    y = x + w
    p(y|x) ∝ exp[−||y − x||^2 / (2σ_w^2)]
• Gaussian model
    p(x) ∝ exp(−½ x^T Σ^(−1) x)
• posterior density (another Gaussian)
    p(x|y) ∝ exp(−½ x^T Σ^(−1) x − ||x − y||^2 / (2σ_w^2))
• inference (Wiener filter)
    x_MAP = x_MMSE = Σ(Σ + σ_w^2 I)^(−1) y
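A minimal numpy sketch of this Wiener estimator applied to toy patch data; the covariance here is random rather than learned from natural images:

```python
import numpy as np

def wiener_denoise(Y, Sigma, sigma_w):
    """x_MAP = x_MMSE = Sigma (Sigma + sigma_w^2 I)^{-1} y,
    applied to each column of Y (one zero-mean noisy patch per column)."""
    d = Sigma.shape[0]
    W = Sigma @ np.linalg.inv(Sigma + sigma_w**2 * np.eye(d))
    return W @ Y

# toy covariance and data, stand-ins for statistics learned from natural patches
rng = np.random.default_rng(0)
A = rng.standard_normal((16, 16))
Sigma = A @ A.T / 16
X = np.linalg.cholesky(Sigma) @ rng.standard_normal((16, 1000))  # clean samples
Y = X + 0.5 * rng.standard_normal(X.shape)                       # add noise
X_hat = wiener_denoise(Y, Sigma, 0.5)
print("noisy MSE:", np.mean((Y - X)**2), "denoised MSE:", np.mean((X_hat - X)**2))
```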
efficient coding transform
• for a Gaussian p(x)
    I(x) ∝ ∑_{i=1}^d log Σ_ii − log det Σ
• minimum (independence) when Σ is diagonal
• a transform that diagonalizes Σ can eliminate all (second-order) dependencies
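For a Gaussian the constant of proportionality is ½, i.e. I(x) = ½(∑_i log Σ_ii − log det Σ) in nats; a quick numpy check that it is positive for correlated components and vanishes for a diagonal Σ:

```python
import numpy as np

def gaussian_multi_information(Sigma):
    # I(x) = 0.5 * (sum_i log Sigma_ii - log det Sigma), in nats
    return 0.5 * (np.sum(np.log(np.diag(Sigma))) - np.linalg.slogdet(Sigma)[1])

print(gaussian_multi_information(np.array([[1.0, 0.9], [0.9, 1.0]])))  # > 0: dependent
print(gaussian_multi_information(np.diag([2.0, 3.0])))                 # 0: diagonal
```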
PCA
• eigen-decomposition of Σ: Σ = U Λ U^T
  - U: orthonormal matrix (rotation), U^T U = U U^T = I
  - Λ: diagonal matrix, Λ_ii ≥ 0 (eigenvalues)
      E{U^T x (U^T x)^T} = U^T E{xx^T} U = U^T U Λ U^T U = Λ
• s = U^T x, or x = Us; s is an independent Gaussian
• principal component analysis (PCA)
  - Karhunen-Loève transform
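A sketch of PCA with numpy; the random vectors are a stand-in for zero-mean patches extracted from natural images:

```python
import numpy as np

def pca_basis(patches):
    """Eigen-decomposition of the patch covariance, Sigma = U Lambda U^T.
    `patches` has one zero-mean patch per column."""
    Sigma = patches @ patches.T / patches.shape[1]
    lam, U = np.linalg.eigh(Sigma)          # ascending eigenvalues
    order = np.argsort(lam)[::-1]           # sort descending
    return U[:, order], lam[order]

rng = np.random.default_rng(0)
X = rng.standard_normal((64, 5000))         # stand-in for 8x8 image patches
X -= X.mean(axis=1, keepdims=True)
U, lam = pca_basis(X)
s = U.T @ X                                 # decorrelated coefficients
print(np.allclose(np.cov(s), np.diag(lam), atol=0.1))  # covariance ~ diagonal
```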
PCA

[figure: a data cloud x and its rotation into the PCA coordinates, x_PCA = U^T x]
PCA bases learned from natural images (U)

representation
• PCA is for local patches
  - data dependent
  - expensive for large images
• assume translation invariance and cyclic boundary handling
  - image lattice on a torus
  - covariance matrix is block circulant
  - eigenvectors are complex exponentials
  - diagonalized (decorrelated) by the DFT
  - PCA ⇒ Fourier representation
spectral power

Structural: assume scale invariance,
    F(sω) = s^p F(ω)
then:
    F(ω) ∝ 1/ω^p

Empirical: [plot: log10 power vs log10 spatial frequency (cycles/image) is close to a straight line]

[Ritterman 52; DeRiugin 56; Field 87; Tolhurst 92; Ruderman/Bialek 94; ...]
figure from [Simoncelli 05]
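A sketch (assuming numpy) of estimating the exponent by radially averaging the 2-D power spectrum and fitting a line in log-log coordinates; for white noise the slope is near 0, whereas natural images typically give a slope near −2:

```python
import numpy as np

def radial_power_spectrum(img):
    """Radially averaged Fourier power; the slope of log-power vs
    log-frequency estimates the exponent in F(w) ~ A / w^gamma."""
    F = np.fft.fftshift(np.fft.fft2(img - img.mean()))
    power = np.abs(F) ** 2
    h, w = img.shape
    y, x = np.indices((h, w))
    r = np.hypot(y - h / 2, x - w / 2).astype(int).ravel()  # integer-radius annuli
    rmax = min(h, w) // 2
    radial = np.bincount(r, weights=power.ravel())[:rmax] / np.bincount(r)[:rmax]
    freqs = np.arange(1, rmax)              # skip the DC annulus
    slope = np.polyfit(np.log(freqs), np.log(radial[1:]), 1)[0]
    return freqs, radial[1:], slope

rng = np.random.default_rng(0)
white = rng.standard_normal((256, 256))
print(radial_power_spectrum(white)[2])      # ~0 for white noise
```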
model
• power law
    F(ω) = A / ω^γ
  - scale invariance: F(sω) = s^p F(ω)
• denoising (Wiener filter in the frequency domain)
    X̂(ω) = [A/ω^γ / (A/ω^γ + σ^2)] · Y(ω)
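A numpy sketch of this frequency-domain Wiener filter; A, γ, and σ are illustrative and would in practice be fit to data. The toy "clean" image is synthesized from the 1/f^2 prior itself, so the filter parameters match the data by construction:

```python
import numpy as np

def power_law_wiener(y, A=1.0, gamma=2.0, sigma=0.1):
    # gain(w) = S(w) / (S(w) + sigma^2), with prior power S(w) = A / |w|^gamma
    h, w = y.shape
    fy = np.fft.fftfreq(h)[:, None]
    fx = np.fft.fftfreq(w)[None, :]
    f = np.sqrt(fy**2 + fx**2)
    f[0, 0] = 1.0 / max(h, w)               # avoid division by zero at DC
    S = A / f**gamma
    gain = S / (S + sigma**2)
    return np.real(np.fft.ifft2(gain * np.fft.fft2(y)))

rng = np.random.default_rng(0)
F = np.fft.fft2(rng.standard_normal((128, 128)))
fy = np.fft.fftfreq(128)[:, None]; fx = np.fft.fftfreq(128)[None, :]
f = np.sqrt(fy**2 + fx**2); f[0, 0] = 1.0
clean = np.real(np.fft.ifft2(F / f))        # amplitude ~ 1/f, power ~ 1/f^2
noisy = clean + 0.5 * rng.standard_normal(clean.shape)
denoised = power_law_wiener(noisy, A=1.0, gamma=2.0, sigma=0.5)
print(np.mean((noisy - clean)**2), np.mean((denoised - clean)**2))  # MSE drops
```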
further observations




                                      [Torralba and Oliva, 2003]


PCA

x,  with Σ = U Λ U^T
    ↓
x_PCA = U^T x
    ↓  whitening
Λ^(−1/2) U^T x
    ↓
not unique!  V Λ^(−1/2) U^T x, for any rotation V
zero-phase (symmetric) whitening (ZCA)

    A^(−1) = U Λ^(−1/2) U^T

• minimum wiring length
• receptive fields of retina neurons [Atick & Redlich, 92]
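A numpy sketch showing the PCA and zero-phase (ZCA) whitening transforms on toy Gaussian data, and that whitening is only defined up to a further rotation V:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.multivariate_normal([0, 0], [[2.0, 1.2], [1.2, 1.0]], size=100000).T

Sigma = X @ X.T / X.shape[1]
lam, U = np.linalg.eigh(Sigma)

W_pca = np.diag(lam**-0.5) @ U.T            # PCA whitening: Lambda^{-1/2} U^T
W_zca = U @ np.diag(lam**-0.5) @ U.T        # zero-phase (symmetric) whitening

for W in (W_pca, W_zca):
    Y = W @ X
    print(np.round(Y @ Y.T / Y.shape[1], 2))    # both give ~ identity covariance

# any further rotation V (V^T V = I) leaves the output white
theta = 0.7
V = np.array([[np.cos(theta), -np.sin(theta)], [np.sin(theta), np.cos(theta)]])
Y = V @ W_pca @ X
print(np.round(Y @ Y.T / Y.shape[1], 2))        # still ~ identity
```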
second-order constraints are weak

[figure: a sample from the 1/f^2 Gaussian model, next to a whitened natural image]

figure courtesy of Eero Simoncelli
summary

[diagram: pixel representation and PCA/Fourier; second-order observations and power spectrum; Gaussian model and power law model]

Not enough!
bandpass filter domain

[figure: image ⊗ band-pass filter = subband]
observation
• sparseness

[figure: log p(x) of band-pass filter responses, sharply peaked at 0]

[Burt & Adelson 82; Field 87; Mallat 89; Daugman 89; ...]
model
• if we only enforce consistency with the 1-D marginal densities, i.e., p(x_i) = q_i(x_i)
  - the maximum entropy density is the factorial density p(x) = ∏_{i=1}^d q_i(x_i)
  - multi-information is non-negative and achieves its minimum (zero) when the x_i are independent: H(x) = ∑_i H(x_i) − I(x)
• there are second-order dependencies, so the derived model is a linearly transformed factorial (LTF) model
model
• linearly transformed factorial (LTF)
  - independent sources: p(s) = ∏_{i=1}^d p(s_i)
  - A: invertible linear transform (basis)
      x = As = [a_1 ··· a_d] (s_1, ..., s_d)^T = s_1 a_1 + ··· + s_d a_d
  - A^(−1): filters for analysis
      s = A^(−1) x
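A numpy sketch of the LTF generative model with Laplacian sources and a random basis; note how mixing pushes marginals toward Gaussian, a point revisited below:

```python
import numpy as np

def excess_kurtosis(v):
    v = v - v.mean()
    return np.mean(v**4) / np.mean(v**2)**2 - 3

rng = np.random.default_rng(0)
d, n = 16, 20000
S = rng.laplace(size=(d, n))                # independent heavy-tailed sources
A = rng.standard_normal((d, d))             # random invertible basis (a.s.)
X = A @ S                                   # x = s_1 a_1 + ... + s_d a_d

S_rec = np.linalg.inv(A) @ X                # analysis: s = A^{-1} x
print(np.allclose(S_rec, S))                # exact recovery with known A
print(excess_kurtosis(S[0]), excess_kurtosis(X[0]))  # mixing drives kurtosis toward 0
```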
LTF model
• SVD of the matrix A: A = U Λ^(1/2) V^T
  - U, V: orthonormal matrices (rotations), U^T U = U U^T = I and V^T V = V V^T = I
  - Λ: diagonal matrix, (Λ_ii)^(1/2) ≥ 0 (singular values)

    rotation, scale, rotation:
    s → V^T s → Λ^(1/2) V^T s → x = U Λ^(1/2) V^T s
marginal model
• well fit by a generalized Gaussian

    p(s) ∝ exp(−|s|^p / σ)

[Mallat 89; Simoncelli & Adelson 96; Moulin & Liu 99; ...]

[figure: log histograms of a single wavelet subband of example images, with generalized Gaussian fits; p = 0.46, 0.58, 0.48 and ΔH/H = 0.0031, 0.0011, 0.0014]
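Such fits can be reproduced with scipy's generalized normal family (an assumption about tooling, not part of the talk); here synthetic gennorm samples stand in for the wavelet coefficients of a real image:

```python
from scipy.stats import gennorm

# stand-in for wavelet coefficients of a natural image subband
coeffs = gennorm.rvs(0.5, scale=1.0, size=50000, random_state=0)

p, loc, s = gennorm.fit(coeffs, floc=0)     # fit exponent and scale, location fixed at 0
print(f"fitted exponent p ~ {p:.2f}")       # natural-image subbands typically give p ~ 0.5-0.8
```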
Bayesian denoising for a non-Gaussian prior
• assume the marginal distribution [Mallat '89]:

    P(x) ∝ exp(−|x/s|^p)

• then the Bayes estimator is generally nonlinear:

[figure: estimator curves for p = 2.0, p = 1.0, p = 0.5]
[Simoncelli & Adelson, '96]
Gaussian scale mixture (GSM)

    x = u √z

  - u: zero-mean Gaussian with unit variance
  - z: positive random variable
  - special cases (different p(z)): generalized Gaussian, Student's t, Bessel K, Cauchy, α-stable, etc.

[figure: a GSM density p(x) matches the heavy-tailed shape of wavelet marginals]
[Andrews & Mallows 74; Wainwright & Simoncelli, 99]
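A sketch of sampling from a GSM with a lognormal mixer (one illustrative choice of p(z)) and verifying the heavy tails via excess kurtosis:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200000
u = rng.standard_normal(n)                       # zero-mean, unit-variance Gaussian
z = rng.lognormal(mean=0.0, sigma=0.5, size=n)   # positive mixer (illustrative choice)
x = u * np.sqrt(z)                               # GSM sample: x = u * sqrt(z)

def excess_kurtosis(v):
    v = v - v.mean()
    return np.mean(v**4) / np.mean(v**2)**2 - 3

print(excess_kurtosis(u), excess_kurtosis(x))    # ~0 for u, > 0 (heavy tails) for x
```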
efficient coding transform
• LTF model ⇒ independent component analysis (ICA)
  [Comon 94; Cardoso 96; Bell/Sejnowski 97; ...]
  - many different implementations (JADE, InfoMax, FastICA, etc.)
  - interpretation using the SVD:
      s = A^(−1) x = V Λ^(−1/2) U^T x
  - where to get U and Λ:
      E{xx^T} = A E{ss^T} A^T = U Λ^(1/2) V^T I V Λ^(1/2) U^T = U Λ U^T
ICA vs PCA

[figure: data x; x_PCA = U^T x; x_wht = Λ^(−1/2) U^T x; x_ICA = V Λ^(−1/2) U^T x]
finding V
• higher-order redundancy reduction: find the final rotation that maximizes non-Gaussianity
  - linear mixing makes signals more Gaussian (central limit theorem)
  - equivalent to maximizing sparseness
• independent component analysis (ICA): find the most non-Gaussian directions

figure from [Bethge 2008]
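A brute-force 2-D sketch of this idea: whiten, then search rotations for the projection with maximal |excess kurtosis|. Real ICA implementations use more efficient contrast functions; this is only to make the geometry concrete:

```python
import numpy as np

def excess_kurtosis(v):
    v = v - v.mean()
    return np.mean(v**4) / np.mean(v**2)**2 - 3

rng = np.random.default_rng(0)
S = rng.laplace(size=(2, 50000))            # independent heavy-tailed sources
A = rng.standard_normal((2, 2))
X = A @ S                                   # observed mixtures

# whiten first, as in x_wht = Lambda^{-1/2} U^T x
Sigma = X @ X.T / X.shape[1]
lam, U = np.linalg.eigh(Sigma)
Xw = np.diag(lam**-0.5) @ U.T @ X

# search rotations V for the most non-Gaussian projection
thetas = np.linspace(0, np.pi / 2, 200)
scores = [abs(excess_kurtosis(np.cos(t) * Xw[0] + np.sin(t) * Xw[1]))
          for t in thetas]
t = thetas[np.argmax(scores)]
V = np.array([[np.cos(t), np.sin(t)], [-np.sin(t), np.cos(t)]])
S_hat = V @ Xw                              # recovered sources, up to order/sign/scale
print(excess_kurtosis(S_hat[0]), excess_kurtosis(S_hat[1]))  # both strongly non-Gaussian
```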
ICA bases (squared columns of A) learned from natural images
  - similar in shape to the receptive fields of V1 simple cells [Olshausen & Field 1996; Bell & Sejnowski 1997]
break




representation
• ICA bases resemble wavelets and other multi-scale oriented linear representations
  - localized in spatial position, frequency band, and local orientation
• ICA bases are learned from data, while wavelet bases are fixed

[figure: Gabor wavelet basis]
summary

[diagram: band-pass representation and ICA/wavelet; sparsity and higher-order dependency observations; LTF model]

Not enough!
problems with LTF
• any band-pass or high-pass filter leads to heavy-tailed marginals (even random filters)
• if natural images were truly a linear mixture of independent non-Gaussian sources, random projections (filtering) should look Gaussian
  - central limit theorem
problems with LTF

[figure: joint statistics of wavelet coefficients across position, orientation, and scale]
[Simoncelli '97; Buccigrossi & Simoncelli '99]

• large-magnitude subband coefficients are found at neighboring positions, orientations, and scales
[figure: MI (bits/coeff) as a function of coefficient separation (1 to 32) for raw and PCA/ICA coefficients]

ICA achieves very little improvement over PCA in terms of dependency reduction
[Bethge 06; Lyu & Simoncelli 08]
LTF is also a weak model...

[figure: a sample from the LTF model, next to a natural image that has been ICA-transformed and marginally Gaussianized]

figure courtesy of Eero Simoncelli
remedy
• assumptions in the LTF model and ICA
  - factorial marginals for filter outputs
  - linear combination
  - invertible
remedy
• assumptions in the LTF model and ICA
  - factorial marginals for filter outputs
  - linear combination
  - invertible
• model ⇒ MaxEnt joint density with constraints on filter outputs [Zhu, Wu & Mumford 1997; Portilla & Simoncelli 2000]
• representation ⇒ sparse coding [Olshausen & Field 1996]
  - find filters giving optimum sparsity
  - compressed sensing [Candes & Donoho 2003]
remedy
• assumptions in the LTF model and ICA
  - factorial marginals for filter outputs
  - linear combination → nonlinear
  - invertible
joint density of natural image band-pass filter responses with a separation of 2 pixels

[figure]
elliptically symmetric density → (whitening) → spherically symmetric density

    p_esd(x) = 1/(α|Σ|^(1/2)) · f(−½ x^T Σ^(−1) x)        p_ssd(x) = (1/α) · f(−½ x^T x)

(Fang et al. 1990)
Spherical vs LTF

[figure: histograms of the kurtosis of projections of 3×3, 7×7, and 11×11 image blocks onto random unit-norm basis functions, comparing the (ICA'd) data with sphericalized and factorialized versions]

• these imply the data are closer to spherical than factorial
[Lyu & Simoncelli 08]
[diagram: families of densities: the linearly transformed factorial family (containing the factorial densities) and the elliptical family (containing the spherical densities), with the Gaussian in their intersection]
ICA vs PCA

[figure]
elliptical models of natural images
  - Simoncelli, 1997
  - Zetzsche and Krieger, 1999
  - Huang and Mumford, 1999
  - Wainwright and Simoncelli, 2000
  - Hyvärinen et al., 2000
  - Parra et al., 2001
  - Srivastava et al., 2002
  - Sendur and Selesnick, 2002
  - Teh et al., 2003
  - Gehler and Welling, 2006
  - etc.

[Fang et al. 1990]
joint GSM model

    x = u √z

[figure: contour plots of the joint density p(x) for image data and for a GSM simulation]
PCA/whitening        ICA        ???

[diagram: the same density families, annotated with the transforms that map between them: PCA/whitening, ICA, and an unknown transform (???) for the elliptical family]
nonlinear representations
• complex wavelet phase-based [Ates & Orchard, 2003]
• orientation-based [Hammond & Simoncelli 2006]
• nonlinear whitening [Gluckman 2005]
• local divisive normalization [Malo et al. 2004]
• global divisive normalization [Lyu & Simoncelli 2007, 2008]
    p(x) = (2π)^(−d/2) exp(−x^T x / 2)
         = (2π)^(−d/2) exp(−½ ∑_{i=1}^d x_i^2)
         = ∏_{i=1}^d (1/√(2π)) exp(−½ x_i^2)
         = ∏_{i=1}^d p(x_i)

Gaussian is the only density that can be both factorial and spherically symmetric [Nash and Klamkin 1976]
ICA        PCA        ?

[figure]
radial Gaussianization (RG)

target radial density (chi):   p_χ(r) ∝ r exp(−r^2/2)
radial map:                    g(r) = F_χ^(−1)(F_r(r))
source radial density:         p_r(r) ∝ r f(−r^2/2)

    x_rg = [g(||x_wht||) / ||x_wht||] · x_wht

[Lyu & Simoncelli, 2008, 2009]
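A sketch of RG using the empirical radial CDF for F_r and scipy's chi distribution for F_χ; the whitened GSM sample is synthetic, standing in for whitened natural image data:

```python
import numpy as np
from scipy.stats import chi

def radial_gaussianize(Xw):
    """Map whitened vectors (one per column) so the radial distribution
    matches that of a standard d-dimensional Gaussian (a chi distribution):
    r -> g(r) = F_chi^{-1}(F_r(r)), with the empirical CDF for F_r."""
    d, n = Xw.shape
    r = np.linalg.norm(Xw, axis=0)
    ranks = np.argsort(np.argsort(r))           # empirical CDF ranks
    F_r = (ranks + 0.5) / n                     # avoid exactly 0 and 1
    g_r = chi.ppf(F_r, df=d)                    # target chi radii
    return Xw * (g_r / r)

def excess_kurtosis(v):
    v = v - v.mean()
    return np.mean(v**4) / np.mean(v**2)**2 - 3

# demo on a GSM (elliptical) sample, which RG renders close to Gaussian
rng = np.random.default_rng(0)
d, n = 4, 100000
z = rng.lognormal(0.0, 0.5, size=n)
Xw = rng.standard_normal((d, n)) * np.sqrt(z)
Xrg = radial_gaussianize(Xw)
print(np.round(Xrg @ Xrg.T / n, 2))             # ~ identity covariance
print(excess_kurtosis(Xw[0]), excess_kurtosis(Xrg[0]))  # heavy-tailed -> ~0
```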
0.5
                                                                        raw
                                                                        pca/ica
                          MI (bits/coeff)   0.4                         rg


                                            0.3

                                            0.2

[Figure: mutual information, MI (bits/coeff), between pairs of coefficients as a
function of their spatial separation (1, 2, 4, 8, 16, 32), for raw pixels,
PCA/ICA coefficients, and radially Gaussianized (rg) coefficients.]


Monday, August 10, 2009                                                           97
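A minimal sketch of how curves like these can be estimated: the mutual information between two coefficient arrays, computed from their 2-D histogram. The bin count and the pixel-offset usage shown in the comment are illustrative assumptions, not the exact procedure behind the figure.

```python
import numpy as np

def pairwise_mi_bits(a, b, bins=64):
    """Histogram estimate of I(a; b) in bits (illustrative; biased for small samples)."""
    joint, _, _ = np.histogram2d(a, b, bins=bins)
    p = joint / joint.sum()                    # joint probability table
    px = p.sum(axis=1, keepdims=True)          # marginal of a
    py = p.sum(axis=0, keepdims=True)          # marginal of b
    nz = p > 0                                 # avoid log(0)
    return float((p[nz] * np.log2(p[nz] / (px @ py)[nz])).sum())

# usage: MI between horizontally separated samples of a coefficient image `img`
# for sep in [1, 2, 4, 8, 16, 32]:
#     print(sep, pairwise_mi_bits(img[:, :-sep].ravel(), img[:, sep:].ravel()))
```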
[Figure: multi-information reduction for 3×3 (left) and 15×15 (right) blocks of
local-mean-removed pixel blocks of natural images: (+) IRG − IRAW and
(◦) IICA − IRAW, each plotted against IPCA − IRAW.]

                                        (Lyu & Simoncelli, Neural Computation, to appear)
Monday, August 10, 2009                                                                                   98
ICA                           PCA   RG




                           unification as
                          Gaussianization?




Monday, August 10, 2009                                 99
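For reference, a minimal sketch of the radial Gaussianization step compared above, g(r) = Fχ⁻¹(Fr(r)), assuming whitened input and using a rank-based empirical radial CDF; both are illustrative choices rather than the authors' implementation.

```python
import numpy as np
from scipy.stats import chi

def radial_gaussianize(xwht):
    """Radially Gaussianize whitened samples (rows of an n x d array):
    map each radius r through g(r) = F_chi^{-1}(F_r(r)) so the radial
    distribution becomes chi with d degrees of freedom."""
    n, d = xwht.shape
    r = np.maximum(np.linalg.norm(xwht, axis=1), 1e-12)
    ranks = np.argsort(np.argsort(r))          # rank-based empirical CDF of radii
    g = chi(df=d).ppf((ranks + 0.5) / n)       # target chi_d radii
    return xwht * (g / r)[:, None]             # rescale each sample's radius
```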
marginal Gaussianization

[Figure: a pointwise nonlinearity maps a coefficient with heavy-tailed marginal
density p(x) to one with a Gaussian marginal p(y).]
Monday, August 10, 2009                              100
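The pointwise map sketched in the figure is y = Φ⁻¹(Fx(x)); a minimal single-coefficient version, again with a rank-based empirical CDF standing in for Fx:

```python
import numpy as np
from scipy.stats import norm

def marginal_gaussianize(x):
    """Map samples of one coefficient to a standard normal marginal via
    y = Phi^{-1}(F_x(x)), with F_x estimated by ranks (illustrative)."""
    x = np.asarray(x).ravel()
    u = (np.argsort(np.argsort(x)) + 0.5) / x.size   # empirical CDF values in (0, 1)
    return norm.ppf(u)
```

Applying this to each coefficient independently Gaussianizes the marginals, but unlike radial Gaussianization it leaves the joint (higher-order) dependencies in place.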
summary

                                     sparsity
                                       higher-order
                                       dependency


               band-pass                                 LTF

                      ICA/wavelet                     elliptical model


                            RG

                                    Not enough!
Monday, August 10, 2009                                                  101
Joint densities

[Figure: empirical joint distributions of wavelet coefficients for different pairs
of basis functions, for a single image of a New York City street scene. Top row:
joint distributions as contour plots, with lines drawn at equal intervals of log
probability; the three leftmost pairs share scale and orientation but differ in
spatial offset (adjacent, near, far), the fourth pair is at adjacent scales, and
the rightmost pair is at orthogonal orientations. Bottom row: corresponding
conditional distributions, with brightness proportional to frequency of occurrence
and each column independently rescaled to fill the full intensity range.]

             • Nearby: densities are approximately circular/elliptical
             • Distant: densities are approximately factorial

                                           [Simoncelli, ‘97; Wainwright&Simoncelli, ‘99]
Monday, August 10, 2009                                                                                                                     102
extended models
                    • independent subspace and topographical ICA
                      [Hoyer & Hyvärinen, 2001, 2003; Karklin & Lewicki
                      2005]
                    • adaptive covariance structures [Hammond &
                      Simoncelli, 2006; Guerrero-Colon et al. 2008;
                      Karklin & Lewicki 2009]
                    • products of Student-t experts [Osindero et al. 2003]
                    • fields of experts [Roth & Black, 2005]
                    • trees and fields of GSMs [Wainwright & Simoncelli,
                      2003; Lyu & Simoncelli, 2008]
                    • implicit MRF model [Lyu 2009]
Monday, August 10, 2009                                                 103
[Figure: a sample z field and the corresponding x field of the FoGSM model.]

                           FoGSM:  x = u ⊗ √z   (⊗ : element-wise product)
                           • u : zero-mean homogeneous Gauss MRF
                           • z : exponentiated homogeneous Gauss MRF
                           • x|z : inhomogeneous Gauss MRF
                           • x / √z (element-wise) : homogeneous Gauss MRF
                           • marginal distribution is GSM
                           • generative model: efficient sampling

Monday, August 10, 2009                                                    104
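A minimal sketch of the efficient-sampling step, with both homogeneous Gaussian fields drawn by shaping white noise in the Fourier domain and combined as x = u ⊗ √z. The spectral kernel and the two correlation scales are placeholder assumptions, not the covariances used in the paper.

```python
import numpy as np

def sample_stationary_gauss(shape, scale, rng):
    """Sample a zero-mean homogeneous (stationary) Gaussian field by
    shaping white noise in the Fourier domain (kernel is a placeholder)."""
    fy = np.fft.fftfreq(shape[0])[:, None]
    fx = np.fft.fftfreq(shape[1])[None, :]
    amp = 1.0 / np.sqrt(fx**2 + fy**2 + (1.0 / scale) ** 2)   # smooth power spectrum
    w = np.fft.fft2(rng.standard_normal(shape))
    f = np.real(np.fft.ifft2(amp * w))
    return f / f.std()

rng = np.random.default_rng(0)
u = sample_stationary_gauss((256, 256), scale=4, rng=rng)       # u field
logz = sample_stationary_gauss((256, 256), scale=32, rng=rng)   # log z varies slowly
x = u * np.sqrt(np.exp(logz))                                   # x = u (element-wise) sqrt(z)
```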
[Figure: sample x, u, and log z fields from the FoGSM model.]




Monday, August 10, 2009                   105
[Figure: FoGSM simulations for the Barbara, boat, and house images.
Top (marginal): subband marginal densities of the image, of a FoGSM sample,
and of a Gaussian (······). Bottom (joint): pairwise densities at spatial
separations ∆ = 1, 8, 32 and across orientation and scale, for the image
subband (top row) and the FoGSM simulation (bottom row).]
Monday, August 10, 2009                                                                                 106
[Figure: denoising performance on Lena and Boats. Plotted are differences in
PSNR (∆ PSNR) between FoGSM and competing methods (BM3D, kSVD, BLS-GSM [17],
and FoE [27]) for different input noise levels σ; PSNR values for the competing
methods were taken from the corresponding publications.]

                                                          [Lyu&Simoncelli, PAMI 08]

       peak-signal-to-noise-ratio (PSNR):

       PSNR = 20 log₁₀ ( 255 / √( (1/N) Σᵢ,ⱼ (I_original(i, j) − I_denoised(i, j))² ) )
Monday, August 10, 2009                                                                                                107
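The PSNR above as a small helper, written as a direct transcription of the formula:

```python
import numpy as np

def psnr(original, denoised, peak=255.0):
    """Peak signal-to-noise ratio in dB: 20 * log10(peak / RMSE)."""
    err = original.astype(float) - denoised.astype(float)
    rmse = np.sqrt(np.mean(err ** 2))      # root of the mean squared pixel error
    return 20.0 * np.log10(peak / rmse)
```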
[Figure: original image; noisy image (σ = 25, 14.15dB); matlab wiener2 (27.19dB);
FoGSM (30.02dB).]




Monday, August 10, 2009                                                           108
[Figure: original image; noisy image (σ = 100, 8.13dB); matlab wiener2 (18.38dB);
FoGSM (23.01dB).]
Monday, August 10, 2009                                                          109
pairwise conditional density

[Figure: conditional density p(x₂|x₁) of a pair of band-pass coefficients,
with overlaid curves E(x₂|x₁) and E(x₂|x₁) ± std(x₂|x₁); the conditional
spread grows with |x₁|, producing the “bow-tie” shape.]

                                           [Buccigrossi & Simoncelli, 97]

Monday, August 10, 2009                                                        110
pairwise conditional density

[Figure: the same bow-tie conditional density, summarized by its first two
conditional moments.]

                                  E(x₂|x₁) ≈ a x₁
                                var(x₂|x₁) ≈ b + c x₁²
Monday, August 10, 2009                                                            111
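These two moment approximations are easy to check on synthetic data; a minimal sketch that draws a correlated GSM pair and prints the conditional mean and spread as a function of x₁ (the multiplier distribution and the correlation are arbitrary illustrative choices):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200_000
z = np.exp(rng.standard_normal(n))                       # common multiplier (GSM)
u = rng.multivariate_normal([0, 0], [[1, 0.6], [0.6, 1]], size=n)
x1, x2 = (u * np.sqrt(z)[:, None]).T                     # GSM pair: x = u * sqrt(z)

bins = np.linspace(-8, 8, 33)
idx = np.digitize(x1, bins)
for k in np.unique(idx)[1:-1]:                           # skip the outermost bins
    sel = idx == k
    if sel.sum() > 500:                                  # only well-populated bins
        c = 0.5 * (bins[k - 1] + bins[k])
        print(f"x1≈{c:+.1f}  E(x2|x1)={x2[sel].mean():+.3f}  std={x2[sel].std():.3f}")
```

The printed conditional mean grows roughly linearly in x₁, while the conditional spread grows with |x₁|, the bow-tie behavior sketched above.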
conditional density

             µᵢ  = E(xᵢ | xⱼ, j∈N(i))   = Σ_{j∈N(i)} aⱼ xⱼ

             σᵢ² = var(xᵢ | xⱼ, j∈N(i)) = b + Σ_{j∈N(i)} cⱼ xⱼ²

       • maxEnt conditional density

             p(xᵢ | xⱼ, j∈N(i)) = (1/√(2πσᵢ²)) exp( −(xᵢ − µᵢ)² / (2σᵢ²) )

         - singleton conditionals
         - the joint MRF density is determined by all singletons
           (Brook’s lemma)

Monday, August 10, 2009                                                            112
implicit MRF
                   • defined by all singletons
                   • joint density (and clique potential) is
                     implicit
                   • learning: maximum pseudo-likelihood




Monday, August 10, 2009                                        113
ICM-MAP denoising

       argmaxₓ p(x|y) = argmaxₓ p(y|x) p(x) = argmaxₓ [ log p(y|x) + log p(x) ]

       - set an initial value x⁽⁰⁾, and t = 1
       - repeat until convergence
           - repeat for all i
               - compute the current estimate of xᵢ as
                   xᵢ⁽ᵗ⁾ = argmax_{xᵢ} log p(x₁⁽ᵗ⁾, …, xᵢ₋₁⁽ᵗ⁾, xᵢ,
                                             xᵢ₊₁⁽ᵗ⁻¹⁾, …, x_d⁽ᵗ⁻¹⁾ | y)
           - t ← t + 1

Monday, August 10, 2009                                                                      114
ICM-MAP denoising

         argmax_{xᵢ} log p(y|x) + log p(x)
       = argmax_{xᵢ} log p(y|x) + log p(x₁, …, xᵢ₋₁, xᵢ, xᵢ₊₁, …, x_n)
       = argmax_{xᵢ} log p(y|x) + log p(xᵢ | xⱼ, j∈N(i)) + log p(xⱼ, j∈N(i))
                    (simplifies)   (singleton conditional)   (constant w.r.t. xᵢ,
                                                              drops out)

       local adaptive and iterative Wiener filtering:

         xᵢ = (σ_w² σᵢ²)/(σ_w² + σᵢ²) · ( yᵢ/σ_w² + µᵢ/σᵢ² − Σ_{j≠i} wᵢⱼ (xⱼ − yⱼ) )

Monday, August 10, 2009                                                                         115
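A minimal sketch of the resulting loop: both conditional-moment maps are convolutions, and each sweep applies the closed-form Gaussian MAP update. As simplifying assumptions, the kernels a, c and constants b, σ_w are taken as given (the paper learns them by maximum pseudo-likelihood), the correction term Σ wᵢⱼ(xⱼ − yⱼ) is dropped, and all pixels are updated in parallel rather than sequentially.

```python
import numpy as np
from scipy.ndimage import convolve

def icm_denoise(y, a_kernel, c_kernel, b, sigma_w, n_iter=10):
    """ICM-style denoising with a local adaptive Wiener update.
    a_kernel/c_kernel: neighborhood weights with a zero center tap
    (neighbors only); b, sigma_w: variance offset and noise std."""
    x = y.astype(float).copy()
    for _ in range(n_iter):
        mu = convolve(x, a_kernel)               # mu_i    = sum_j a_j x_j
        var = b + convolve(x ** 2, c_kernel)     # sigma_i^2 = b + sum_j c_j x_j^2
        # closed-form maximizer of log p(y_i|x_i) + log p(x_i | neighbors):
        # precision-weighted combination of the observation and the prediction
        x = (var * y + sigma_w ** 2 * mu) / (var + sigma_w ** 2)
    return x
```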
summary

                                observations



               representation                  model



                                applications




Monday, August 10, 2009                                116
what needs to be done
                    • inhomogeneous structures
                      - structural (edges, contours, etc.)
                      - textural (grass, leaves, etc.)
                      - smooth (fog, sky, etc.)
                    • local orientations and relative phases

                           holy grail: a comprehensive model &
                           representation that captures all these
                           variations

Monday, August 10, 2009                                          117
big question marks
                    • what are natural images, anyway?




                    • ironically, white noise is “natural”, since it
                      is the result of cosmic radiation
                    • naturalness is subjective
Monday, August 10, 2009                                          118
peeling the onion



                                                    images of same
                                                   second order stats
                                         natural
                                         images                         images of same
                                                                         marginal stats



                             images of same
                            higher order stats




      all possible images

Monday, August 10, 2009                                                                   119
resources
                   • D. L. Ruderman. The statistics of natural images. Network:
                     Computation in Neural Systems, 5:517–548, 1996.
                   • E. P. Simoncelli and B. Olshausen. Natural image statistics and
                     neural representation. Annual Review of Neuroscience, 24:1193–
                     1216, 2001.
                   • S.-C. Zhu. Statistical modeling and conceptualization of visual
                     patterns. IEEE Trans PAMI, 25(6), 2003
                   • A. Srivastava, A. B. Lee, E. P. Simoncelli, and S.-C. Zhu. On
                     advances in statistical modeling of natural images. J. Math. Imaging
                     and Vision, 18(1):17–33, 2003.
                   • E. P. Simoncelli. Statistical modeling of photographic images. In
                     Handbook of Image and Video Processing, 431–441. Academic
                     Press, 2005.
                   • A. Hyvärinen, J. Hurri, and P. O. Hoyer. Natural Image Statistics:
                     A probabilistic approach to early computational vision. Springer, 2009.


Monday, August 10, 2009                                                                        120
thank you




Monday, August 10, 2009               121
