Natural Image Statistics

Siwei Lyu
SUNY Albany

CIFAR NCAP Summer School 2009, August 7, 2009

Thanks to Eero Simoncelli for sharing some of the slides.
big numbers
• seconds since big bang: ~10^17
• atoms in the universe: ~10^80
• 65×65 8-bit gray-scale images: ~10^10000
... ...

"The distribution of natural images is complicated. Perhaps it is something like beer foam, which is mostly empty but contains a thin mesh-work of fluid which fills the space and occupies almost no volume. The fluid region represents those images which are natural in character."
[Ruderman 1996]
natural image statistics
• natural images are rare in image space
• they are distinguished by nonrandom structures
• the common statistical properties of natural images are the focus of natural image statistics
why are we interested in natural image statistics?

[diagram: the visual pathway, Retina → LGN → Visual cortex]
[figure: (a) = (b) + noise (c)?]
computer vision applications
• image restoration
  - denoising, deblurring, demosaicing, super-resolution, and inpainting
• image compression
• texture synthesis
• image segmentation
• features for object detection and classification (SIFT, gist, "primal sketch", saliency, etc.)
• many others
[diagram: natural image statistics at the intersection of perception, neuroscience, biology, math, optimization, machine learning, statistics, computer science, signal processing, image processing, and computer vision]
scope of this tutorial
• important developments following a general theme
• focusing on concepts
  - light on math and specific applications
• gray-scale intensity images only; not covered:
  - color
  - time (video)
  - multi-image information (stereo)
main components

[diagram: representation]
representation

[diagram: Transform → Representation]
why representation matters
• example (from David Marr)
• representations for numbers
  • Arabic: 123
  • Roman: CXXIII
  • binary: 1111011
  • English: one hundred and twenty-three
why representation matters
• example (from David Marr)
• representations for numbers
  • Arabic: 123 × 10
  • Roman: CXXIII × X
  • binary: 1111011 × 1010
  • English: one hundred and twenty-three × ten
why representation matters
• example (from David Marr)
• representations for numbers
  • Arabic: 123 × 4
  • Roman: CXXIII × IV
  • binary: 1111011 × 100
  • English: one hundred and twenty-three × four
linear representations
                   • pixel



                   • Fourier



                   • wavelet
                     - localized
                     - oriented

main components

[diagram: observations; representation]
image data
• calibrated, linearized responses
• relatively large number of images

[van Hateren & van der Schaaf, 1998]
observations
• second-order pixel correlations
• 1/f power law of the energy in the frequency domain
• importance of phases
• heavy-tailed non-Gaussian marginals in the wavelet domain
• near-elliptical shape of joint densities in the wavelet domain
• decay of dependency in the wavelet domain
• …
main components

[diagram: observations; representation; model]
models
• physical imaging process (e.g., occlusion)
• nonlinear manifold of natural images
• non-parametric implicit model based on a large set of images
• matching statistics of natural images with density models ← our focus

[figure: natural images occupy a tiny fraction of the space of all possible images]
main components

[diagram: observations; representation; model; applications; Bayesian framework]
Bayesian framework

image (x) → corruption → obs. (y) → ?? → est. (x')

    min_{x'} ∫ L(x, x'(y)) p(x|y) dx  ∝  min_{x'} ∫ L(x, x'(y)) p(y|x) p(x) dx

• x'(y): estimator
• L(x, x'(y)): loss functional
• p(x): prior model for natural images
• p(y|x): likelihood, from the corruption process
application: Bayesian denoising
• additive Gaussian noise
    y = x + w
    p(y|x) ∝ exp[−(y − x)^2 / (2σ_w^2)]
• maximum a posteriori (MAP)
    x_MAP = argmax_x p(x|y) = argmax_x p(y|x) p(x)
• minimum mean squared error (MMSE)
    x_MMSE = argmin_{x'} ∫ ||x − x'||^2 p(x|y) dx
           = ∫ x p(y|x) p(x) dx / ∫ p(y|x) p(x) dx = E(x|y)
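As a sanity check on these definitions, here is a minimal numerical sketch (not from the talk) in Python/numpy contrasting the MAP and MMSE estimators for a scalar observation; the Laplacian prior and all parameter values are illustrative assumptions.

```python
import numpy as np

sigma_w, b, y = 1.0, 1.0, 1.5        # noise std, prior scale, observed value (illustrative)
x = np.linspace(-10, 10, 20001)      # dense grid over the clean value x

# log posterior up to a constant: log p(y|x) + log p(x), Laplacian prior p(x) ~ exp(-|x|/b)
log_post = -(y - x) ** 2 / (2 * sigma_w**2) - np.abs(x) / b
post = np.exp(log_post - log_post.max())
post /= post.sum()                   # normalize on the grid

x_map = x[np.argmax(post)]           # posterior mode
x_mmse = np.sum(x * post)            # posterior mean, E(x|y)
print(f"x_MAP = {x_map:.3f}, x_MMSE = {x_mmse:.3f}")  # the two differ for non-Gaussian priors
```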
main components

[diagram: observations; representation; model; applications; efficient coding]
representation

[diagram: Transform → Representation]

• unsupervised learning
  - specify desired properties of the transform outputs
• what are such properties?
what makes a good representation?
• intuitively, the transformed signal should be "simpler"
  - reduced dimensionality

[illustration: rotating a three-dimensional data cloud to an appropriate viewpoint reveals its underlying structure, so that most of the variability in the data is represented in the first few dimensions]
what makes a good representation?
• intuitively, the transformed signal should be "simpler"
  - reduced dependency

    x → r,  where r has less dependency than x

  - optimum: the components of r are independent, p(r) = ∏_{i=1}^d p(r_i)
• reducing dependency is a general approach to relieving the curse of dimensionality
• are there dependencies in natural images?
redundancy in natural images
• structure = predictability = redundancy

[Kersten, 1987]
measure of statistical dependency

multi-information (MI):

    I(x) = D_KL( p(x) || ∏_k p(x_k) )
         = ∫ p(x) log [ p(x) / ∏_k p(x_k) ] dx
         = ∑_{k=1}^d H(x_k) − H(x)

[Studeny and Vejnarova, 1998]
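For two variables, the multi-information reduces to the ordinary mutual information and can be estimated from histograms as ∑_k H(x_k) − H(x). A small sketch, assuming numpy, with synthetic correlated Gaussians standing in for nearby pixel values:

```python
import numpy as np

def entropy(counts):
    # plug-in entropy estimate (bits) from histogram counts
    p = counts[counts > 0] / counts.sum()
    return -np.sum(p * np.log2(p))

rng = np.random.default_rng(0)
z = rng.standard_normal((100000, 2))
x1 = z[:, 0]                                     # toy stand-in for pixel I(x, y)
x2 = 0.9 * z[:, 0] + np.sqrt(1 - 0.81) * z[:, 1] # correlated stand-in for I(x+1, y)

edges = np.linspace(-4, 4, 65)
h12, _, _ = np.histogram2d(x1, x2, bins=[edges, edges])
h1, _ = np.histogram(x1, bins=edges)
h2, _ = np.histogram(x2, bins=edges)
mi = entropy(h1) + entropy(h2) - entropy(h12.ravel())
print(f"I(x1; x2) ~ {mi:.3f} bits")              # > 0 for dependent variables
```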
efficient coding
[Attneave '54; Barlow '61; Laughlin '81; Atick '90; Bialek et al. '91]

    x → r,    I(r, x) = H(r) − H(r|x)

• maximize the mutual information between stimulus and response, subject to constraints (e.g., metabolic)
• noiseless case ⇒ redundancy reduction:
    H(r|x) = 0 ⇒ I(r, x) = H(r) = ∑_{i=1}^d H(r_i) − I(r)
  - independent components
  - efficient (maxEnt) marginals
main components

[diagram: observations; representation; model; applications]
closed loop

[diagram: observations, representation, model, and applications form a closed loop]
pixel domain




[figure: scatter plots of neighboring pixel intensities, I(x,y) vs I(x+1,y), I(x+2,y), and I(x+4,y), together with the correlation as a function of spatial separation (pixels), decaying from 1 toward 0 over separations of 10 to 40 pixels]
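A sketch of how such a correlation-vs-separation curve can be computed; the surrogate image (cumulative sums of white noise) is only a stand-in for a real natural image, which one would load instead:

```python
import numpy as np

def pixel_correlation(img, max_sep=40):
    """Correlation of I(x, y) with I(x+d, y) as a function of separation d."""
    img = img - img.mean()
    out = []
    for d in range(1, max_sep + 1):
        a, b = img[:, :-d].ravel(), img[:, d:].ravel()
        out.append(np.corrcoef(a, b)[0, 1])
    return np.array(out)

rng = np.random.default_rng(1)
img = rng.standard_normal((256, 256))
img = np.cumsum(np.cumsum(img, 0), 1)   # crude long-range-correlated surrogate image
print(pixel_correlation(img, 8))        # slowly decaying correlations
```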
model
• maximum entropy density [Jaynes 54]
  - assume zero mean
  - Σ = E(xx^T): consistent with second-order statistics
  - find p(x) with maximum entropy
  - solution:
      p(x) ∝ exp(−½ x^T Σ^(−1) x)
Gaussian model for Bayesian denoising
• additive Gaussian noise
    y = x + w
    p(y|x) ∝ exp[−||y − x||^2 / (2σ_w^2)]
• Gaussian model
    p(x) ∝ exp(−½ x^T Σ^(−1) x)
• posterior density (another Gaussian)
    p(x|y) ∝ exp(−½ x^T Σ^(−1) x − ||x − y||^2 / (2σ_w^2))
• inference (Wiener filter)
    x_MAP = x_MMSE = Σ(Σ + σ_w^2 I)^(−1) y
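A minimal numpy sketch of this Wiener estimator applied to toy patch data; the covariance here is random rather than learned from natural images:

```python
import numpy as np

def wiener_denoise(Y, Sigma, sigma_w):
    """x_MAP = x_MMSE = Sigma (Sigma + sigma_w^2 I)^{-1} y,
    applied to each column of Y (one zero-mean noisy patch per column)."""
    d = Sigma.shape[0]
    W = Sigma @ np.linalg.inv(Sigma + sigma_w**2 * np.eye(d))
    return W @ Y

# toy covariance and data, stand-ins for statistics learned from natural patches
rng = np.random.default_rng(0)
A = rng.standard_normal((16, 16))
Sigma = A @ A.T / 16
X = np.linalg.cholesky(Sigma) @ rng.standard_normal((16, 1000))  # clean samples
Y = X + 0.5 * rng.standard_normal(X.shape)                       # add noise
X_hat = wiener_denoise(Y, Sigma, 0.5)
print("noisy MSE:", np.mean((Y - X)**2), "denoised MSE:", np.mean((X_hat - X)**2))
```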
efficient coding transform
• for a Gaussian p(x)
    I(x) ∝ ∑_{i=1}^d log Σ_ii − log det Σ
• minimum (independence) when Σ is diagonal
• a transform that diagonalizes Σ can eliminate all (second-order) dependencies
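For a Gaussian the constant of proportionality is ½, i.e. I(x) = ½(∑_i log Σ_ii − log det Σ) in nats; a quick numpy check that it is positive for correlated components and vanishes for a diagonal Σ:

```python
import numpy as np

def gaussian_multi_information(Sigma):
    # I(x) = 0.5 * (sum_i log Sigma_ii - log det Sigma), in nats
    return 0.5 * (np.sum(np.log(np.diag(Sigma))) - np.linalg.slogdet(Sigma)[1])

print(gaussian_multi_information(np.array([[1.0, 0.9], [0.9, 1.0]])))  # > 0: dependent
print(gaussian_multi_information(np.diag([2.0, 3.0])))                 # 0: diagonal
```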
PCA
• eigen-decomposition of Σ: Σ = U Λ U^T
  - U: orthonormal matrix (rotation), U^T U = U U^T = I
  - Λ: diagonal matrix, Λ_ii ≥ 0 (eigenvalues)
      E{U^T x (U^T x)^T} = U^T E{xx^T} U = U^T U Λ U^T U = Λ
• s = U^T x, or x = Us; s is an independent Gaussian
• principal component analysis (PCA)
  - Karhunen-Loève transform
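A sketch of PCA with numpy; the random vectors are a stand-in for zero-mean patches extracted from natural images:

```python
import numpy as np

def pca_basis(patches):
    """Eigen-decomposition of the patch covariance, Sigma = U Lambda U^T.
    `patches` has one zero-mean patch per column."""
    Sigma = patches @ patches.T / patches.shape[1]
    lam, U = np.linalg.eigh(Sigma)          # ascending eigenvalues
    order = np.argsort(lam)[::-1]           # sort descending
    return U[:, order], lam[order]

rng = np.random.default_rng(0)
X = rng.standard_normal((64, 5000))         # stand-in for 8x8 image patches
X -= X.mean(axis=1, keepdims=True)
U, lam = pca_basis(X)
s = U.T @ X                                 # decorrelated coefficients
print(np.allclose(np.cov(s), np.diag(lam), atol=0.1))  # covariance ~ diagonal
```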
PCA

[figure: a data cloud x and its rotation into the PCA coordinates, x_PCA = U^T x]
PCA bases learned from natural images (U)

representation
• PCA is for local patches
  - data dependent
  - expensive for large images
• assume translation invariance and cyclic boundary handling
  - image lattice on a torus
  - covariance matrix is block circulant
  - eigenvectors are complex exponentials
  - diagonalized (decorrelated) by the DFT
  - PCA ⇒ Fourier representation
spectral power

Structural: assume scale invariance,
    F(sω) = s^p F(ω)
then:
    F(ω) ∝ 1/ω^p

Empirical: [plot: log10 power vs log10 spatial frequency (cycles/image) is close to a straight line]

[Ritterman 52; DeRiugin 56; Field 87; Tolhurst 92; Ruderman/Bialek 94; ...]
figure from [Simoncelli 05]
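A sketch (assuming numpy) of estimating the exponent by radially averaging the 2-D power spectrum and fitting a line in log-log coordinates; for white noise the slope is near 0, whereas natural images typically give a slope near −2:

```python
import numpy as np

def radial_power_spectrum(img):
    """Radially averaged Fourier power; the slope of log-power vs
    log-frequency estimates the exponent in F(w) ~ A / w^gamma."""
    F = np.fft.fftshift(np.fft.fft2(img - img.mean()))
    power = np.abs(F) ** 2
    h, w = img.shape
    y, x = np.indices((h, w))
    r = np.hypot(y - h / 2, x - w / 2).astype(int).ravel()  # integer-radius annuli
    rmax = min(h, w) // 2
    radial = np.bincount(r, weights=power.ravel())[:rmax] / np.bincount(r)[:rmax]
    freqs = np.arange(1, rmax)              # skip the DC annulus
    slope = np.polyfit(np.log(freqs), np.log(radial[1:]), 1)[0]
    return freqs, radial[1:], slope

rng = np.random.default_rng(0)
white = rng.standard_normal((256, 256))
print(radial_power_spectrum(white)[2])      # ~0 for white noise
```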
model
• power law
    F(ω) = A / ω^γ
  - scale invariance: F(sω) = s^p F(ω)
• denoising (Wiener filter in the frequency domain)
    X̂(ω) = [A/ω^γ / (A/ω^γ + σ^2)] · Y(ω)
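A numpy sketch of this frequency-domain Wiener filter; A, γ, and σ are illustrative and would in practice be fit to data. The toy "clean" image is synthesized from the 1/f^2 prior itself, so the filter parameters match the data by construction:

```python
import numpy as np

def power_law_wiener(y, A=1.0, gamma=2.0, sigma=0.1):
    # gain(w) = S(w) / (S(w) + sigma^2), with prior power S(w) = A / |w|^gamma
    h, w = y.shape
    fy = np.fft.fftfreq(h)[:, None]
    fx = np.fft.fftfreq(w)[None, :]
    f = np.sqrt(fy**2 + fx**2)
    f[0, 0] = 1.0 / max(h, w)               # avoid division by zero at DC
    S = A / f**gamma
    gain = S / (S + sigma**2)
    return np.real(np.fft.ifft2(gain * np.fft.fft2(y)))

rng = np.random.default_rng(0)
F = np.fft.fft2(rng.standard_normal((128, 128)))
fy = np.fft.fftfreq(128)[:, None]; fx = np.fft.fftfreq(128)[None, :]
f = np.sqrt(fy**2 + fx**2); f[0, 0] = 1.0
clean = np.real(np.fft.ifft2(F / f))        # amplitude ~ 1/f, power ~ 1/f^2
noisy = clean + 0.5 * rng.standard_normal(clean.shape)
denoised = power_law_wiener(noisy, A=1.0, gamma=2.0, sigma=0.5)
print(np.mean((noisy - clean)**2), np.mean((denoised - clean)**2))  # MSE drops
```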
further observations




                                      [Torralba and Oliva, 2003]


PCA

x,  with Σ = U Λ U^T
    ↓
x_PCA = U^T x
    ↓  whitening
Λ^(−1/2) U^T x
    ↓
not unique!  V Λ^(−1/2) U^T x, for any rotation V
zero-phase (symmetric) whitening (ZCA)

    A^(−1) = U Λ^(−1/2) U^T

• minimum wiring length
• receptive fields of retina neurons [Atick & Redlich, 92]
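A numpy sketch showing the PCA and zero-phase (ZCA) whitening transforms on toy Gaussian data, and that whitening is only defined up to a further rotation V:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.multivariate_normal([0, 0], [[2.0, 1.2], [1.2, 1.0]], size=100000).T

Sigma = X @ X.T / X.shape[1]
lam, U = np.linalg.eigh(Sigma)

W_pca = np.diag(lam**-0.5) @ U.T            # PCA whitening: Lambda^{-1/2} U^T
W_zca = U @ np.diag(lam**-0.5) @ U.T        # zero-phase (symmetric) whitening

for W in (W_pca, W_zca):
    Y = W @ X
    print(np.round(Y @ Y.T / Y.shape[1], 2))    # both give ~ identity covariance

# any further rotation V (V^T V = I) leaves the output white
theta = 0.7
V = np.array([[np.cos(theta), -np.sin(theta)], [np.sin(theta), np.cos(theta)]])
Y = V @ W_pca @ X
print(np.round(Y @ Y.T / Y.shape[1], 2))        # still ~ identity
```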
second-order constraints are weak

[figure: a sample from the 1/f^2 Gaussian model, next to a whitened natural image]

figure courtesy of Eero Simoncelli
summary

[diagram: pixel representation and PCA/Fourier; second-order observations and power spectrum; Gaussian model and power law model]

Not enough!
bandpass filter domain

[figure: image ⊗ band-pass filter = subband]
observation
• sparseness

[figure: log p(x) of band-pass filter responses, sharply peaked at 0]

[Burt & Adelson 82; Field 87; Mallat 89; Daugman 89; ...]
model
• if we only enforce consistency with the 1-D marginal densities, i.e., p(x_i) = q_i(x_i)
  - the maximum entropy density is the factorial density p(x) = ∏_{i=1}^d q_i(x_i)
  - multi-information is non-negative and achieves its minimum (zero) when the x_i are independent: H(x) = ∑_i H(x_i) − I(x)
• there are second-order dependencies, so the derived model is a linearly transformed factorial (LTF) model
model
• linearly transformed factorial (LTF)
  - independent sources: p(s) = ∏_{i=1}^d p(s_i)
  - A: invertible linear transform (basis)
      x = As = [a_1 ··· a_d] (s_1, ..., s_d)^T = s_1 a_1 + ··· + s_d a_d
  - A^(−1): filters for analysis
      s = A^(−1) x
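A numpy sketch of the LTF generative model with Laplacian sources and a random basis; note how mixing pushes marginals toward Gaussian, a point revisited below:

```python
import numpy as np

def excess_kurtosis(v):
    v = v - v.mean()
    return np.mean(v**4) / np.mean(v**2)**2 - 3

rng = np.random.default_rng(0)
d, n = 16, 20000
S = rng.laplace(size=(d, n))                # independent heavy-tailed sources
A = rng.standard_normal((d, d))             # random invertible basis (a.s.)
X = A @ S                                   # x = s_1 a_1 + ... + s_d a_d

S_rec = np.linalg.inv(A) @ X                # analysis: s = A^{-1} x
print(np.allclose(S_rec, S))                # exact recovery with known A
print(excess_kurtosis(S[0]), excess_kurtosis(X[0]))  # mixing drives kurtosis toward 0
```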
LTF model
• SVD of the matrix A: A = U Λ^(1/2) V^T
  - U, V: orthonormal matrices (rotations), U^T U = U U^T = I and V^T V = V V^T = I
  - Λ: diagonal matrix, (Λ_ii)^(1/2) ≥ 0 (singular values)

    rotation, scale, rotation:
    s → V^T s → Λ^(1/2) V^T s → x = U Λ^(1/2) V^T s
marginal model
• well fit by a generalized Gaussian

    p(s) ∝ exp(−|s|^p / σ)

[Mallat 89; Simoncelli & Adelson 96; Moulin & Liu 99; ...]

[figure: log histograms of a single wavelet subband of example images, with generalized Gaussian fits; p = 0.46, 0.58, 0.48 and ΔH/H = 0.0031, 0.0011, 0.0014]
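Such fits can be reproduced with scipy's generalized normal family (an assumption about tooling, not part of the talk); here synthetic gennorm samples stand in for the wavelet coefficients of a real image:

```python
from scipy.stats import gennorm

# stand-in for wavelet coefficients of a natural image subband
coeffs = gennorm.rvs(0.5, scale=1.0, size=50000, random_state=0)

p, loc, s = gennorm.fit(coeffs, floc=0)     # fit exponent and scale, location fixed at 0
print(f"fitted exponent p ~ {p:.2f}")       # natural-image subbands typically give p ~ 0.5-0.8
```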
Bayesian denoising for a non-Gaussian prior
• assume the marginal distribution [Mallat '89]:

    P(x) ∝ exp(−|x/s|^p)

• then the Bayes estimator is generally nonlinear:

[figure: estimator curves for p = 2.0, p = 1.0, p = 0.5]
[Simoncelli & Adelson, '96]
Gaussian scale mixture (GSM)

    x = u √z

  - u: zero-mean Gaussian with unit variance
  - z: positive random variable
  - special cases (different p(z)): generalized Gaussian, Student's t, Bessel K, Cauchy, α-stable, etc.

[figure: a GSM density p(x) matches the heavy-tailed shape of wavelet marginals]
[Andrews & Mallows 74; Wainwright & Simoncelli, 99]
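A sketch of sampling from a GSM with a lognormal mixer (one illustrative choice of p(z)) and verifying the heavy tails via excess kurtosis:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200000
u = rng.standard_normal(n)                       # zero-mean, unit-variance Gaussian
z = rng.lognormal(mean=0.0, sigma=0.5, size=n)   # positive mixer (illustrative choice)
x = u * np.sqrt(z)                               # GSM sample: x = u * sqrt(z)

def excess_kurtosis(v):
    v = v - v.mean()
    return np.mean(v**4) / np.mean(v**2)**2 - 3

print(excess_kurtosis(u), excess_kurtosis(x))    # ~0 for u, > 0 (heavy tails) for x
```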
efficient coding transform
• LTF model ⇒ independent component analysis (ICA)
  [Comon 94; Cardoso 96; Bell/Sejnowski 97; ...]
  - many different implementations (JADE, InfoMax, FastICA, etc.)
  - interpretation using the SVD:
      s = A^(−1) x = V Λ^(−1/2) U^T x
  - where to get U and Λ:
      E{xx^T} = A E{ss^T} A^T = U Λ^(1/2) V^T I V Λ^(1/2) U^T = U Λ U^T
ICA vs PCA

[figure: data x; x_PCA = U^T x; x_wht = Λ^(−1/2) U^T x; x_ICA = V Λ^(−1/2) U^T x]
finding V
• higher-order redundancy reduction: find the final rotation that maximizes non-Gaussianity
  - linear mixing makes signals more Gaussian (central limit theorem)
  - equivalent to maximizing sparseness
• independent component analysis (ICA): find the most non-Gaussian directions

figure from [Bethge 2008]
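A brute-force 2-D sketch of this idea: whiten, then search rotations for the projection with maximal |excess kurtosis|. Real ICA implementations use more efficient contrast functions; this is only to make the geometry concrete:

```python
import numpy as np

def excess_kurtosis(v):
    v = v - v.mean()
    return np.mean(v**4) / np.mean(v**2)**2 - 3

rng = np.random.default_rng(0)
S = rng.laplace(size=(2, 50000))            # independent heavy-tailed sources
A = rng.standard_normal((2, 2))
X = A @ S                                   # observed mixtures

# whiten first, as in x_wht = Lambda^{-1/2} U^T x
Sigma = X @ X.T / X.shape[1]
lam, U = np.linalg.eigh(Sigma)
Xw = np.diag(lam**-0.5) @ U.T @ X

# search rotations V for the most non-Gaussian projection
thetas = np.linspace(0, np.pi / 2, 200)
scores = [abs(excess_kurtosis(np.cos(t) * Xw[0] + np.sin(t) * Xw[1]))
          for t in thetas]
t = thetas[np.argmax(scores)]
V = np.array([[np.cos(t), np.sin(t)], [-np.sin(t), np.cos(t)]])
S_hat = V @ Xw                              # recovered sources, up to order/sign/scale
print(excess_kurtosis(S_hat[0]), excess_kurtosis(S_hat[1]))  # both strongly non-Gaussian
```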
ICA bases (squared columns of A) learned from natural images
  - similar in shape to the receptive fields of V1 simple cells [Olshausen & Field 1996; Bell & Sejnowski 1997]
break




representation
• ICA bases resemble wavelets and other multi-scale oriented linear representations
  - localized in spatial position, frequency band, and local orientation
• ICA bases are learned from data, while wavelet bases are fixed

[figure: Gabor wavelet basis]
summary

[diagram: band-pass representation and ICA/wavelet; sparsity and higher-order dependency observations; LTF model]

Not enough!
problems with LTF
• any band-pass or high-pass filter leads to heavy-tailed marginals (even random filters)
• if natural images were truly a linear mixture of independent non-Gaussian sources, random projections (filtering) should look Gaussian
  - central limit theorem
problems with LTF

[figure: joint statistics of wavelet coefficients across position, orientation, and scale]
[Simoncelli '97; Buccigrossi & Simoncelli '99]

• large-magnitude subband coefficients are found at neighboring positions, orientations, and scales
[figure: MI (bits/coeff) as a function of coefficient separation (1 to 32) for raw and PCA/ICA coefficients]

ICA achieves very little improvement over PCA in terms of dependency reduction
[Bethge 06; Lyu & Simoncelli 08]
LTF is also a weak model...

[figure: a sample from the LTF model, next to a natural image that has been ICA-transformed and marginally Gaussianized]

figure courtesy of Eero Simoncelli
remedy
• assumptions in the LTF model and ICA
  - factorial marginals for filter outputs
  - linear combination
  - invertible
remedy
• assumptions in the LTF model and ICA
  - factorial marginals for filter outputs
  - linear combination
  - invertible
• model ⇒ MaxEnt joint density with constraints on filter outputs [Zhu, Wu & Mumford 1997; Portilla & Simoncelli 2000]
• representation ⇒ sparse coding [Olshausen & Field 1996]
  - find filters giving optimum sparsity
  - compressed sensing [Candes & Donoho 2003]
remedy
• assumptions in the LTF model and ICA
  - factorial marginals for filter outputs
  - linear combination → nonlinear
  - invertible
joint density of natural image band-pass filter responses with a separation of 2 pixels

[figure]
elliptically symmetric density → (whitening) → spherically symmetric density

    p_esd(x) = 1/(α|Σ|^(1/2)) · f(−½ x^T Σ^(−1) x)        p_ssd(x) = (1/α) · f(−½ x^T x)

(Fang et al. 1990)
Spherical vs LTF

[figure: histograms of the kurtosis of projections of 3×3, 7×7, and 11×11 image blocks onto random unit-norm basis functions, comparing the (ICA'd) data with sphericalized and factorialized versions]

• these imply the data are closer to spherical than factorial
[Lyu & Simoncelli 08]
[diagram: families of densities: the linearly transformed factorial family (containing the factorial densities) and the elliptical family (containing the spherical densities), with the Gaussian in their intersection]
ICA vs PCA

[figure]
elliptical models of natural images
  - Simoncelli, 1997
  - Zetzsche and Krieger, 1999
  - Huang and Mumford, 1999
  - Wainwright and Simoncelli, 2000
  - Hyvärinen et al., 2000
  - Parra et al., 2001
  - Srivastava et al., 2002
  - Sendur and Selesnick, 2002
  - Teh et al., 2003
  - Gehler and Welling, 2006
  - etc.

[Fang et al. 1990]
joint GSM model

    x = u √z

[figure: contour plots of the joint density p(x) for image data and for a GSM simulation]
PCA/whitening        ICA        ???

[diagram: the same density families, annotated with the transforms that map between them: PCA/whitening, ICA, and an unknown transform (???) for the elliptical family]
nonlinear representations
• complex wavelet phase-based [Ates & Orchard, 2003]
• orientation-based [Hammond & Simoncelli 2006]
• nonlinear whitening [Gluckman 2005]
• local divisive normalization [Malo et al. 2004]
• global divisive normalization [Lyu & Simoncelli 2007, 2008]
    p(x) = (2π)^(−d/2) exp(−x^T x / 2)
         = (2π)^(−d/2) exp(−½ ∑_{i=1}^d x_i^2)
         = ∏_{i=1}^d (1/√(2π)) exp(−½ x_i^2)
         = ∏_{i=1}^d p(x_i)

Gaussian is the only density that can be both factorial and spherically symmetric [Nash and Klamkin 1976]
ICA        PCA        ?

[figure]
radial Gaussianization (RG)

target radial density (chi):   p_χ(r) ∝ r exp(−r^2/2)
radial map:                    g(r) = F_χ^(−1)(F_r(r))
source radial density:         p_r(r) ∝ r f(−r^2/2)

    x_rg = [g(||x_wht||) / ||x_wht||] · x_wht

[Lyu & Simoncelli, 2008, 2009]
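A sketch of RG using the empirical radial CDF for F_r and scipy's chi distribution for F_χ; the whitened GSM sample is synthetic, standing in for whitened natural image data:

```python
import numpy as np
from scipy.stats import chi

def radial_gaussianize(Xw):
    """Map whitened vectors (one per column) so the radial distribution
    matches that of a standard d-dimensional Gaussian (a chi distribution):
    r -> g(r) = F_chi^{-1}(F_r(r)), with the empirical CDF for F_r."""
    d, n = Xw.shape
    r = np.linalg.norm(Xw, axis=0)
    ranks = np.argsort(np.argsort(r))           # empirical CDF ranks
    F_r = (ranks + 0.5) / n                     # avoid exactly 0 and 1
    g_r = chi.ppf(F_r, df=d)                    # target chi radii
    return Xw * (g_r / r)

def excess_kurtosis(v):
    v = v - v.mean()
    return np.mean(v**4) / np.mean(v**2)**2 - 3

# demo on a GSM (elliptical) sample, which RG renders close to Gaussian
rng = np.random.default_rng(0)
d, n = 4, 100000
z = rng.lognormal(0.0, 0.5, size=n)
Xw = rng.standard_normal((d, n)) * np.sqrt(z)
Xrg = radial_gaussianize(Xw)
print(np.round(Xrg @ Xrg.T / n, 2))             # ~ identity covariance
print(excess_kurtosis(Xw[0]), excess_kurtosis(Xrg[0]))  # heavy-tailed -> ~0
```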
0.5
                                                                        raw
                                                                        pca/ica
                          MI (bits/coeff)   0.4                         rg


                                            0.3

                                            0.2

[Figure: mutual information, MI (bits/coeff), between pairs of coefficients as a
function of their spatial separation (1, 2, 4, 8, 16, 32), for raw pixels,
PCA/ICA coefficients, and radially Gaussianized (rg) coefficients.]


Monday, August 10, 2009                                                           97
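A minimal sketch of how curves like these can be estimated: the mutual information between two coefficient arrays, computed from their 2-D histogram. The bin count and the pixel-offset usage shown in the comment are illustrative assumptions, not the exact procedure behind the figure.

```python
import numpy as np

def pairwise_mi_bits(a, b, bins=64):
    """Histogram estimate of I(a; b) in bits (illustrative; biased for small samples)."""
    joint, _, _ = np.histogram2d(a, b, bins=bins)
    p = joint / joint.sum()                    # joint probability table
    px = p.sum(axis=1, keepdims=True)          # marginal of a
    py = p.sum(axis=0, keepdims=True)          # marginal of b
    nz = p > 0                                 # avoid log(0)
    return float((p[nz] * np.log2(p[nz] / (px @ py)[nz])).sum())

# usage: MI between horizontally separated samples of a coefficient image `img`
# for sep in [1, 2, 4, 8, 16, 32]:
#     print(sep, pairwise_mi_bits(img[:, :-sep].ravel(), img[:, sep:].ravel()))
```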
[Figure: multi-information reduction for 3×3 (left) and 15×15 (right) blocks of
local-mean-removed pixel blocks of natural images: (+) IRG − IRAW and
(◦) IICA − IRAW, each plotted against IPCA − IRAW.]

                                        (Lyu & Simoncelli, Neural Computation, to appear)
Monday, August 10, 2009                                                                                   98
ICA                           PCA   RG




                           unification as
                          Gaussianization?




Monday, August 10, 2009                                 99
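For reference, a minimal sketch of the radial Gaussianization step compared above, g(r) = Fχ⁻¹(Fr(r)), assuming whitened input and using a rank-based empirical radial CDF; both are illustrative choices rather than the authors' implementation.

```python
import numpy as np
from scipy.stats import chi

def radial_gaussianize(xwht):
    """Radially Gaussianize whitened samples (rows of an n x d array):
    map each radius r through g(r) = F_chi^{-1}(F_r(r)) so the radial
    distribution becomes chi with d degrees of freedom."""
    n, d = xwht.shape
    r = np.maximum(np.linalg.norm(xwht, axis=1), 1e-12)
    ranks = np.argsort(np.argsort(r))          # rank-based empirical CDF of radii
    g = chi(df=d).ppf((ranks + 0.5) / n)       # target chi_d radii
    return xwht * (g / r)[:, None]             # rescale each sample's radius
```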
marginal Gaussianization

[Figure: a pointwise nonlinearity maps a coefficient with heavy-tailed marginal
density p(x) to one with a Gaussian marginal p(y).]
Monday, August 10, 2009                              100
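The pointwise map sketched in the figure is y = Φ⁻¹(Fx(x)); a minimal single-coefficient version, again with a rank-based empirical CDF standing in for Fx:

```python
import numpy as np
from scipy.stats import norm

def marginal_gaussianize(x):
    """Map samples of one coefficient to a standard normal marginal via
    y = Phi^{-1}(F_x(x)), with F_x estimated by ranks (illustrative)."""
    x = np.asarray(x).ravel()
    u = (np.argsort(np.argsort(x)) + 0.5) / x.size   # empirical CDF values in (0, 1)
    return norm.ppf(u)
```

Applying this to each coefficient independently Gaussianizes the marginals, but unlike radial Gaussianization it leaves the joint (higher-order) dependencies in place.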
summary

                                     sparsity
                                       higher-order
                                       dependency


               band-pass                                 LTF

                      ICA/wavelet                     elliptical model


                            RG

                                    Not enough!
Monday, August 10, 2009                                                  101
Joint densities

[Figure: empirical joint distributions of wavelet coefficients for different pairs
of basis functions, for a single image of a New York City street scene. Top row:
joint distributions as contour plots, with lines drawn at equal intervals of log
probability; the three leftmost pairs share scale and orientation but differ in
spatial offset (adjacent, near, far), the fourth pair is at adjacent scales, and
the rightmost pair is at orthogonal orientations. Bottom row: corresponding
conditional distributions, with brightness proportional to frequency of occurrence
and each column independently rescaled to fill the full intensity range.]

             • Nearby: densities are approximately circular/elliptical
             • Distant: densities are approximately factorial

                                           [Simoncelli, ‘97; Wainwright&Simoncelli, ‘99]
Monday, August 10, 2009                                                                                                                     102
extended models
                    • independent subspace and topographical ICA
                      [Hoyer & Hyvärinen, 2001, 2003; Karklin & Lewicki
                      2005]
                    • adaptive covariance structures [Hammond &
                      Simoncelli, 2006; Guerrero-Colon et al. 2008;
                      Karklin & Lewicki 2009]
                    • products of Student-t experts [Osindero et al. 2003]
                    • fields of experts [Roth & Black, 2005]
                    • trees and fields of GSMs [Wainwright & Simoncelli,
                      2003; Lyu & Simoncelli, 2008]
                    • implicit MRF model [Lyu 2009]
Monday, August 10, 2009                                                 103
[Figure: a sample z field and the corresponding x field of the FoGSM model.]

                           FoGSM:  x = u ⊗ √z   (⊗ : element-wise product)
                           • u : zero-mean homogeneous Gauss MRF
                           • z : exponentiated homogeneous Gauss MRF
                           • x|z : inhomogeneous Gauss MRF
                           • x / √z (element-wise) : homogeneous Gauss MRF
                           • marginal distribution is GSM
                           • generative model: efficient sampling

Monday, August 10, 2009                                                    104
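A minimal sketch of the efficient-sampling step, with both homogeneous Gaussian fields drawn by shaping white noise in the Fourier domain and combined as x = u ⊗ √z. The spectral kernel and the two correlation scales are placeholder assumptions, not the covariances used in the paper.

```python
import numpy as np

def sample_stationary_gauss(shape, scale, rng):
    """Sample a zero-mean homogeneous (stationary) Gaussian field by
    shaping white noise in the Fourier domain (kernel is a placeholder)."""
    fy = np.fft.fftfreq(shape[0])[:, None]
    fx = np.fft.fftfreq(shape[1])[None, :]
    amp = 1.0 / np.sqrt(fx**2 + fy**2 + (1.0 / scale) ** 2)   # smooth power spectrum
    w = np.fft.fft2(rng.standard_normal(shape))
    f = np.real(np.fft.ifft2(amp * w))
    return f / f.std()

rng = np.random.default_rng(0)
u = sample_stationary_gauss((256, 256), scale=4, rng=rng)       # u field
logz = sample_stationary_gauss((256, 256), scale=32, rng=rng)   # log z varies slowly
x = u * np.sqrt(np.exp(logz))                                   # x = u (element-wise) sqrt(z)
```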
[Figure: sample x, u, and log z fields from the FoGSM model.]




Monday, August 10, 2009                   105
[Figure: FoGSM simulations for the Barbara, boat, and house images.
Top (marginal): subband marginal densities of the image, of a FoGSM sample,
and of a Gaussian (······). Bottom (joint): pairwise densities at spatial
separations ∆ = 1, 8, 32 and across orientation and scale, for the image
subband (top row) and the FoGSM simulation (bottom row).]
Monday, August 10, 2009                                                                                 106
[Figure: denoising performance on Lena and Boats. Plotted are differences in
PSNR (∆ PSNR) between FoGSM and competing methods (BM3D, kSVD, BLS-GSM [17],
and FoE [27]) for different input noise levels σ; PSNR values for the competing
methods were taken from the corresponding publications.]

                                                          [Lyu&Simoncelli, PAMI 08]

       peak-signal-to-noise-ratio (PSNR):

       PSNR = 20 log₁₀ ( 255 / √( (1/N) Σᵢ,ⱼ (I_original(i, j) − I_denoised(i, j))² ) )
Monday, August 10, 2009                                                                                                107
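The PSNR above as a small helper, written as a direct transcription of the formula:

```python
import numpy as np

def psnr(original, denoised, peak=255.0):
    """Peak signal-to-noise ratio in dB: 20 * log10(peak / RMSE)."""
    err = original.astype(float) - denoised.astype(float)
    rmse = np.sqrt(np.mean(err ** 2))      # root of the mean squared pixel error
    return 20.0 * np.log10(peak / rmse)
```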
[Figure: original image; noisy image (σ = 25, 14.15dB); matlab wiener2 (27.19dB);
FoGSM (30.02dB).]




Monday, August 10, 2009                                                           108
[Figure: original image; noisy image (σ = 100, 8.13dB); matlab wiener2 (18.38dB);
FoGSM (23.01dB).]
Monday, August 10, 2009                                                          109
pairwise conditional density

[Figure: conditional density p(x₂|x₁) of a pair of band-pass coefficients,
with overlaid curves E(x₂|x₁) and E(x₂|x₁) ± std(x₂|x₁); the conditional
spread grows with |x₁|, producing the “bow-tie” shape.]

                                           [Buccigrossi & Simoncelli, 97]

Monday, August 10, 2009                                                        110
pairwise conditional density

[Figure: the same bow-tie conditional density, summarized by its first two
conditional moments.]

                                  E(x₂|x₁) ≈ a x₁
                                var(x₂|x₁) ≈ b + c x₁²
Monday, August 10, 2009                                                            111
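These two moment approximations are easy to check on synthetic data; a minimal sketch that draws a correlated GSM pair and prints the conditional mean and spread as a function of x₁ (the multiplier distribution and the correlation are arbitrary illustrative choices):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200_000
z = np.exp(rng.standard_normal(n))                       # common multiplier (GSM)
u = rng.multivariate_normal([0, 0], [[1, 0.6], [0.6, 1]], size=n)
x1, x2 = (u * np.sqrt(z)[:, None]).T                     # GSM pair: x = u * sqrt(z)

bins = np.linspace(-8, 8, 33)
idx = np.digitize(x1, bins)
for k in np.unique(idx)[1:-1]:                           # skip the outermost bins
    sel = idx == k
    if sel.sum() > 500:                                  # only well-populated bins
        c = 0.5 * (bins[k - 1] + bins[k])
        print(f"x1≈{c:+.1f}  E(x2|x1)={x2[sel].mean():+.3f}  std={x2[sel].std():.3f}")
```

The printed conditional mean grows roughly linearly in x₁, while the conditional spread grows with |x₁|, the bow-tie behavior sketched above.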
conditional density

             µᵢ  = E(xᵢ | xⱼ, j∈N(i))   = Σ_{j∈N(i)} aⱼ xⱼ

             σᵢ² = var(xᵢ | xⱼ, j∈N(i)) = b + Σ_{j∈N(i)} cⱼ xⱼ²

       • maxEnt conditional density

             p(xᵢ | xⱼ, j∈N(i)) = (1/√(2πσᵢ²)) exp( −(xᵢ − µᵢ)² / (2σᵢ²) )

         - singleton conditionals
         - the joint MRF density is determined by all singletons
           (Brook’s lemma)

Monday, August 10, 2009                                                            112
implicit MRF
                   • defined by all singletons
                   • joint density (and clique potential) is
                     implicit
                   • learning: maximum pseudo-likelihood




Monday, August 10, 2009                                        113
ICM-MAP denoising

       argmaxₓ p(x|y) = argmaxₓ p(y|x) p(x) = argmaxₓ [ log p(y|x) + log p(x) ]

       - set an initial value x⁽⁰⁾, and t = 1
       - repeat until convergence
           - repeat for all i
               - compute the current estimate of xᵢ as
                   xᵢ⁽ᵗ⁾ = argmax_{xᵢ} log p(x₁⁽ᵗ⁾, …, xᵢ₋₁⁽ᵗ⁾, xᵢ,
                                             xᵢ₊₁⁽ᵗ⁻¹⁾, …, x_d⁽ᵗ⁻¹⁾ | y)
           - t ← t + 1

Monday, August 10, 2009                                                                      114
ICM-MAP denoising

         argmax_{xᵢ} log p(y|x) + log p(x)
       = argmax_{xᵢ} log p(y|x) + log p(x₁, …, xᵢ₋₁, xᵢ, xᵢ₊₁, …, x_n)
       = argmax_{xᵢ} log p(y|x) + log p(xᵢ | xⱼ, j∈N(i)) + log p(xⱼ, j∈N(i))
                    (simplifies)   (singleton conditional)   (constant w.r.t. xᵢ,
                                                              drops out)

       local adaptive and iterative Wiener filtering:

         xᵢ = (σ_w² σᵢ²)/(σ_w² + σᵢ²) · ( yᵢ/σ_w² + µᵢ/σᵢ² − Σ_{j≠i} wᵢⱼ (xⱼ − yⱼ) )

Monday, August 10, 2009                                                                         115
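A minimal sketch of the resulting loop: both conditional-moment maps are convolutions, and each sweep applies the closed-form Gaussian MAP update. As simplifying assumptions, the kernels a, c and constants b, σ_w are taken as given (the paper learns them by maximum pseudo-likelihood), the correction term Σ wᵢⱼ(xⱼ − yⱼ) is dropped, and all pixels are updated in parallel rather than sequentially.

```python
import numpy as np
from scipy.ndimage import convolve

def icm_denoise(y, a_kernel, c_kernel, b, sigma_w, n_iter=10):
    """ICM-style denoising with a local adaptive Wiener update.
    a_kernel/c_kernel: neighborhood weights with a zero center tap
    (neighbors only); b, sigma_w: variance offset and noise std."""
    x = y.astype(float).copy()
    for _ in range(n_iter):
        mu = convolve(x, a_kernel)               # mu_i    = sum_j a_j x_j
        var = b + convolve(x ** 2, c_kernel)     # sigma_i^2 = b + sum_j c_j x_j^2
        # closed-form maximizer of log p(y_i|x_i) + log p(x_i | neighbors):
        # precision-weighted combination of the observation and the prediction
        x = (var * y + sigma_w ** 2 * mu) / (var + sigma_w ** 2)
    return x
```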
summary

                                observations



               representation                  model



                                applications




Monday, August 10, 2009                                116
what needs to be done
                    • inhomogeneous structures
                      - structural (edges, contours, etc.)
                      - textural (grass, leaves, etc.)
                      - smooth (fog, sky, etc.)
                    • local orientations and relative phases

                           holy grail: a comprehensive model &
                           representation that captures all these
                           variations

Monday, August 10, 2009                                          117
big question marks
                    • what are natural images, anyway?




                    • ironically, white noise is “natural”, since it
                      is the result of cosmic radiation
                    • naturalness is subjective
Monday, August 10, 2009                                          118
peeling the onion



                                                    images of same
                                                   second order stats
                                         natural
                                         images                         images of same
                                                                         marginal stats



                             images of same
                            higher order stats




      all possible images

Monday, August 10, 2009                                                                   119
resources
                   • D. L. Ruderman. The statistics of natural images. Network:
                     Computation in Neural Systems, 5:517–548, 1996.
                   • E. P. Simoncelli and B. Olshausen. Natural image statistics and
                     neural representation. Annual Review of Neuroscience, 24:1193–
                     1216, 2001.
                   • S.-C. Zhu. Statistical modeling and conceptualization of visual
                     patterns. IEEE Trans PAMI, 25(6), 2003
                   • A. Srivastava, A. B. Lee, E. P. Simoncelli, and S.-C. Zhu. On
                     advances in statistical modeling of natural images. J. Math. Imaging
                     and Vision, 18(1):17–33, 2003.
                   • E. P. Simoncelli. Statistical modeling of photographic images. In
                     Handbook of Image and Video Processing, 431–441. Academic
                     Press, 2005.
                   • A. Hyvärinen, J. Hurri, and P. O. Hoyer. Natural Image Statistics:
                     A probabilistic approach to early computational vision. Springer, 2009.


Monday, August 10, 2009                                                                        120
thank you




Monday, August 10, 2009               121
