Sensory Coding & Hierarchical Representations

Michael S. Lewicki

Computer Science Department &
Center for the Neural Basis of Cognition
Carnegie Mellon University
[Figure: the visual hierarchy: retina → LGN → V1 → V2 → V4 → "IT"]
Fig. 1 (from Hegdé and Felleman, 2007). Anatomical and functional hierarchies in the macaque visual system. The human visual system (not shown) is believed to be roughly similar. A, A schematic summary of the laminar patterns of feed-forward (or ascending) and feedback (or descending) connections for visual area V1. The laminar patterns vary somewhat from one visual area to the next, but in general the connections are complementary, so that the ascending connections terminate in the granular layer (layer 4) and the descending connections avoid it. The connections are generally reciprocal, in that an area that sends feed-forward connections to another area also receives feedback connections from it. The visual anatomical hierarchy is defined based on, among other things, the laminar patterns of these interconnections among the various areas. See text for details. B, A simplified version of the visual anatomical hierarchy in the macaque monkey. For the complete version, see Felleman and Van Essen (1991). See text for additional details. AIT = anterior inferotemporal; LGN = lateral geniculate nucleus; LIP = lateral intraparietal; MT = middle temporal; MST = medial superior temporal; PIT = posterior inferotemporal; V1 = visual area 1; V2 = visual area 2; V4 = visual area 4; VIP = ventral intraparietal. C, A model of hierarchical processing of visual information proposed by David Marr (1982). D, A schematic illustration of the presumed parallels between the anatomical and functional hierarchies. It is widely presumed not only that visual processing is hierarchical but also that the functional hierarchy parallels the anatomical one.
[Figure from Palmer and Rock, 1994]
V1 simple cells

[Figure: V1 simple-cell receptive fields]

Hubel and Wiesel, 1959; DeAngelis et al., 1995
Out of the retina

    •    > 23 distinct neural pathways, with no simple functional division
    •    Suprachiasmatic nucleus: circadian rhythm
    •    Accessory optic system: stabilizes the retinal image
    •    Superior colliculus: integrates visual and auditory information with head movements, directs the eyes
    •    Pretectum: plays a role in adjusting pupil size and tracking large moving objects
    •    Pregeniculate: cells responsive to ambient light
    •    Lateral geniculate nucleus (LGN): main "relay" to visual cortex; contains 6 distinct layers, each with 2 sublayers. Its organization is very complex.
V1 simple cell integration of LGN cells

[Figure: receptive fields of LGN cells aligned along an axis converge onto a V1 simple cell, producing an oriented receptive field]

Hubel and Wiesel, 1963
Hyper-column model




                          Hubel and Wiesel, 1978


Anatomical circuitry in V1

Figure 1 (from Callaway, 1998): Contributions of individual neurons to local excitatory connections between cortical layers. From left to right, typical spiny neurons in layers 4C, 2–4B, 5, and 6 are shown. Dendritic arbors are illustrated with thick lines and axonal arbors with finer lines. Each cell projects axons specifically to only a subset of the layers. This simplified diagram focuses on connections between layers 2–4B, 4C, 5, and 6 and does not incorporate details of circuitry specific for subdivisions of these layers. A model of the interactions between these neurons is shown in Figure 2.
Where is this headed?

[Drawing by Ramón y Cajal]
A wing would be a most mystifying structure
if one did not know that birds flew.

                            Horace Barlow, 1961




An algorithm is likely to be understood more
readily by understanding the nature of the
problem being solved than by examining the
mechanism in which it is embodied.

                              David Marr, 1982
A theoretical approach

    •    Look at the system from a functional perspective:
         What problems does it need to solve?


    •    Abstract from the details:
         Make predictions from theoretical principles.


    •    Models are bottom-up; theories are top-down.




                     What are the relevant computational principles?




Representing structure in natural images

Need to describe all image structure in the scene.
What representation is best?

image from Field (1994)
V1 simple cells

[Figure: V1 simple-cell receptive fields]

Hubel and Wiesel, 1959; DeAngelis et al., 1995
Oriented Gabor models of individual simple cells




                             figure from Daugman, 1990; data from Jones and Palmer, 1987


2D Gabor wavelet captures spatial structure



                                     •   2D Gabor functions


                                     •   Wavelet basis generated by
                                         dilations, translations, and
                                         rotations of a single basis function


                                     •   Can also control phase and
                                         aspect ratio


                                     •   (drifting) Gabor functions are what
                                         the eye “sees best”




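To make the parameterization concrete, here is a minimal numpy sketch of a 2D Gabor function; the function and parameter names (wavelength, theta, phase, sigma, aspect, center) are illustrative, not taken from the slides.

```python
import numpy as np

def gabor_2d(size=32, wavelength=8.0, theta=0.0, phase=0.0,
             sigma=4.0, aspect=0.5, center=(0.0, 0.0)):
    """Sample a 2D Gabor function on a size x size grid.

    wavelength : period of the sinusoidal carrier (pixels)
    theta      : orientation of the carrier (radians)
    phase      : phase offset of the carrier
    sigma      : std. dev. of the Gaussian envelope along x'
    aspect     : envelope aspect ratio (y' relative to x')
    center     : translation of the envelope
    """
    half = size // 2
    y, x = np.mgrid[-half:half, -half:half].astype(float)
    x, y = x - center[0], y - center[1]
    # rotate coordinates so the carrier runs along x'
    xp = x * np.cos(theta) + y * np.sin(theta)
    yp = -x * np.sin(theta) + y * np.cos(theta)
    envelope = np.exp(-(xp**2 + (aspect * yp)**2) / (2.0 * sigma**2))
    carrier = np.cos(2.0 * np.pi * xp / wavelength + phase)
    return envelope * carrier

# a small "wavelet family": dilations, translations, and rotations of one function
family = [gabor_2d(wavelength=w, theta=t)
          for w in (4.0, 8.0, 16.0)
          for t in np.linspace(0, np.pi, 4, endpoint=False)]
```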
Recoding with Gabor functions

Pixel entropy = 7.57 bits
Recoding with 2D Gabor functions: coefficient entropy = 2.55 bits
How is the V1 population organized?




A general approach to coding: redundancy reduction

[Figure: correlation of adjacent pixels in natural images; the values of neighboring pixels are highly correlated]

Redundancy reduction is equivalent to efficient coding.
Describing signals with a simple statistical model

Principle
    Good codes capture the statistical distribution of sensory patterns.

How do we describe the distribution?

• Goal is to encode the data to the desired precision:

      x = a_1 s_1 + a_2 s_2 + · · · + a_L s_L + ε
        = As + ε

• Can solve for the coefficients in the no-noise case:

      ŝ = A⁻¹ x
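A minimal numerical illustration of the generative model and of the no-noise inference step, using an arbitrary random basis A (all names are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
L = 16                                # signal dimension = number of basis functions
A = rng.standard_normal((L, L))       # basis functions are the columns of A
s = rng.laplace(size=L)               # sparse-ish coefficients
x = A @ s                             # noiseless data: x = A s

s_hat = np.linalg.solve(A, x)         # equivalent to A^{-1} x when A is square and invertible
assert np.allclose(s_hat, s)
```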
An algorithm for deriving efficient linear codes: ICA

Learning objective:
    maximize coding efficiency, i.e. maximize P(x|A) over A.

Probability of the pattern ensemble:

      P(x_1, x_2, ..., x_N | A) = Π_k P(x_k | A)

To obtain P(x|A), marginalize over s:

      P(x|A) = ∫ ds P(x|A, s) P(s) = P(s) / |det A|

Using independent component analysis (ICA) to optimize A:

      ΔA ∝ A Aᵀ (∂/∂A) log P(x|A) = −A(z sᵀ + I),    where z = (log P(s))′

This learning rule:
• learns the features that capture the most structure
• optimizes the efficiency of the code

What should we use for P(s)?
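A sketch of this natural-gradient update, assuming a Laplacian prior P(s) ∝ exp(−|s|) so that z = (log P(s))′ = −sign(s); the learning rate, batch size, and dimensions are illustrative choices, not values from the slides:

```python
import numpy as np

rng = np.random.default_rng(0)
dim, n_samples = 4, 20000
S = rng.laplace(size=(dim, n_samples))          # independent non-Gaussian sources
A_true = rng.standard_normal((dim, dim))
X = A_true @ S                                  # observed mixtures, x = A s

A = np.eye(dim)                                 # initial basis estimate
lr = 0.005
for epoch in range(50):
    for i in range(0, n_samples, 100):          # small "mini-batches"
        x = X[:, i:i + 100]
        s = np.linalg.solve(A, x)               # s = A^{-1} x (no-noise inference)
        z = -np.sign(s)                         # z = d/ds log P(s) for a Laplacian prior
        grad = -A @ (z @ s.T / x.shape[1] + np.eye(dim))   # natural gradient of log P(x|A)
        A += lr * grad
# after training, the columns of A approximate the true mixing directions
# up to permutation and scale
```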
Modeling non-Gaussian distributions

• Typical coefficient distributions of natural signals are non-Gaussian.

[Figure: histograms of two coefficients, s_1 and s_2, with sharply peaked, heavy-tailed (non-Gaussian) marginals]
The generalized Gaussian distribution

      P(x|q) ∝ exp( −½ |x|^q )

• Or equivalently, an exponential power distribution (Box and Tiao, 1973):

      P(x|μ, σ, β) = (ω(β)/σ) exp( −c(β) |(x − μ)/σ|^{2/(1+β)} )

• β varies monotonically with the kurtosis, κ₂:

      β = −0.75 (κ₂ = −1.08)        β = −0.25 (κ₂ = −0.45)        β = +0.00 (Normal, κ₂ = +0.00)
      β = +0.50 (ICA tanh, κ₂ = +1.21)    β = +1.00 (Laplacian, κ₂ = +3.00)    β = +2.00 (κ₂ = +9.26)
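A small numerical check of the q-parameterization above: the density exp(−½|x|^q) is normalized with the gamma function, and its excess kurtosis grows as q falls below 2 (q = 2 is Gaussian, q = 1 is Laplacian). The helper names are illustrative:

```python
import numpy as np
from math import gamma

def gen_gaussian_pdf(x, q):
    """Density proportional to exp(-0.5 * |x|**q), normalized over the real line."""
    z = 2.0 ** (1.0 / q + 1.0) * gamma(1.0 / q) / q     # normalizing constant
    return np.exp(-0.5 * np.abs(x) ** q) / z

def excess_kurtosis(q, grid=np.linspace(-100, 100, 400001)):
    """Numerical excess kurtosis of the generalized Gaussian with exponent q."""
    p = gen_gaussian_pdf(grid, q)
    m2 = np.trapz(grid ** 2 * p, grid)
    m4 = np.trapz(grid ** 4 * p, grid)
    return m4 / m2 ** 2 - 3.0

print(excess_kurtosis(2.0))   # ~ 0     (Gaussian)
print(excess_kurtosis(1.0))   # ~ 3     (Laplacian)
print(excess_kurtosis(4.0))   # ~ -0.81 (flatter than Gaussian)
```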
Modeling Gaussian distributions with PCA

•    Principal component analysis (PCA) describes the principal axes of variation in the data distribution.
•    This is equivalent to fitting the data with a multivariate Gaussian.

[Figure: a Gaussian data cloud with its principal axes; the marginals of s_1 and s_2 are Gaussian]
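A minimal sketch of the PCA / Gaussian correspondence: the principal axes are the eigenvectors of the sample covariance, which is exactly the statistic a zero-mean multivariate Gaussian fit would use (the data and names are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
# correlated Gaussian data: PCA (eigenvectors of the covariance) fully describes it
cov_true = np.array([[2.0, 1.2],
                     [1.2, 1.0]])
X = rng.multivariate_normal(mean=[0.0, 0.0], cov=cov_true, size=5000)

X -= X.mean(axis=0)                       # center the data
C = X.T @ X / (len(X) - 1)                # sample covariance
eigvals, eigvecs = np.linalg.eigh(C)      # principal variances and axes
order = np.argsort(eigvals)[::-1]
eigvals, eigvecs = eigvals[order], eigvecs[:, order]
# fitting a multivariate Gaussian N(0, C) uses exactly these second-order statistics,
# which is why PCA cannot capture the non-Gaussian (e.g. sparse) structure shown above
```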
Modeling non-Gaussian distributions

•    What about non-Gaussian marginals?

[Figure: data with sparse, non-Gaussian marginals concentrated along two oblique directions]

•    How would this distribution be modeled by PCA?
Modeling non-Gaussian distributions

•    What about non-Gaussian marginals?

[Figure: the same data with the non-orthogonal ICA basis vectors overlaid]

•    How would this distribution be modeled by PCA?
•    How should the distribution be described?

The non-orthogonal ICA solution captures the non-Gaussian structure.
Efficient coding of natural images: Olshausen and Field, 1996

[Figure: patches from a natural scene provide the visual input to a set of units whose image basis functions (receptive fields) are adapted; the basis is shown before and after learning]

Network weights are adapted to maximize coding efficiency:
this minimizes redundancy and maximizes the independence of the outputs.
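A compact sketch in the spirit of the Olshausen and Field procedure, not their exact implementation: coefficients are inferred by gradient descent on a sparsity-penalized reconstruction error, and the basis is then nudged along the residual. The penalty, step sizes, and dimensions are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)
patch_dim, n_basis, lam = 64, 64, 0.1     # 8x8 patches, complete basis (sizes illustrative)
A = rng.standard_normal((patch_dim, n_basis))
A /= np.linalg.norm(A, axis=0)

def infer(x, A, n_steps=200, step=0.01):
    """Sparse coefficients by gradient descent on 0.5*||x - A s||^2 + lam * sum|s|."""
    s = np.zeros(A.shape[1])
    for _ in range(n_steps):
        residual = x - A @ s
        s += step * (A.T @ residual - lam * np.sign(s))
    return s

def learn_step(X, A, lr=0.1):
    """One pass over a batch of patches X (one patch per column): adapt the basis."""
    dA = np.zeros_like(A)
    for x in X.T:
        s = infer(x, A)
        dA += np.outer(x - A @ s, s)          # Hebbian-like update driven by the residual
    A += lr * dA / X.shape[1]
    A /= np.linalg.norm(A, axis=0)            # keep basis functions normalized
    return A

# usage: X = matrix of whitened natural-image patches, one patch per column
#   for epoch in range(n_epochs): A = learn_step(X, A)
```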
Model predicts local and global receptive field properties




         Learned basis for natural images                 Overlaid basis function properties

                                     from Lewicki and Olshausen, 1999

Algorithm selects best of many possible sensory codes

[Figure: basis functions of the learned code compared with wavelet, Haar, Gabor, Fourier, and PCA bases; from Lewicki and Olshausen, 1999]

Theoretical perspective: not edge "detectors," but an efficient code for natural images.
Comparing coding efficiency on natural images

[Figure: estimated bits per pixel on natural images for Gabor, wavelet, Haar, Fourier, PCA, and learned codes]
Responses in primary visual cortex to visual motion




                            from Wandell, 1995

Sparse coding of time-varying images (Olshausen, 2002)

[Figure: an image sequence I(x, y, t) is modeled as a sum of spatio-temporal basis functions, each convolved with a sparse time-varying coefficient signal]

      I(x, y, t) = Σ_i Σ_{t'} a_i(t') φ_i(x, y, t − t') + ε(x, y, t)
                 = Σ_i a_i(t) ∗ φ_i(x, y, t) + ε(x, y, t)
Sparse decomposition of image sequences

[Figure: coefficient rasters over time for the raw convolution outputs versus the sparse posterior maximum, shown with the input sequence and its reconstruction; from Olshausen, 2002]
Learned spatio-temporal basis functions

[Figure: each row shows one basis function as a function of time; from Olshausen, 2002]
Animated spatio-temporal basis functions

from Olshausen, 2002
Theory        - explains data from principles
              - requires idealization and abstraction

Principle     - code signals accurately and efficiently
              - adapted to the natural sensory environment

Idealization  - cell response is linear

Methodology   - information theory, natural images

Prediction    - explains individual receptive fields
              - explains population organization
How general is the efficient coding principle?



                           Can it explain auditory coding?




Limitations of the linear model

    •    linear
    •    only optimal for a signal block, not the whole signal
    •    no phase-locking
    •    the representation depends on block alignment
A continuous filterbank does not form an efficient code

[Figure: convolving a signal with a filterbank produces a dense, highly redundant set of output signals]

Goal: find a representation that is both time-relative and efficient
Efficient signal representation using time-shiftable kernels (spikes)
Smith and Lewicki (2005) Neural Comp. 17:19-45

      x(t) = Σ_{m=1}^{M} Σ_{i=1}^{n_m} s_{m,i} φ_m(t − τ_{m,i}) + ε(t)

    •    Each spike encodes the precise time and magnitude of an acoustic feature
    •    Two important theoretical abstractions for "spikes":
          - not binary
          - not probabilistic
    •    Can convert to a population of stochastic, binary spikes
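The decoding direction of this model is just a sum of amplitude-scaled, time-shifted kernels. A minimal sketch (the spike-list format is an assumption for illustration):

```python
import numpy as np

def reconstruct(spikes, kernels, n_samples):
    """Rebuild x(t) = sum_m sum_i s_{m,i} * phi_m(t - tau_{m,i}) from a spike list.

    spikes  : iterable of (kernel_index, sample_time, amplitude) triples
    kernels : list of 1-D arrays phi_m (unit norm; different lengths allowed)
    """
    x = np.zeros(n_samples)
    for m, tau, s in spikes:
        phi = kernels[m]
        end = min(tau + len(phi), n_samples)
        x[tau:end] += s * phi[:end - tau]     # place the scaled kernel at its spike time
    return x
```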
Coding audio signals with spikes

[Figure: spikegram of a sound, with each spike plotted at its time and at the center frequency (CF) of its kernel (100-5000 Hz, log axis), shown with the input waveform, the reconstruction, and the residual]
Comparing a spike code to a spectrogram

[Figure: a spike code (kernel center frequency versus time) compared with a spectrogram (frequency versus time) of the same ~800 ms utterance]

How do we compute the spikes?
"can" with filter-threshold

[Figure: spikegram of the word "can" produced by simple filter-and-threshold encoding, with the input waveform, reconstruction, and residual]
Spike Coding with Matching Pursuit

    1. convolve signal with kernels
    2. find largest peak over convolution set
    3. fit signal with kernel
    4. subtract kernel from signal, record spike, and adjust convolutions
    5. repeat . . .
    6. halt when the desired fidelity is reached (a sketch of this loop follows below)
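A compact sketch of steps 1-6, assuming unit-norm kernels shorter than the signal; step 4's incremental adjustment of the convolutions is replaced here by simply recomputing them each iteration, which is slower but equivalent:

```python
import numpy as np

def matching_pursuit(x, kernels, target_snr_db=20.0, max_spikes=10000):
    """Greedy spike coding of signal x with a set of unit-norm kernels."""
    residual = x.astype(float).copy()
    spikes = []
    signal_power = np.sum(x ** 2)
    for _ in range(max_spikes):
        # 1./2. correlate the residual with every kernel and find the largest peak
        best = None
        for m, phi in enumerate(kernels):
            c = np.correlate(residual, phi, mode='valid')   # c[tau] = <residual[tau:], phi>
            tau = int(np.argmax(np.abs(c)))
            if best is None or abs(c[tau]) > abs(best[2]):
                best = (m, tau, c[tau])
        m, tau, s = best
        # 3./4. fit and subtract the kernel, record the spike
        residual[tau:tau + len(kernels[m])] -= s * kernels[m]
        spikes.append((m, tau, s))
        # 6. halt when the desired fidelity is reached
        snr_db = 10 * np.log10(signal_power / np.sum(residual ** 2))
        if snr_db >= target_snr_db:
            break
    return spikes, residual
```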
"can" encoded at different fidelities

[Figure: spikegrams, input, reconstruction, and residual for the word "can" at four operating points]

     5 dB SNR:    36 spikes,   145 spikes/sec
    10 dB SNR:    93 spikes,   379 spikes/sec
    20 dB SNR:   391 spikes,  1700 spikes/sec
    40 dB SNR:  1285 spikes,  5238 spikes/sec
Efficient auditory coding with optimized kernel shapes
Smith and Lewicki (2006) Nature 439:978-982

What are the optimal kernel shapes?

      x(t) = Σ_{m=1}^{M} Σ_{i=1}^{n_m} s_{m,i} φ_m(t − τ_{m,i}) + ε(t)

Adapt the algorithm of Olshausen (2002).
Optimizing the probabilistic model

      x(t) = Σ_{m=1}^{M} Σ_{i=1}^{n_m} s_m^i φ_m(t − τ_m^i) + ε(t),      ε(t) ∼ N(0, σ_ε)

      p(x|Φ) = ∫ p(x|Φ, s, τ) p(s) p(τ) ds dτ  ≈  p(x|Φ, ŝ, τ̂) p(ŝ) p(τ̂)

Learning (Olshausen, 2002):

      ∂/∂φ_m log p(x|Φ) = ∂/∂φ_m [ log p(x|Φ, ŝ, τ̂) + log p(ŝ) p(τ̂) ]
                        = −(1/2σ_ε) ∂/∂φ_m [ x − Σ_{m=1}^{M} Σ_{i=1}^{n_m} ŝ_m^i φ_m(t − τ_m^i) ]²
                        = (1/σ_ε) [ x − x̂ ] Σ_i ŝ_m^i

Also adapt the kernel lengths.
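In practice the gradient above says: for every spike, accumulate the residual segment under that spike's kernel, weighted by the spike amplitude. A hedged sketch of one such update (it reuses the spike-list and reconstruction conventions from the earlier sketches; the normalization and learning rate are illustrative):

```python
import numpy as np

def kernel_gradient(x, x_hat, spikes, kernels, sigma_eps=1.0):
    """Approximate gradient of log p(x|Phi) with respect to each kernel.

    For every spike (m, tau, s), the residual segment under that kernel,
    scaled by the spike amplitude s, is accumulated into the gradient of phi_m.
    """
    residual = x - x_hat
    grads = [np.zeros_like(phi) for phi in kernels]
    for m, tau, s in spikes:
        seg = residual[tau:tau + len(kernels[m])]
        grads[m][:len(seg)] += s * seg / sigma_eps
    return grads

# usage sketch: after encoding a batch of sounds with matching pursuit,
#   for m, g in enumerate(kernel_gradient(x, x_hat, spikes, kernels)):
#       kernels[m] += lr * g
#       kernels[m] /= np.linalg.norm(kernels[m])   # keep kernels unit norm
```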
Adapting the optimal kernel shapes




Kernel functions optimized for coding speech




Quantifying coding efficiency

    1. fit the signal
    2. quantize the spike time and amplitude values
    3. prune zero values
    4. measure coding efficiency using the entropy of the quantized values
    5. reconstruct the signal using the quantized values
    6. measure fidelity using the signal-to-noise ratio (SNR) of the residual error

    •    identical procedure for other codes (e.g. Fourier and wavelet)

      x(t) = Σ_{m=1}^{M} Σ_{i=1}^{n_m} s_{m,i} φ_m(t − τ_{m,i}) + ε(t)

[Figure: spike code of the original signal, the reconstruction, and the residual error]
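A sketch of steps 2-6 above, under the simplifying assumption that the rate is estimated from the entropies of the quantized amplitudes and times of the surviving spikes; the quantization step sizes are illustrative parameters:

```python
import numpy as np

def empirical_entropy(symbols):
    """Entropy (bits per symbol) of a discrete sequence."""
    _, counts = np.unique(symbols, return_counts=True)
    p = counts / counts.sum()
    return float(-(p * np.log2(p)).sum())

def rate_and_fidelity(x, x_hat_quantized, amplitudes, times,
                      amp_step, time_step, duration_s):
    """Rate (kbps) from the entropy of quantized spike values, fidelity as SNR (dB)."""
    q_amp = np.round(np.asarray(amplitudes) / amp_step).astype(int)
    q_time = np.round(np.asarray(times) / time_step).astype(int)
    keep = q_amp != 0                                   # prune spikes quantized to zero
    n_spikes = int(keep.sum())
    bits = n_spikes * (empirical_entropy(q_amp[keep]) + empirical_entropy(q_time[keep]))
    rate_kbps = bits / duration_s / 1000.0
    snr_db = 10 * np.log10(np.sum(x ** 2) / np.sum((x - x_hat_quantized) ** 2))
    return rate_kbps, snr_db
```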
Coding efficiency curves

[Figure: rate-fidelity curves, SNR (dB) versus rate (kbps), for the adapted spike code, a gammatone spike code, and wavelet and Fourier block codes; the adapted spike code gains about +14 dB, roughly 4x more efficient]
Using efficient coding theory to make theoretical predictions

Natural sound environment:
    evolution → physiological data (auditory nerve non-linearities, filter shapes, population trends)
    theory    → optimal kernels (their properties and coding efficiency)

A simple model of auditory coding:
    sound waveform → filterbank → static nonlinearity → stochastic spiking → population spike code

Auditory revcor filters are gammatones (deBoer and deJongh, 1978; Carney and Yin, 1988).

[Figure: predicted kernel bandwidth versus center frequency for speech, compared with auditory nerve filters]

More theoretical questions:
• Why gammatones?
• Why spikes?
• How is sound coded by the spike population?

How do we develop a theory? We do not fit the data; we compare to the data only after optimizing. The prediction depends on the sound ensemble.
Learned kernels share features of auditory nerve filters




                          Auditory nerve filters              Optimized kernels
                 from Carney, McDuffy, and Shekhter, 1999    scale bar = 1 msec
Learned kernels closely match individual auditory nerve filters

For each kernel, find the closest matching auditory nerve filter in Laurel Carney's database of ~100 filters.
Learned kernels overlaid on selected auditory nerve filters




                     For almost all learned kernels there is a closely matching auditory nerve filter.



Coding of a speech consonant

[Figure: (a) input, reconstruction, and residual for a speech consonant; (b) spike code plotted by kernel center frequency, with a zoomed inset on the high-frequency burst; (c) spectrogram of the same ~60 ms segment]
How is this achieving an efficient, time-relative code?

Time-relative coding of glottal pulses

[Figure: (a) speech waveform; (b) spike code plotted by kernel center frequency over ~100 ms, with spikes aligned to the individual glottal pulses]
Theory        - explains data from principles
              - requires idealization and abstraction

Principle     - code signals accurately and efficiently
              - adapted to the natural sensory environment

Idealization  - analog spikes

Methodology   - information theory, natural sounds
              - optimization

Prediction    - explains individual receptive fields
              - explains population organization
A gap in the theory?

[Figure: a center-surround retinal ganglion cell receptive field (from Hubel, 1995) compared with the learned, Gabor-like basis functions]
Redundancy reduction for noisy channels (Atick, 1992)

      s → [input channel: x = s + ν] → x → [recoding: A] → Ax → [output channel: y = Ax + δ] → y

Mutual information:

      I(x, s) = Σ_{s,x} P(x, s) log₂ [ P(x, s) / (P(s) P(x)) ]

I(x, s) = 0 iff P(x, s) = P(x) P(s), i.e. x and s are independent.
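The definition above can be evaluated directly from a joint probability table. A minimal sketch with two sanity checks (independent variables give 0 bits; perfectly coupled binary variables give 1 bit):

```python
import numpy as np

def mutual_information(joint):
    """I(x, s) in bits from a joint probability table P(x, s) (rows: x, columns: s)."""
    joint = np.asarray(joint, dtype=float)
    joint = joint / joint.sum()
    px = joint.sum(axis=1, keepdims=True)
    ps = joint.sum(axis=0, keepdims=True)
    nz = joint > 0
    return float((joint[nz] * np.log2(joint[nz] / (px @ ps)[nz])).sum())

# independent variables: zero mutual information
print(mutual_information(np.outer([0.5, 0.5], [0.25, 0.75])))   # 0.0
# perfectly correlated binary variables: 1 bit
print(mutual_information(np.diag([0.5, 0.5])))                  # 1.0
```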
Profiles of optimal filters

    •    high SNR
          -  reduce redundancy
          -  center-surround filter

    •    low SNR
          -  average
          -  low-pass filter

    •    matches the behavior of retinal ganglion cells
An observation: Contrast sensitivity of ganglion cells




Luminance decreases by one log unit for each successively lower curve.
Natural images have a 1/f amplitude spectrum




                                                       Field, 1987




Components of predicted filters

[Figure: the optimal filter is the product of a whitening filter and a low-pass filter; from Atick, 1992]
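A toy sketch of that decomposition, assuming a 1/f amplitude spectrum and an exponential low-pass term; the cutoff and exponent are illustrative, not Atick's fitted values:

```python
import numpy as np

f = np.linspace(0.1, 60.0, 600)           # spatial frequency (cycles/deg), illustrative range
amplitude_spectrum = 1.0 / f              # natural images: amplitude falls roughly as 1/f
whitening = 1.0 / amplitude_spectrum      # = f, flattens (whitens) the input spectrum
low_pass = np.exp(-(f / 22.0) ** 1.4)     # illustrative low-pass term suppressing noisy high frequencies
optimal_filter = whitening * low_pass     # band-pass profile peaking at intermediate frequencies

print(f[np.argmax(optimal_filter)])       # frequency of peak sensitivity for these parameters
```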
Predicted contrast sensitivity functions match neural data




                                                             from Atick, 1992




Robust coding of natural images

    •    Theory refined:
          -    the image is noisy and blurred
          -    the neural population size changes
          -    neurons are noisy

[Figure: (a) undistorted image; (b) the retinal image at the fovea; (c) the retinal image at 40 degrees eccentricity, shown as intensity profiles over a few arc minutes of visual angle; from Doi and Lewicki, 2006, after Hubel, 1995]
Problem 1: Real neurons are "noisy"

Estimates of neural information capacity:

      system (area)                               stimulus              bits / sec       bits / spike
      fly visual (H1)                             motion                        64                 ~1
      monkey visual (MT)                          motion                  5.5 - 12            0.6 - 1.5
      frog auditory (auditory nerve)              noise & call           46 & 133             1.4 & 7.8
      salamander visual (ganglion cells)          rand. spots               3.2                    1.6
      cricket cercal (sensory afferent)           mech. motion              294                    3.2
      cricket cercal (sensory afferent)           wind noise              75 - 220            0.6 - 3.1
      cricket cercal (10-2 and 10-3)              wind noise               8 - 80              avg. = 1
      electric fish (P-afferent)                  amp. modulation         0 - 200               0 - 1.2

      After Borst and Theunissen, 1999

Limited capacity: neural codes need to be robust.
Traditional codes are not robust

[Figure: the original sensory input is encoded by a population of neurons using a 1x efficient code; adding noise equivalent to 1 bit of precision to the encoding gives a reconstruction with 34% error]
Redundancy plays an important role in neural coding

Response of salamander retinal ganglion cells to natural movies:

[Figure: fractional redundancy between pairs of ganglion cells as a function of cell-pair distance (µm); from Puchalla et al., 2005]

The redundancy of a cell pair was defined from the mutual information that the pair conveyed about the stimulus and the information conveyed by each cell alone, expressed as a fraction of the minimum of the individual cells' information, min{I(Ra;S), I(Rb;S)}. The redundancy can be no greater than 1; it is 0 when the two cells convey independent information about the stimulus, and large values mean that one cell's information is largely a subset of the other's.
Robust coding (Doi and Lewicki, 2005, 2006)

Robust coding deliberately introduces redundancy into the code so that coding precision can be limited: the assumed noise is separated from the representation and, unlike PCA or ICA, the apparent dimensionality can be increased (instead of reduced). Robust coding should make optimal use of an arbitrary number of coding units. Here we present the optimal linear encoder and decoder when the coefficients are noisy, with respect to the number of units and to different data and noise conditions.

Encoder/decoder model (limited precision is modeled with additive channel noise):

    data x → encoder W → noiseless representation u → (+ channel noise n) → noisy representation r → decoder A → reconstruction x̂

The data is assumed to be N-dimensional with zero mean and covariance Σx, with W ∈ R^(M×N) and A ∈ R^(N×M). For each data point x, its representation r is obtained through the encoding matrix W and perturbed by additive channel noise n ~ N(0, σn² I_M):

    r = Wx + n = u + n

The rows of W are the encoding vectors. The reconstruction from the noisy representation uses the decoding matrix A, whose columns are the decoding vectors:

    x̂ = Ar = AWx + An

Caveat: the model is linear, but it can be evaluated for different noise levels. (A numerical sketch of this model follows below.)

NIPS 2007 Tutorial                                    88                           Michael S. Lewicki ! Carnegie Mellon
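As a concrete illustration of the model on this slide, here is a minimal numerical sketch of the linear encoder/decoder with additive channel noise. The dimensions, noise level, and random matrices are illustrative assumptions; the learned, optimal W and A come later in the tutorial.

```python
import numpy as np

rng = np.random.default_rng(0)

N, M = 16, 32                 # data dimension N, number of coding units M (M > N: overcomplete)
sigma_n = 0.5                 # channel noise standard deviation

W = rng.standard_normal((M, N)) / np.sqrt(N)   # encoding matrix (rows = encoding vectors)
A = rng.standard_normal((N, M)) / np.sqrt(M)   # decoding matrix (columns = decoding vectors)

x = rng.standard_normal(N)                     # a data point
u = W @ x                                      # noiseless representation
r = u + sigma_n * rng.standard_normal(M)       # noisy representation: r = Wx + n
x_hat = A @ r                                  # reconstruction: x_hat = Ar = AWx + An

print(np.mean((x - x_hat) ** 2))               # error for this (untrained) encoder/decoder pair
```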
Generalizing the model: sensory noise and optical blur

(a) Undistorted image    (b) Fovea (retinal image)    (c) 40 degrees eccentricity
[Figure: intensity profiles over visual angle (arc min) for each case]

  image s → optical blur H → observation x (+ sensory noise ν) → encoder W → representation r (+ channel noise δ) → decoder A (only implicit) → reconstruction ŝ



                     Can also add sparseness and resource constraints

NIPS 2007 Tutorial                                                 89                           Michael S. Lewicki ! Carnegie Mellon
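A hedged sketch of the generalized pipeline above (optical blur, sensory noise, channel noise). A simple Gaussian blur stands in for the optical transfer function H, the decoder is a naive pseudoinverse, and all parameter values are illustrative assumptions rather than the ones used in the tutorial:

```python
import numpy as np
from scipy.ndimage import gaussian_filter

rng = np.random.default_rng(1)

def forward_model(s, W, A, blur_sigma=1.0, sensory_std=0.05, channel_std=0.5):
    """image s -> blur H -> observation x (+ sensory noise) -> W -> r (+ channel noise) -> A -> s_hat."""
    x = gaussian_filter(s, blur_sigma).ravel()                  # optical blur H applied to the image
    x = x + sensory_std * rng.standard_normal(x.shape)          # sensory noise (nu)
    r = W @ x + channel_std * rng.standard_normal(W.shape[0])   # channel noise (delta)
    return (A @ r).reshape(s.shape)                             # reconstruction s_hat

s = rng.standard_normal((8, 8))          # a small image patch (illustrative)
N = s.size
M = 2 * N                                # 2x overcomplete code
W = rng.standard_normal((M, N)) / np.sqrt(N)
A = np.linalg.pinv(W)                    # naive decoder; the optimal A is given on a later slide
s_hat = forward_model(s, W, A)
```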
Robust coding is distinct from “reading” noisy populations

[Figure: the standard population coding model (from Pouget et al., 2001). a | Bell-shaped tuning curves to direction for 16 neurons (activity in spikes/s vs. direction in deg). b | A population pattern of activity across 64 neurons with bell-shaped tuning curves in response to an object moving at –40°; the activity of each cell is plotted at the location of its preferred direction, and the overall activity looks like a “noisy” hill centred on the stimulus direction. c | Population vector decoding fits a cosine function to the observed activity and uses the peak of the cosine, ŝ, as the estimate of the encoded direction. d | Maximum likelihood fits a template derived from the tuning curves of the cells, obtained from the noiseless (or average) population activity.]

In that framework the tuning curves are given and the question is how to read out (“decode”) the stimulus from the noisy population response. Here, we want to learn an optimal image code using noisy neurons.

NIPS 2007 Tutorial                                    90                           Michael S. Lewicki ! Carnegie Mellon
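For contrast with learning the code, here is a minimal sketch of “reading” a fixed noisy population with the population vector (panel c of the figure): bell-shaped tuning curves, Poisson spiking noise, and a vector average of preferred directions weighted by activity. The tuning-curve form and all parameter values are illustrative assumptions, not Pouget et al.'s exact model.

```python
import numpy as np

rng = np.random.default_rng(2)

n_cells = 64
pref = np.linspace(-180, 180, n_cells, endpoint=False)    # preferred directions (deg)

def tuning(s, pref, r_max=80.0, width=40.0):
    """Bell-shaped (circular Gaussian) tuning curve: mean rate for stimulus direction s (deg)."""
    d = np.deg2rad(s - pref)
    return r_max * np.exp((np.cos(d) - 1.0) * (180.0 / (np.pi * width)) ** 2)

s_true = -40.0
rates = rng.poisson(tuning(s_true, pref))                  # noisy population response

# Population vector decoding: activity-weighted vector average of preferred directions
vx = np.sum(rates * np.cos(np.deg2rad(pref)))
vy = np.sum(rates * np.sin(np.deg2rad(pref)))
s_hat = np.rad2deg(np.arctan2(vy, vx))
print(s_true, round(s_hat, 1))
```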
How do we learn robust codes?

Model (as before): image s → optical blur H → observation x (+ sensory noise ν) → encoder W → representation r (+ channel noise δ) → decoder A → reconstruction ŝ

Objective:

!   Find W and A that minimize the reconstruction error.

•   Channel capacity of the ith neuron:

        C_i = (1/2) ln(SNR_i + 1)

•   To limit capacity, fix the coefficient signal-to-noise ratio:

        SNR_i = ⟨u_i²⟩ / σ_n²,   with ⟨u_i²⟩ = σ_u²  for all units

Now robust coding is formulated as a constrained optimization problem:

        Cost = (Error) + λ (Var. Const.),   where (Var. Const.) = Σ_{i=1..M} ln(⟨u_i²⟩ / σ_u²)

        ∆W ∝ −∂(Cost)/∂W = 2 Aᵀ(I_N − AW)Σx − λ (variance-constraint term)

        A = Σx Wᵀ (σ_n² I_M + W Σx Wᵀ)⁻¹   ⇔   ∂(Cost)/∂A = 0

(A code sketch of these updates follows this slide.)
    NIPS 2007 Tutorial                                         91                         Michael S. Lewicki ! Carnegie Mellon
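A minimal sketch (not the authors' code) of the constrained optimization above: gradient steps on W for the reconstruction-error term, with each row rescaled so that the per-unit SNR stays fixed, and the closed-form optimal decoder A = Σx Wᵀ(σ_n² I + WΣxWᵀ)⁻¹ recomputed at each step. The data statistics, learning rate, and iteration count are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(3)
N, M = 8, 16
snr = 3.0          # fixed per-unit SNR; capacity C = 0.5*ln(SNR+1) = 1 bit for SNR = 3
sigma_n = 1.0      # channel noise standard deviation

X = rng.standard_normal((5000, N)) @ rng.standard_normal((N, N))   # correlated toy "data"
Sigma_x = np.cov(X, rowvar=False)

W = 0.1 * rng.standard_normal((M, N))
for step in range(500):
    # Enforce the capacity constraint: rescale each row so that <u_i^2> / sigma_n^2 = SNR
    var_u = np.einsum('ij,jk,ik->i', W, Sigma_x, W)        # diag(W Sigma_x W^T)
    W *= np.sqrt(snr * sigma_n**2 / var_u)[:, None]
    # Optimal linear decoder for the current encoder (from the slide)
    A = Sigma_x @ W.T @ np.linalg.inv(sigma_n**2 * np.eye(M) + W @ Sigma_x @ W.T)
    # Gradient step on the reconstruction-error term (the variance constraint is handled by rescaling)
    W += 0.01 * (A.T @ (np.eye(N) - A @ W) @ Sigma_x)

# Expected reconstruction error: tr((I - AW) Sigma_x (I - AW)^T) + sigma_n^2 tr(A A^T)
err = np.trace((np.eye(N) - A @ W) @ Sigma_x @ (np.eye(N) - A @ W).T) \
      + sigma_n**2 * np.trace(A @ A.T)
print(err / np.trace(Sigma_x))    # normalized mean squared reconstruction error
```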
Robust coding of natural images

[Figure: encoding neurons / sensory input. Add noise equivalent to 1 bit precision: original image vs. 1x efficient coding reconstruction at 1 bit precision (34% error)]




NIPS 2007 Tutorial                                 92                     Michael S. Lewicki ! Carnegie Mellon
Robust coding of natural images

[Figure: encoding neurons / sensory input; weights adapted for optimal robustness. Original image vs. 1x robust coding reconstruction at 1 bit precision (3.8% error)]




NIPS 2007 Tutorial                                93                      Michael S. Lewicki ! Carnegie Mellon
Reconstruction improves by adding neurons

[Figure: encoding neurons / sensory input; weights adapted for optimal robustness. Original image vs. 8x robust coding reconstruction at 1 bit precision (reconstruction error: 0.6%)]




NIPS 2007 Tutorial                                94                  Michael S. Lewicki ! Carnegie Mellon
Adding precision to 1x robust coding

                     original images   1 bit: 12.5% error   2 bit: 3.1% error




NIPS 2007 Tutorial                             95                Michael S. Lewicki ! Carnegie Mellon
Can derive minimum theoretical average error bound

General case: with Σx = EDEᵀ the eigenvalue decomposition of the data covariance, λi = D_ii its eigenvalues, N the input dimensionality, and M the number of coding units, the minimum average error (for sufficiently high SNR) is

        E = 1 / (M/N · SNR + 1) · (1/N) Σ_{i=1..N} λi

Closed-form solutions were also derived for the 2-D isotropic and 2-D anisotropic cases; below a critical SNR (SNR ≤ SNR_c) the solution takes a different form.

Algorithm achieves the theoretical lower bound (512×512 test image):

        # of coding units      Results      Bound
        0.5x                   19.9%        20.3%
        1x                     12.5%        12.4%
        8x                      2.0%         2.0%

Balcan, Doi, and Lewicki, 2007; Balcan and Lewicki, 2007

NIPS 2007 Tutorial                                    96                           Michael S. Lewicki ! Carnegie Mellon
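The general bound on this slide can be evaluated directly from the data covariance. A short sketch under the notation above (λi the eigenvalues of Σx, M coding units, N input dimensions, per-unit SNR fixed by the capacity constraint); the toy covariance and SNR value are illustrative, and the expression applies in the high-SNR regime reported on the slide:

```python
import numpy as np

def robust_coding_error_bound(Sigma_x, M, snr):
    """Minimum average reconstruction error: E = (1/N) sum_i lambda_i / (M/N * SNR + 1)."""
    N = Sigma_x.shape[0]
    lam = np.linalg.eigvalsh(Sigma_x)          # eigenvalues of the data covariance
    return lam.mean() / ((M / N) * snr + 1.0)

# Illustrative check of the trend in the table: more coding units -> lower bound
Sigma_x = np.diag([4.0, 2.0, 1.0, 0.5])        # toy covariance spectrum
for ratio in (0.5, 1, 8):                      # 0.5x, 1x, 8x coding units
    M = int(ratio * Sigma_x.shape[0])
    print(ratio, robust_coding_error_bound(Sigma_x, M, snr=3.0))
```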
Sparseness localizes the vectors and increases coding efficiency

[Figure: encoding vectors (W) and decoding vectors (A) for robust coding (top) and robust sparse coding (bottom), with normalized histograms of coefficient kurtosis (% frequency vs. kurtosis). Robust coding: average kurtosis 4.9. Robust sparse coding: average kurtosis 7.3. The average error is unchanged: 12.5% vs. 12.5%.]
                       NIPS 2007 Tutorial                                97                                        Michael S. Lewicki ! Carnegie Mellon
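The histograms above summarize the kurtosis of the coding coefficients. A small sketch of that measurement, assuming a learned encoding matrix W and a matrix of whitened image patches are already available (both names are placeholders for illustration):

```python
import numpy as np
from scipy.stats import kurtosis

def coefficient_kurtosis(W, patches):
    """Kurtosis of each unit's coefficients over a set of image patches.

    W       : (M, N) encoding matrix
    patches : (num_patches, N) whitened image patches
    Returns an (M,) array; sparser (more localized) codes give larger values.
    """
    U = patches @ W.T                          # coefficients u = Wx for every patch
    return kurtosis(U, axis=0, fisher=False)   # non-excess kurtosis (Gaussian -> 3)

# np.mean(coefficient_kurtosis(W, patches)) would give the kind of averages
# reported on the slide (~4.9 for robust coding, ~7.3 for robust sparse coding).
```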
Non-zero resource constraints localize weights
[Figure: weights learned at channel SNRs of 20dB, 10dB, 0dB, and –10dB, for the fovea (top) and 40 degrees eccentricity (bottom), with magnitude vs. spatial frequency profiles. The average error is also unchanged.]

NIPS 2007 Tutorial                            98                          Michael S. Lewicki ! Carnegie Mellon
Theory          •  explains data from principles
                •  requires idealization and abstraction

Principle       •  code signals accurately, efficiently, robustly
                •  adapted to natural sensory environment

Idealization    •  cell response is linear, noise is additive
                •  simple, additive resource constraints

Methodology     •  information theory, natural images
                •  constrained optimization

Prediction      •  explains individual receptive fields
                •  explains non-linear responses
                •  explains population organization

NIPS 2007 Tutorial                      99                      Michael S. Lewicki ! Carnegie Mellon
What happens next?




[Figure: schematic receptive field with excitatory center and inhibitory surround (from Hubel, 1995), alongside a learned filter]

NIPS 2007 Tutorial                      100   Michael S. Lewicki ! Carnegie Mellon
What happens next?




NIPS 2007 Tutorial   101   Michael S. Lewicki ! Carnegie Mellon
Joint statistics of filter outputs show magnitude dependence

An image is passed through two linear filters, e.g. a vertical filter (L2) and a diagonal, spatially shifted filter (L1); responses are collected over all image positions and the frequency of occurrence of L2 is histogrammed for different values of L1. The responses are roughly decorrelated (the expected value of L2 is near zero for any value of L1), but they are not statistically independent: the width of the conditional distribution of L2 increases with the magnitude of L1. This dependence motivates an automatic gain control known as “divisive normalization”, which has been used to account for many nonlinear steady-state behaviors of neurons in primary visual cortex.

[Figure: conditional histograms histo{L2 | L1 ≈ 0.1}, histo{L2 | L1 ≈ 0.9}, and the full two-dimensional conditional histogram histo{L2 | L1}, with intensity proportional to the bin counts and each column rescaled to fill the intensity range. From Schwartz and Simoncelli (2001), Nature Neuroscience 4(8), August 2001]


         NIPS 2007 Tutorial                                    102                                  Michael S. Lewicki ! Carnegie Mellon
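A hedged sketch of the conditional-histogram analysis described above: filter a natural image with two linear filters and histogram one response conditioned on the magnitude of the other. The Gabor-like filters, magnitude bins, and normalization are illustrative choices, not those of Schwartz and Simoncelli.

```python
import numpy as np
from scipy.signal import convolve2d

def oriented_filter(theta, size=9, freq=0.25, sigma=2.0):
    """Simple Gabor-like oriented filter (illustrative)."""
    y, x = np.mgrid[-(size // 2):size // 2 + 1, -(size // 2):size // 2 + 1]
    xr = x * np.cos(theta) + y * np.sin(theta)
    g = np.exp(-(x**2 + y**2) / (2 * sigma**2)) * np.cos(2 * np.pi * freq * xr)
    return g - g.mean()

def conditional_histograms(image, n_bins=21):
    """Histograms of L2 conditioned on |L1|: one histogram per magnitude bin of L1."""
    L1 = convolve2d(image, oriented_filter(0.0), mode='valid').ravel()
    L2 = convolve2d(image, oriented_filter(np.pi / 2), mode='valid').ravel()
    L1, L2 = L1 / np.abs(L1).max(), L2 / np.abs(L2).max()   # rescale responses to [-1, 1]
    mag_edges = np.array([0.25, 0.5, 0.75])                 # split |L1| into 4 magnitude bins
    mag_bins = np.digitize(np.abs(L1), mag_edges)           # bin index 0..3
    edges = np.linspace(-1, 1, n_bins + 1)
    return [np.histogram(L2[mag_bins == b], bins=edges, density=True)[0]
            for b in range(4)]

# On natural images, the spread of the conditional histograms grows with |L1|,
# even though L1 and L2 are roughly decorrelated.
```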
Using sensory gain control to reduce redundancy
Schwartz and Simoncelli (2001)

A divisive normalization model: each linear filter response is squared and divided by a weighted sum of the other squared filter responses plus a constant. The weights are adapted (by maximum likelihood on natural images) to minimize the higher-order dependencies between filter outputs.

[Figure: stimulus → linear filters F1, F2 → squaring (·)² → division by the other squared filter responses → normalized responses; conditional histograms before and after normalization]
NIPS 2007 Tutorial                      103                    Michael S. Lewicki ! Carnegie Mellon
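A minimal sketch of the divisive normalization computation described on this slide: each squared filter response is divided by a weighted sum of its neighbors' squared responses plus a constant. The weights here are fixed by hand for illustration; in the paper they are adapted to natural image statistics.

```python
import numpy as np

def divisive_normalization(responses, weights, sigma2=1.0):
    """Normalized response R_i = F_i^2 / (sigma^2 + sum_j w_ij * F_j^2).

    responses : (K,) linear filter outputs F_i at one image location
    weights   : (K, K) normalization pool weights w_ij
    sigma2    : semi-saturation constant
    """
    sq = responses ** 2
    return sq / (sigma2 + weights @ sq)

# Example with two filters and a uniform normalization pool (illustrative values)
F = np.array([3.0, 0.5])
w = np.full((2, 2), 0.5)
print(divisive_normalization(F, w))
```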
Model accounts for several non-linear response properties
Schwartz and Simoncelli (2001)

[Figure: mean firing rate vs. signal contrast for a V1 cell (Cavanaugh et al., 2000) and for the model, at mask contrasts of 0 (no mask), 0.13, and 0.5; panels a and b show different mask/signal configurations. The surrounding text notes analogous fits to auditory masking data, where rate–level curves shift with the level of a masking tone.]

NIPS 2007 Tutorial                                    104                           Michael S. Lewicki ! Carnegie Mellon
What about image structure?

    •    Bottom-up approaches focus on the non-linearity.

    •    Our aim here is to focus on the computational problem:

              How do we learn the intrinsic dimensions of natural image structure?

    •    Idea: characterize how the local image distribution changes.




NIPS 2007 Tutorial                                 105                       Michael S. Lewicki ! Carnegie Mellon
Perceptual organization in natural scenes




                                       image of Kyoto, Japan from E. Doi
NIPS 2007 Tutorial               106                      Michael S. Lewicki ! Carnegie Mellon
Palmer and Rock’s theory of perceptual organization




                                        Palmer and Rock, 1994



NIPS 2007 Tutorial             107                   Michael S. Lewicki ! Carnegie Mellon
Gestalt grouping principles




                                            from Palmer, 1999
NIPS 2007 Tutorial            108   Michael S. Lewicki ! Carnegie Mellon
Malik and Perona’s (1991) model of texture segregation




[Figure: input image; output of dark bar filters; output of light bar filters]
NIPS 2007 Tutorial                     109        Michael S. Lewicki ! Carnegie Mellon
Malik and Perona algorithm can find texture boundaries




[Figure: painting by Gustav Klimt; texture boundaries from the Malik and Perona algorithm]




NIPS 2007 Tutorial                       110                        Michael S. Lewicki ! Carnegie Mellon
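A rough sketch of the texture-segregation pipeline on the last two slides: filter the image with a small bank of even-symmetric (bar-like) filters, half-wave rectify into “light bar” and “dark bar” channels, pool locally, and mark locations where the pooled responses change rapidly. This is a simplified stand-in for Malik and Perona's (1991) full model; the filter shapes, pooling width, and boundary measure are all illustrative assumptions.

```python
import numpy as np
from scipy.ndimage import gaussian_filter, sobel
from scipy.signal import convolve2d

def bar_filter(theta, size=11, sigma=2.0):
    """Even-symmetric (second-derivative-of-Gaussian-like) bar filter at orientation theta."""
    y, x = np.mgrid[-(size // 2):size // 2 + 1, -(size // 2):size // 2 + 1]
    xr = x * np.cos(theta) + y * np.sin(theta)
    g = (xr**2 / sigma**2 - 1.0) * np.exp(-(x**2 + y**2) / (2 * sigma**2))
    return g - g.mean()

def texture_boundary_map(image, n_orient=4):
    """Pool rectified 'light bar' / 'dark bar' responses, then take spatial gradients."""
    boundary = np.zeros_like(image, dtype=float)
    for k in range(n_orient):
        resp = convolve2d(image, bar_filter(np.pi * k / n_orient), mode='same')
        for channel in (np.maximum(resp, 0), np.maximum(-resp, 0)):    # light / dark bars
            pooled = gaussian_filter(channel, sigma=4.0)               # local texture energy
            boundary += np.hypot(sobel(pooled, axis=0), sobel(pooled, axis=1))
    return boundary
```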
  • 8. Out of the retina • > 23 distinct neural pathways, no simple function division • Suprachiasmatic nucleus: circadian rhythm • Accessory optic system: stabilize retinal image • Superior colliculus: integrate visual and auditory information with head movements, direct eyes • Pretectum: plays adjusting pupil size, track large moving objects • Pregeniculate: cells responsive to ambient light • Lateral geniculate (LGN): main “relay” to visual cortex; contains 6 distinct layers, each with 2 sublayers. Organization very complex NIPS 2007 Tutorial 8 Michael S. Lewicki ! Carnegie Mellon
  • 9. V1 simple cell integration of LGN cells [diagram: LGN cells converging onto a V1 simple cell] Hubel and Wiesel, 1963 NIPS 2007 Tutorial 9 Michael S. Lewicki ! Carnegie Mellon
  • 10. Hyper-column model Hubel and Wiesel, 1978 NIPS 2007 Tutorial 10 Michael S. Lewicki ! Carnegie Mellon
  • 11. Anatomical circuitry in V1 (from Callaway, 1998) Figure 1: Contributions of individual neurons to local excitatory connections between cortical layers. From left to right, typical spiny neurons in layers 4C, 2–4B, 5, and 6 are shown. Dendritic arbors are illustrated with thick lines and axonal arbors with finer lines. Each cell projects axons specifically to only a subset of the layers. This simplified diagram focuses on connections between layers 2–4B, 4C, 5, and 6 and does not incorporate details of circuitry specific for subdivisions of these layers. A model of the interactions between these neurons is shown in Figure 2. NIPS 2007 Tutorial 11 Michael S. Lewicki ! Carnegie Mellon
  • 12. Where is this headed? Ramon y Cajal NIPS 2007 Tutorial 12 Michael S. Lewicki ! Carnegie Mellon
  • 13. A wing would be a most mystifying structure if one did not know that birds flew. Horace Barlow, 1961 An algorithm is likely to be understood more readily by understanding the nature of the problem being solved than by examining the mechanism in which it is embodied. David Marr, 1982
  • 14. A theoretical approach • Look at the system from a functional perspective: What problems does it need to solve? • Abstract from the details: Make predictions from theoretical principles. • Models are bottom-up; theories are top-down. What are the relevant computational principles? NIPS 2007 Tutorial 14 Michael S. Lewicki ! Carnegie Mellon
  • 15. Representing structure in natural images image from Field (1994) NIPS 2007 Tutorial 15 Michael S. Lewicki ! Carnegie Mellon
  • 16. Representing structure in natural images image from Field (1994) NIPS 2007 Tutorial 16 Michael S. Lewicki ! Carnegie Mellon
  • 17. Representing structure in natural images image from Field (1994) NIPS 2007 Tutorial 17 Michael S. Lewicki ! Carnegie Mellon
  • 18. Representing structure in natural images Need to describe all image structure in the scene. What representation is best? image from Field (1994) NIPS 2007 Tutorial 18 Michael S. Lewicki ! Carnegie Mellon
  • 19. V1 simple cells [figure: simple-cell receptive fields] Hubel and Wiesel, 1959; DeAngelis, et al, 1995 NIPS 2007 Tutorial 19 Michael S. Lewicki ! Carnegie Mellon
  • 20. Oriented Gabor models of individual simple cells figure from Daugman, 1990; data from Jones and Palmer, 1987 NIPS 2007 Tutorial 20 Michael S. Lewicki ! Carnegie Mellon
  • 21. 2D Gabor wavelet captures spatial structure • 2D Gabor functions • Wavelet basis generated by dilations, translations, and rotations of a single basis function • Can also control phase and aspect ratio • (drifting) Gabor functions are what the eye “sees best” NIPS 2007 Tutorial 21 Michael S. Lewicki ! Carnegie Mellon
  • 22. Recoding with Gabor functions Pixel entropy = 7.57 bits Recoding with 2D Gabor functions Coefficient entropy = 2.55 bits NIPS 2007 Tutorial 22 Michael S. Lewicki ! Carnegie Mellon
  • 23. How is the V1 population organized? NIPS 2007 Tutorial 23 Michael S. Lewicki ! Carnegie Mellon
  • 24. A general approach to coding: redundancy reduction [plot: correlation of adjacent pixel intensities, both axes 0–1] Redundancy reduction is equivalent to efficient coding. NIPS 2007 Tutorial 24 Michael S. Lewicki ! Carnegie Mellon
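The correlation the slide plots can be computed directly. Below is a minimal numpy sketch (not from the tutorial) that measures the correlation between horizontally adjacent pixel intensities; the smoothed random field stands in for a real grayscale natural image, which typically gives correlations close to 1.

```python
import numpy as np

def adjacent_pixel_correlation(img):
    """Correlation between each pixel and its right-hand neighbor."""
    left = img[:, :-1].ravel().astype(float)
    right = img[:, 1:].ravel().astype(float)
    return np.corrcoef(left, right)[0, 1]

# Stand-in for a natural image: a smoothed random field with spatial correlations.
rng = np.random.default_rng(0)
img = rng.standard_normal((128, 128))
img = np.cumsum(np.cumsum(img, axis=0), axis=1)
print(round(adjacent_pixel_correlation(img), 3))
```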
  • 25. Describing signals with a simple statistical model. Principle: Good codes capture the statistical distribution of sensory patterns. How do we describe the distribution? • Goal is to encode the data to desired precision: x = a1 s1 + a2 s2 + · · · + aL sL + ε = As + ε • Can solve for the coefficients in the no-noise case: ŝ = A⁻¹ x NIPS 2007 Tutorial 25 Michael S. Lewicki ! Carnegie Mellon
  • 26. An algorithm for deriving efficient linear codes: ICA. Learning objective: maximize coding efficiency, i.e. maximize P(x|A) over A, using independent component analysis (ICA) to optimize A: ΔA ∝ A Aᵀ (∂/∂A) log P(x|A) = −A(z sᵀ − I), where z = (log P(s))′. The probability of the pattern ensemble is P(x1, x2, ..., xN | A) = ∏_k P(x_k | A). To obtain P(x|A), marginalize over s: P(x|A) = ∫ ds P(x|A, s) P(s) = P(ŝ) / |det A| in the no-noise case. This learning rule: • learns the features that capture the most structure • optimizes the efficiency of the code. What should we use for P(s)? NIPS 2007 Tutorial 26 Michael S. Lewicki ! Carnegie Mellon
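As a concrete illustration of the learning rule above, here is a small numpy sketch (not the tutorial's code) of natural-gradient maximum-likelihood ICA for the no-noise model x = As. It assumes a Laplacian prior and writes the update as ΔA ∝ A(φ(s)sᵀ − I) with φ(s) = −(log P(s))′ = sign(s), the same rule up to the sign convention used for z; the toy mixing matrix, learning rate, and iteration count are illustrative.

```python
import numpy as np

def ica_natural_gradient(X, n_iter=500, lr=0.05, seed=0):
    """Maximum-likelihood ICA for x = A s with a Laplacian prior on s.

    Natural-gradient ascent: dA ~ A (phi(s) s^T - I), phi(s) = sign(s);
    at the fixed point E[phi(s) s^T] = I.
    """
    rng = np.random.default_rng(seed)
    n_dim, n_samples = X.shape
    A = np.eye(n_dim) + 0.1 * rng.standard_normal((n_dim, n_dim))
    for _ in range(n_iter):
        s = np.linalg.solve(A, X)           # s = A^-1 x for every column of X
        phi = np.sign(s)                    # negative score of the Laplacian prior
        A += lr * A @ (phi @ s.T / n_samples - np.eye(n_dim))
    return A

# Toy demo: two independent Laplacian sources, linearly mixed.
rng = np.random.default_rng(1)
S = rng.laplace(size=(2, 5000))
A_true = np.array([[1.0, 0.6], [0.3, 1.0]])
X = A_true @ S
A_hat = ica_natural_gradient(X)
print(np.round(A_hat, 2))   # columns match A_true up to permutation, sign, and scale
```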
  • 27. Modeling non-Gaussian distributions • Typical coefficient distributions of natural signals are non-Gaussian. [plots: marginal histograms of coefficients s1 and s2] NIPS 2007 Tutorial 27 Michael S. Lewicki ! Carnegie Mellon
  • 28. The generalized Gaussian distribution: P(x|q) ∝ exp(−½ |x|^q) • Or equivalently, an exponential power distribution (Box and Tiao, 1973): P(x|µ, σ, β) = (ω(β)/σ) exp[ −c(β) |(x − µ)/σ|^(2/(1+β)) ] • β varies monotonically with the kurtosis, κ₂: β = −0.75 → κ₂ = −1.08; β = −0.25 → κ₂ = −0.45; β = +0.00 (Normal) → κ₂ = +0.00; β = +0.50 (ICA tanh) → κ₂ = +1.21; β = +1.00 (Laplacian) → κ₂ = +3.00; β = +2.00 → κ₂ = +9.26. NIPS 2007 Tutorial 28 Michael S. Lewicki ! Carnegie Mellon
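A small sketch of the first parameterization above, P(x|q) ∝ exp(−½|x|^q), with the normalizing constant written out via the gamma function (q = 2 recovers the standard normal, q = 1 a Laplacian with scale 2); the values of q below are only illustrative.

```python
import numpy as np
from math import gamma

def gen_gaussian_pdf(x, q):
    """Generalized Gaussian density p(x|q) = c(q) * exp(-|x|**q / 2).

    q = 2 is the standard normal; q < 2 gives heavier-than-Gaussian tails.
    """
    c = q / (2.0 ** (1.0 / q + 1.0) * gamma(1.0 / q))
    return c * np.exp(-np.abs(x) ** q / 2.0)

x = np.linspace(-6, 6, 7)
for q in (0.7, 1.0, 2.0):
    print(q, np.round(gen_gaussian_pdf(x, q), 4))
```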
  • 29. Modeling Gaussian distributions with PCA • Principal component analysis (PCA) describes the principal axes of variation in the data distribution. • This is equivalent to fitting the data with a multivariate Gaussian. [plots: Gaussian marginal histograms of s1 and s2] NIPS 2007 Tutorial 29 Michael S. Lewicki ! Carnegie Mellon
  • 30. Modeling non-Gaussian distributions • What about non-Gaussian marginals? [plots: non-Gaussian marginal histograms of s1 and s2] • How would this distribution be modeled by PCA? NIPS 2007 Tutorial 30 Michael S. Lewicki ! Carnegie Mellon
  • 31. Modeling non-Gaussian distributions • What about non-Gaussian marginals? • How would this distribution be modeled by PCA? • How should the distribution be described? The non-orthogonal ICA solution captures the non-Gaussian structure. NIPS 2007 Tutorial 31 Michael S. Lewicki ! Carnegie Mellon
  • 32. Efficient coding of natural images: Olshausen and Field, 1996. [diagram: natural scene → visual input → units; image basis functions / receptive fields before and after learning] Network weights are adapted to maximize coding efficiency: this minimizes redundancy and maximizes the independence of the outputs. NIPS 2007 Tutorial 32 Michael S. Lewicki ! Carnegie Mellon
  • 33. Model predicts local and global receptive field properties Learned basis for natural images Overlaid basis function properties from Lewicki and Olshausen, 1999 NIPS 2007 Tutorial 33 Michael S. Lewicki ! Carnegie Mellon
  • 34. Algorithm selects best of many possible sensory codes Haar Learned Wavelet Gabor Fourier PCA Theoretical perspective: Not edge “detectors.” from Lewicki and Olshausen, 1999 An efficient code for natural images. NIPS 2007 Tutorial 34 Michael S. Lewicki ! Carnegie Mellon
  • 35. Comparing coding efficiency on natural images [bar chart: estimated bits per pixel (0–8) for Gabor, wavelet, Haar, Fourier, PCA, and learned codes] NIPS 2007 Tutorial 35 Michael S. Lewicki ! Carnegie Mellon
  • 36. Responses in primary visual cortex to visual motion from Wandell, 1995 NIPS 2007 Tutorial 36 Michael S. Lewicki ! Carnegie Mellon
  • 37. Sparse coding of time-varying images (Olshausen, 2002): I(x, y, t) = Σ_i Σ_t′ a_i(t′) φ_i(x, y, t − t′) + ε(x, y, t) = Σ_i a_i(t) ∗ φ_i(x, y, t) + ε(x, y, t) NIPS 2007 Tutorial 37 Michael S. Lewicki ! Carnegie Mellon
  • 38. Sparse decomposition of image sequences [panels: convolution coefficients, posterior-maximum (sparsified) coefficients, input sequence, and reconstruction] from Olshausen, 2002 NIPS 2007 Tutorial 38 Michael S. Lewicki ! Carnegie Mellon
  • 39. Learned spatio-temporal basis functions time basis function from Olshausen, 2002 NIPS 2007 Tutorial 39 Michael S. Lewicki ! Carnegie Mellon
  • 40. Animated spatial-temporal basis functions from Olshausen, 2002 NIPS 2007 Tutorial 40 Michael S. Lewicki ! Carnegie Mellon
  • 41. Theory: explains data from principles; requires idealization and abstraction. Principle: code signals accurately and efficiently; adapted to natural sensory environment. Idealization: cell response is linear. Methodology: information theory, natural images. Prediction: explains individual receptive fields; explains population organization. NIPS 2007 Tutorial 41 Michael S. Lewicki ! Carnegie Mellon
  • 42. How general is the efficient coding principle? Can it explain auditory coding? NIPS 2007 Tutorial 42 Michael S. Lewicki ! Carnegie Mellon
  • 43. Limitations of the linear model • linear • only optimal for block, not whole signal • no phase-locking • representation depends on block alignment NIPS 2007 Tutorial Michael S. Lewicki ! Carnegie Mellon
  • 44. A continuous filterbank does not form an efficient code. Goal: find a representation that is both time-relative and efficient. NIPS 2007 Tutorial 44 Michael S. Lewicki ! Carnegie Mellon
  • 45. Efficient signal representation using time-shiftable kernels (spikes). Smith and Lewicki (2005) Neural Comp. 17:19-45. x(t) = Σ_{m=1..M} Σ_{i=1..n_m} s_{m,i} φ_m(t − τ_{m,i}) + ε(t) • Each spike encodes the precise time and magnitude of an acoustic feature • Two important theoretical abstractions for “spikes”: not binary, not probabilistic • Can convert to a population of stochastic, binary spikes NIPS 2007 Tutorial 45 Michael S. Lewicki ! Carnegie Mellon
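To make the generative model concrete, here is a minimal numpy sketch that synthesizes a signal from a list of (kernel, time, amplitude) spikes according to x(t) = Σ s_{m,i} φ_m(t − τ_{m,i}); the gammatone-shaped kernels and the parameter values are illustrative stand-ins, not the learned kernels.

```python
import numpy as np

def gammatone(fc, sr=16000, dur=0.01, order=4, b=300.0):
    """Illustrative gammatone-shaped kernel, unit norm."""
    t = np.arange(int(dur * sr)) / sr
    g = t ** (order - 1) * np.exp(-2 * np.pi * b * t) * np.cos(2 * np.pi * fc * t)
    return g / np.linalg.norm(g)

def synthesize(spikes, kernels, n_samples):
    """spikes: list of (kernel_index, sample_offset, amplitude) triples."""
    x = np.zeros(n_samples)
    for m, tau, amp in spikes:
        k = kernels[m]
        x[tau:tau + len(k)] += amp * k[: n_samples - tau]
    return x

kernels = [gammatone(fc) for fc in (200, 800, 3200)]
spikes = [(0, 100, 1.0), (2, 400, 0.5), (1, 900, -0.8)]
x = synthesize(spikes, kernels, n_samples=2000)
print(x.shape, round(float(np.max(np.abs(x))), 3))
```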
  • 46. Coding audio signals with spikes [spikegram: kernel center frequency (100–5000 Hz) vs. time (0–25 ms); panels: Input, Reconstruction, Residual] NIPS 2007 Tutorial 46 Michael S. Lewicki ! Carnegie Mellon
  • 47. Comparing a spike code to a spectrogram [panels: spikegram (kernel CF vs. time) and spectrogram (frequency vs. time) of the same utterance] How do we compute the spikes? NIPS 2007 Tutorial 47 Michael S. Lewicki ! Carnegie Mellon
  • 48. “can” with filter-threshold [spikegram: kernel CF (100–5000 Hz) vs. time (0–20 ms); panels: Input, Reconstruction, Residual] NIPS 2007 Tutorial 48 Michael S. Lewicki ! Carnegie Mellon
  • 49. Spike Coding with Matching Pursuit 1. convolve signal with kernels NIPS 2007 Tutorial 49 Michael S. Lewicki ! Carnegie Mellon
  • 50. Spike Coding with Matching Pursuit 1. convolve signal with kernels 2. find largest peak over convolution set NIPS 2007 Tutorial 50 Michael S. Lewicki ! Carnegie Mellon
  • 51. Spike Coding with Matching Pursuit 1. convolve signal with kernels 2. find largest peak over convolution set 3. fit signal with kernel NIPS 2007 Tutorial 51 Michael S. Lewicki ! Carnegie Mellon
  • 52. Spike Coding with Matching Pursuit 1. convolve signal with kernels 2. find largest peak over convolution set 3. fit signal with kernel 4. subtract kernel from signal, record spike, and adjust convolutions NIPS 2007 Tutorial 52 Michael S. Lewicki ! Carnegie Mellon
  • 53. Spike Coding with Matching Pursuit 1. convolve signal with kernels 2. find largest peak over convolution set 3. fit signal with kernel 4. subtract kernel from signal, record spike, and adjust convolutions 5. repeat NIPS 2007 Tutorial 53 Michael S. Lewicki ! Carnegie Mellon
  • 54. Spike Coding with Matching Pursuit 1. convolve signal with kernels 2. find largest peak over convolution set 3. fit signal with kernel 4. subtract kernel from signal, record spike, and adjust convolutions 5. repeat NIPS 2007 Tutorial 54 Michael S. Lewicki ! Carnegie Mellon
  • 55. Spike Coding with Matching Pursuit 1. convolve signal with kernels 2. find largest peak over convolution set 3. fit signal with kernel 4. subtract kernel from signal, record spike, and adjust convolutions 5. repeat NIPS 2007 Tutorial 55 Michael S. Lewicki ! Carnegie Mellon
  • 56. Spike Coding with Matching Pursuit 1. convolve signal with kernels 2. find largest peak over convolution set 3. fit signal with kernel 4. subtract kernel from signal, record spike, and adjust convolutions 5. repeat NIPS 2007 Tutorial 56 Michael S. Lewicki ! Carnegie Mellon
  • 57. Spike Coding with Matching Pursuit 1. convolve signal with kernels 2. find largest peak over convolution set 3. fit signal with kernel 4. subtract kernel from signal, record spike, and adjust convolutions 5. repeat . . . NIPS 2007 Tutorial 57 Michael S. Lewicki ! Carnegie Mellon
  • 58. Spike Coding with Matching Pursuit 1. convolve signal with kernels 2. find largest peak over convolution set 3. fit signal with kernel 4. subtract kernel from signal, record spike, and adjust convolutions 5. repeat . . . 6. halt when desired fidelity is reached NIPS 2007 Tutorial 58 Michael S. Lewicki ! Carnegie Mellon
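A compact sketch of the matching-pursuit loop laid out in the slides above: correlate the residual with every kernel, take the largest peak as the next spike, subtract the fitted kernel, and halt at a target SNR. For brevity it recomputes the full correlations on each pass rather than updating them locally around the subtracted kernel; the helper names and the toy demo are illustrative, not the tutorial's code.

```python
import numpy as np

def matching_pursuit(x, kernels, target_snr_db=20.0, max_spikes=500):
    """Greedy spike decomposition of x over unit-norm kernels (matching pursuit)."""
    residual = x.astype(float).copy()
    x_power = np.sum(x ** 2)
    spikes = []
    for _ in range(max_spikes):
        # 1-2. correlate the residual with every kernel, find the single largest peak
        best = None
        for m, k in enumerate(kernels):
            corr = np.correlate(residual, k, mode="valid")
            i = int(np.argmax(np.abs(corr)))
            if best is None or abs(corr[i]) > abs(best[2]):
                best = (m, i, corr[i])
        m, tau, amp = best
        # 3-4. fit, record the spike, and subtract the kernel from the residual
        spikes.append((m, tau, amp))
        residual[tau:tau + len(kernels[m])] -= amp * kernels[m]
        # 6. halt when the desired fidelity is reached
        snr = 10 * np.log10(x_power / np.sum(residual ** 2))
        if snr >= target_snr_db:
            break
    return spikes, residual

# Toy demo: one unit-norm kernel placed twice in an otherwise silent signal.
k = np.hanning(64) * np.sin(np.linspace(0, 8 * np.pi, 64)); k /= np.linalg.norm(k)
x = np.zeros(1000); x[100:164] += 0.9 * k; x[500:564] -= 0.4 * k
spikes, res = matching_pursuit(x, [k], target_snr_db=30)
print(spikes)
```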
  • 59. “can” 5 dB SNR, 36 spikes, 145 sp/sec [spikegram: kernel CF vs. time (0–200 ms); panels: Input, Reconstruction, Residual] NIPS 2007 Tutorial 59 Michael S. Lewicki ! Carnegie Mellon
  • 60. “can” 10 dB SNR, 93 spikes, 379 sp/sec [spikegram as above] NIPS 2007 Tutorial 60 Michael S. Lewicki ! Carnegie Mellon
  • 61. “can” 20 dB SNR, 391 spikes, 1700 sp/sec [spikegram as above] NIPS 2007 Tutorial 61 Michael S. Lewicki ! Carnegie Mellon
  • 62. “can” 40 dB SNR, 1285 spikes, 5238 sp/sec [spikegram as above] NIPS 2007 Tutorial 62 Michael S. Lewicki ! Carnegie Mellon
  • 63. Efficient auditory coding with optimized kernel shapes. Smith and Lewicki (2006) Nature 439:978-982. What are the optimal kernel shapes? x(t) = Σ_{m=1..M} Σ_{i=1..n_m} s_{m,i} φ_m(t − τ_{m,i}) + ε(t) Adapt the algorithm of Olshausen (2002). NIPS 2007 Tutorial 63 Michael S. Lewicki ! Carnegie Mellon
  • 64. Optimizing the probabilistic model: x(t) = Σ_{m=1..M} Σ_{i=1..n_m} s_m^i φ_m(t − τ_m^i) + ε(t), with ε(t) ∼ N(0, σ_ε). p(x|Φ) = ∫∫ p(x|Φ, s, τ) p(s) p(τ) ds dτ ≈ p(x|Φ, ŝ, τ̂) p(ŝ) p(τ̂). Learning (Olshausen, 2002): ∂/∂φ_m log p(x|Φ) ≈ ∂/∂φ_m [ log p(x|Φ, ŝ, τ̂) + log p(ŝ) p(τ̂) ] = ∂/∂φ_m [ −1/(2σ_ε) Σ_t ( x − Σ_m Σ_i ŝ_m^i φ_m(t − τ_m^i) )² ] = (1/σ_ε) Σ_i [x − x̂] ŝ_m^i. Also adapt kernel lengths. NIPS 2007 Tutorial 64 Michael S. Lewicki ! Carnegie Mellon
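The kernel-shape gradient above reduces to a simple operation once a spike decomposition is available: each kernel is nudged by the residual at the offsets where it was used, weighted by the spike amplitudes, then renormalized. The sketch below assumes spike triples (kernel index, offset, amplitude) such as those produced by the matching-pursuit sketch earlier; the toy demo and the learning rate are illustrative.

```python
import numpy as np

def update_kernels(kernels, x, spikes, residual, lr=0.5):
    """One gradient step on kernel shapes given a spike decomposition of x.

    The gradient of the data log-likelihood with respect to kernel phi_m is
    proportional to sum_i amp_i * residual[tau_i : tau_i + len(phi_m)].
    """
    grads = [np.zeros_like(k) for k in kernels]
    for m, tau, amp in spikes:
        seg = residual[tau:tau + len(kernels[m])]
        grads[m][:len(seg)] += amp * seg
    new_kernels = []
    for k, g in zip(kernels, grads):
        k = k + lr * g
        new_kernels.append(k / np.linalg.norm(k))   # keep kernels unit norm
    return new_kernels

# Toy demo: one noisy kernel adapts toward the waveform actually present in the data.
rng = np.random.default_rng(0)
true_k = np.hanning(32); true_k /= np.linalg.norm(true_k)
k0 = true_k + 0.3 * rng.standard_normal(32); k0 /= np.linalg.norm(k0)
x = np.zeros(200); x[50:82] += true_k
spikes = [(0, 50, float(x[50:82] @ k0))]            # one spike fitted with the current kernel
residual = x.copy(); residual[50:82] -= spikes[0][2] * k0
new = update_kernels([k0], x, spikes, residual)
print(round(float(k0 @ true_k), 3), round(float(new[0] @ true_k), 3))  # correlation improves
```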
  • 65. Adapting the optimal kernel shapes NIPS 2007 Tutorial 65 Michael S. Lewicki ! Carnegie Mellon
  • 66. Kernel functions optimized for coding speech NIPS 2007 Tutorial 66 Michael S. Lewicki ! Carnegie Mellon
  • 67. Quantifying coding efficiency: 1. fit the signal with the spike code 2. quantize time and amplitude values 3. prune zero values 4. measure coding efficiency using the entropy of the quantized values 5. reconstruct the signal using the quantized values 6. measure fidelity using the signal-to-noise ratio (SNR) of the residual error • identical procedure for other codes (e.g. Fourier and wavelet). x(t) = Σ_{m=1..M} Σ_{i=1..n_m} s_{m,i} φ_m(t − τ_{m,i}) + ε(t) [panels: original, reconstruction, residual error] NIPS 2007 Tutorial 67 Michael S. Lewicki ! Carnegie Mellon
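Steps 2–6 above amount to an entropy and an SNR computation. The sketch below (illustrative, not the tutorial's code) quantizes a set of amplitudes, estimates the bit cost per spike from the empirical entropy of the quantized values, and measures fidelity as the SNR of a reconstruction residual; the quantizer step size and stand-in data are assumptions.

```python
import numpy as np

def empirical_entropy_bits(symbols):
    """Entropy (bits/symbol) of the empirical distribution of a symbol sequence."""
    _, counts = np.unique(symbols, return_counts=True)
    p = counts / counts.sum()
    return float(-(p * np.log2(p)).sum())

def quantize(values, step):
    return np.round(np.asarray(values) / step).astype(int)

def snr_db(x, x_hat):
    err = x - x_hat
    return 10 * np.log10(np.sum(x ** 2) / np.sum(err ** 2))

rng = np.random.default_rng(0)
amplitudes = rng.laplace(scale=1.0, size=1000)       # stand-in spike amplitudes
q = quantize(amplitudes, step=0.25)
q = q[q != 0]                                        # prune zero values
print("bits per spike (amplitude only):", round(empirical_entropy_bits(q), 2))

x = rng.standard_normal(1000)
x_hat = x + 0.05 * rng.standard_normal(1000)         # stand-in reconstruction
print("fidelity:", round(snr_db(x, x_hat), 1), "dB SNR")
```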
  • 68. Coding efficiency curves [plot: SNR (dB) vs. rate (Kbps) for the spike code with adapted kernels, the spike code with gammatone kernels, a wavelet block code, and a Fourier block code] The adapted spike code gains about +14 dB at a given rate, roughly 4x more efficient. NIPS 2007 Tutorial 68 Michael S. Lewicki ! Carnegie Mellon
  • 69. Using efficient coding theory to make theoretical predictions. How do we develop a theory? Natural sound environment → (evolution / theory) → a simple model of auditory coding: sound waveform → filterbank → stochastic spiking population → spike code; optimal kernels: filter shapes, population properties, population trends, coding efficiency. Physiological data: auditory nerve nonlinearities; auditory revcor filters are gammatones (deBoer and deJongh, 1978; Carney and Yin, 1988). Prediction: speech-optimized kernels vs. auditory nerve filters — we only compare to the data after optimizing; we do not fit the data. The prediction depends on the sound ensemble. More theoretical questions: • Why gammatones? • Why spikes? • How is sound coded by the spike population? NIPS 2007 Tutorial 69 Michael S. Lewicki ! Carnegie Mellon
  • 70. Learned kernels share features of auditory nerve filters Auditory nerve filters Optimized kernels from Carney, McDuffy, and Shekhter, 1999 scale bar = 1 msec NIPS 2007 Tutorial 70 Michael S. Lewicki ! Carnegie Mellon
  • 71. Learned kernels closely match individual auditory nerve filters: for each kernel, find the closest matching auditory nerve filter in Laurel Carney’s database of ~100 filters. NIPS 2007 Tutorial 71 Michael S. Lewicki ! Carnegie Mellon
  • 72. Learned kernels overlaid on selected auditory nerve filters For almost all learned kernels there is a closely matching auditory nerve filter. NIPS 2007 Tutorial 72 Michael S. Lewicki ! Carnegie Mellon
  • 73. Coding of a speech consonant [panels: a, input / reconstruction / residual waveforms; b, spikegram (kernel CF vs. time); c, spectrogram (frequency vs. time)] NIPS 2007 Tutorial 73 Michael S. Lewicki ! Carnegie Mellon
  • 74. How is this achieving an efficient, time-relative code? Time-relative coding of glottal pulses [panels: waveform and spikegram, 0–100 ms] NIPS 2007 Tutorial 74 Michael S. Lewicki ! Carnegie Mellon
  • 75. Theory: explains data from principles; requires idealization and abstraction. Principle: code signals accurately and efficiently; adapted to natural sensory environment. Idealization: analog spikes. Methodology: information theory, natural sounds; optimization. Prediction: explains individual receptive fields; explains population organization. NIPS 2007 Tutorial 75 Michael S. Lewicki ! Carnegie Mellon
  • 76. A gap in the theory? - - + - - Learned from Hubel, 1995 NIPS 2007 Tutorial 76 Michael S. Lewicki ! Carnegie Mellon
  • 77. Redundancy reduction for noisy channels (Atick, 1992). [diagram: input s, recoding x → Ax, output y, with channel noise: y = x + ν and y = Ax + δ] Mutual information: I(x, s) = Σ_{s,x} P(x, s) log₂ [ P(x, s) / (P(s) P(x)) ]. I(x, s) = 0 iff P(x, s) = P(x) P(s), i.e. x and s are independent. NIPS 2007 Tutorial 77 Michael S. Lewicki ! Carnegie Mellon
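For discrete variables the mutual information above can be estimated directly from a joint histogram. Below is a small numpy sketch with a toy symmetric channel standing in for the noisy channel; it is only meant to make the definition concrete.

```python
import numpy as np

def mutual_information_bits(joint):
    """I(X;Y) in bits from a joint count or probability table P(x, y)."""
    joint = joint / joint.sum()
    px = joint.sum(axis=1, keepdims=True)
    py = joint.sum(axis=0, keepdims=True)
    nz = joint > 0
    return float((joint[nz] * np.log2(joint[nz] / (px @ py)[nz])).sum())

# Toy channel: y = x with probability 0.9, otherwise a random symbol.
rng = np.random.default_rng(0)
x = rng.integers(0, 4, size=100000)
flip = rng.random(x.size) < 0.1
y = np.where(flip, rng.integers(0, 4, size=x.size), x)
joint, _, _ = np.histogram2d(x, y, bins=[np.arange(5) - 0.5] * 2)
print(round(mutual_information_bits(joint), 3), "bits")
```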
  • 78. Profiles of optimal filters • high SNR: reduce redundancy → center-surround profile • low SNR: average → low-pass filter • matches behavior of retinal ganglion cells NIPS 2007 Tutorial 78 Michael S. Lewicki ! Carnegie Mellon
  • 79. An observation: Contrast sensitivity of ganglion cells Luminance level decreases one log unit each time we go to lower curve. NIPS 2007 Tutorial 79 Michael S. Lewicki ! Carnegie Mellon
  • 80. Natural images have a 1/f amplitude spectrum Field, 1987 NIPS 2007 Tutorial 80 Michael S. Lewicki ! Carnegie Mellon
  • 81. Components of predicted filters whitening filter low-pass filter optimal filter from Atick, 1992 NIPS 2007 Tutorial 81 Michael S. Lewicki ! Carnegie Mellon
  • 82. Predicted contrast sensitivity functions match neural data from Atick, 1992 NIPS 2007 Tutorial 82 Michael S. Lewicki ! Carnegie Mellon
  • 83. Robust coding of natural images • Theory refined: the image is noisy and blurred; neural population size changes; neurons are noisy. [panels: (a) undistorted image, (b) foveal retinal image, (c) 40 degrees eccentricity; intensity vs. visual angle (arc min)] from Hubel, 1995; from Doi and Lewicki, 2006 NIPS 2007 Tutorial 83 Michael S. Lewicki ! Carnegie Mellon
  • 84. Problem 1: Real neurons are “noisy”. Estimates of neural information capacity (after Borst and Theunissen, 1999):
    system (area)                        stimulus           bits/sec   bits/spike
    fly visual (H1)                      motion             64         ~1
    monkey visual (MT)                   motion             5.5 - 12   0.6 - 1.5
    frog auditory (auditory nerve)       noise & call       46 & 133   1.4 & 7.8
    salamander visual (ganglion cells)   rand. spots        3.2        1.6
    cricket cercal (sensory afferent)    mech. motion       294        3.2
    cricket cercal (sensory afferent)    wind noise         75 - 220   0.6 - 3.1
    cricket cercal (10-2 and 10-3)       wind noise         8 - 80     avg. = 1
    electric fish (P-afferent)           amp. modulation    0 - 200    0 - 1.2
    Limited-capacity neural codes need to be robust. NIPS 2007 Tutorial 84 Michael S. Lewicki ! Carnegie Mellon
  • 85. Traditional codes are not robust encoding neurons sensory input Original NIPS 2007 Tutorial 85 Michael S. Lewicki ! Carnegie Mellon
  • 86. Traditional codes are not robust encoding neurons Add noise equivalent to 1 bit precision sensory input Original 1x efficient coding reconstruction (34% error) 1 bit precision NIPS 2007 Tutorial 86 Michael S. Lewicki ! Carnegie Mellon
  • 87. Redundancy plays an important role in neural coding. Responses of salamander retinal ganglion cells to natural movies: the fractional redundancy between two cells — the information the pair conveys about the stimulus relative to min{I(Ra;S), I(Rb;S)} — is plotted as a function of cell-pair distance (µm); it is 0 when the two cells convey independent information about the stimulus and larger when one cell’s information is a subset of the other’s. From Puchalla et al, 2005. NIPS 2007 Tutorial 87 Michael S. Lewicki ! Carnegie Mellon
  • 88. Robust coding (Doi and Lewicki, 2005, 2006). Robust coding introduces redundancy into the code: the representational dimensionality can be increased (instead of reduced) while the coding precision of each unit is limited, and the code is explicitly separated from the assumed noise in the representation, unlike PCA or ICA. Model: encoder W ∈ R^{M×N}, decoder A ∈ R^{N×M}; for each N-dimensional, zero-mean data point x, the noiseless representation is u = Wx, channel noise n ∼ N(0, σ_n² I) gives the noisy representation r = Wx + n = u + n, and the reconstruction is x̂ = Ar = AWx + An. Caveat: the model is linear, but it can be evaluated for different noise levels. NIPS 2007 Tutorial 88 Michael S. Lewicki ! Carnegie Mellon
  • 89. Generalizing the model: sensory noise and optical blur. [diagram: image s → optical blur H → observation x (+ sensory noise ν) → encoder W → representation r (+ channel noise δ) → decoder A → reconstruction ŝ] Can also add sparseness and resource constraints. NIPS 2007 Tutorial 89 Michael S. Lewicki ! Carnegie Mellon
  • 90. Robust coding is distinct from “reading” noisy populations. [figure: the standard population coding model — bell-shaped tuning curves, a noisy population activity pattern, and population-vector and maximum-likelihood decoding; from Pouget et al, 2001] Here, we want to learn an optimal image code using noisy neurons. NIPS 2007 Tutorial 90 Michael S. Lewicki ! Carnegie Mellon
  • 91. How do we learn robust codes? Objective: find W and A that minimize the reconstruction error under a capacity constraint: Cost = Error + λ (variance constraint), with ΔW ∝ −∂(Cost)/∂W. For a given encoder, the optimal decoder is A = Σ_x Wᵀ (σ_n² I_M + W Σ_x Wᵀ)⁻¹. Channel capacity of the i-th neuron: C_i = ½ ln(SNR_i + 1); to limit capacity, fix the coefficient signal-to-noise ratio SNR_i = ⟨u_i²⟩ / σ_n². Robust coding is thus formulated as a constrained optimization problem. NIPS 2007 Tutorial 91 Michael S. Lewicki ! Carnegie Mellon
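A minimal sketch of the setup in the last two slides: the rows of an encoder W are rescaled so every unit meets the fixed SNR constraint, the decoder is the optimal linear estimator A = Σx Wᵀ(σn²I + WΣxWᵀ)⁻¹ quoted above, and reconstruction error is measured under simulated channel noise. The random encoder and toy data are stand-ins; optimizing W itself is the constrained problem the slide formulates.

```python
import numpy as np

def robust_coding_error(X, W, snr=1.0, sigma_n=1.0, seed=0):
    """Relative reconstruction error of x from r = Wx + n with the optimal linear decoder."""
    rng = np.random.default_rng(seed)
    Sx = np.cov(X)
    # enforce the per-unit SNR constraint <u_i^2>/sigma_n^2 = snr by rescaling each row of W
    u_var = np.einsum('ij,jk,ik->i', W, Sx, W)
    W = W * np.sqrt(snr * sigma_n ** 2 / u_var)[:, None]
    # optimal linear decoder for this encoder and noise level
    A = Sx @ W.T @ np.linalg.inv(sigma_n ** 2 * np.eye(W.shape[0]) + W @ Sx @ W.T)
    R = W @ X + sigma_n * rng.standard_normal((W.shape[0], X.shape[1]))
    X_hat = A @ R
    return np.mean(np.sum((X - X_hat) ** 2, axis=0)) / np.mean(np.sum(X ** 2, axis=0))

# Toy data: correlated 8-D Gaussian, 2x overcomplete random encoder.
rng = np.random.default_rng(1)
Z = rng.standard_normal((8, 20000))
M = np.tril(np.ones((8, 8)))          # mixing to introduce correlations
X = M @ Z
W0 = rng.standard_normal((16, 8))
print(round(robust_coding_error(X, W0, snr=1.0), 3))
```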
  • 92. Robust coding of natural images encoding neurons Add noise equivalent to 1 bit precision sensory input Original 1x efficient coding reconstruction (34% error) 1 bit precision NIPS 2007 Tutorial 92 Michael S. Lewicki ! Carnegie Mellon
  • 93. Robust coding of natural images encoding neurons Weights adapted for optimal robustness sensory input Original 1x robust coding reconstruction (3.8% error) 1 bit precision NIPS 2007 Tutorial 93 Michael S. Lewicki ! Carnegie Mellon
  • 94. Reconstruction improves by adding neurons encoding neurons Weights adapted for optimal robustness sensory input Original 8x robust coding reconstruction error: 0.6% 1 bit precision NIPS 2007 Tutorial 94 Michael S. Lewicki ! Carnegie Mellon
  • 95. Adding precision to 1x robust coding original images 1 bit: 12.5% error 2 bit: 3.1% error NIPS 2007 Tutorial 95 Michael S. Lewicki ! Carnegie Mellon
  • 96. Can derive a minimum theoretical average error bound: E = (1/N) Σ_{i=1..N} λ_i / ((M/N)·SNR + 1), where the λ_i are the eigenvalues of the data covariance Σ_x, N is the input dimensionality, and M is the number of coding units. On a 512×512 test image the learning algorithm achieves the theoretical lower bound (result / bound): 0.5x units: 19.9% / 20.3%; 1x: 12.5% / 12.4%; 8x: 2.0% / 2.0%. Balcan, Doi, and Lewicki, 2007; Balcan and Lewicki, 2007. NIPS 2007 Tutorial 96 Michael S. Lewicki ! Carnegie Mellon
  • 97. Sparseness localizes the vectors and increases coding efficiency. [panels: encoding vectors (W), decoding vectors (A), and normalized histograms of coefficient kurtosis] Robust coding: average kurtosis = 4.9; robust sparse coding: average kurtosis = 7.3; the average error is unchanged (12.5% in both cases). NIPS 2007 Tutorial 97 Michael S. Lewicki ! Carnegie Mellon
  • 98. Non-zero resource constraints localize weights. [panels: learned weights at 20, 10, 0, and −10 dB SNR, for the fovea and at 40 degrees eccentricity; magnitude and spatial-frequency profiles] The average error is also unchanged. NIPS 2007 Tutorial 98 Michael S. Lewicki ! Carnegie Mellon
  • 99. Theory: explains data from principles; requires idealization and abstraction. Principle: code signals accurately, efficiently, robustly; adapted to natural sensory environment. Idealization: cell response is linear, noise is additive; simple, additive resource constraints. Methodology: information theory, natural images; constrained optimization. Prediction: explains individual receptive fields; explains non-linear response; explains population organization. NIPS 2007 Tutorial 99 Michael S. Lewicki ! Carnegie Mellon
  • 100. What happens next? - - + - - Learned from Hubel, 1995 NIPS 2007 Tutorial 100 Michael S. Lewicki ! Carnegie Mellon
  • 101. What happens next? NIPS 2007 Tutorial 101 Michael S. Lewicki ! Carnegie Mellon
  • 102. Joint statistics of filter outputs show magnitude dependence. [figure: histograms of one filter’s response L2 conditioned on a second filter’s response L1 — the responses are roughly decorrelated but not statistically independent; the spread of L2 grows with the magnitude of L1] This form of dependence motivates the automatic gain control known as “divisive normalization”, which has been used to account for many nonlinear steady-state behaviors of neurons in primary visual cortex. From Schwartz and Simoncelli (2001), Nature Neuroscience 4(8). NIPS 2007 Tutorial 102 Michael S. Lewicki ! Carnegie Mellon
  • 103. Using sensory gain control to reduce redundancy. Schwartz and Simoncelli (2001). A divisive normalization model: a filter’s squared response is divided by a weighted sum of other filters’ squared responses plus a constant; the weights are adapted (by maximum likelihood) to minimize higher-order dependencies measured from natural images. NIPS 2007 Tutorial 103 Michael S. Lewicki ! Carnegie Mellon
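A small sketch of the divisive-normalization computation described here: each filter's squared response is divided by a constant plus a weighted sum of squared responses in its normalization pool. The pool weights and constant below are illustrative fixed values; in the model they are the parameters adapted to natural images.

```python
import numpy as np

def divisive_normalization(responses, weights, sigma2=0.1):
    """Normalized response R_i = L_i^2 / (sigma^2 + sum_j w_ij * L_j^2).

    responses: (n_filters, n_samples) linear filter outputs L.
    weights:   (n_filters, n_filters) nonnegative normalization-pool weights.
    """
    energy = responses ** 2
    pool = sigma2 + weights @ energy
    return energy / pool

# Illustrative: two filter outputs whose magnitudes share a common local contrast.
rng = np.random.default_rng(0)
scale = np.exp(rng.standard_normal(5000))        # shared local contrast
L = scale * rng.standard_normal((2, 5000))       # correlated magnitudes, decorrelated signs
w = np.full((2, 2), 0.5)
R = divisive_normalization(L, w)
print("corr of |L1|,|L2|:", round(np.corrcoef(np.abs(L))[0, 1], 2))
print("corr of R1,R2:    ", round(np.corrcoef(R)[0, 1], 2))
```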
  • 104. Model accounts for several non-linear response properties. [figure: mean firing rate vs. signal contrast at several mask contrasts, for a V1 cell and for the model; cell data from Cavanaugh et al., 2000] Schwartz and Simoncelli (2001) NIPS 2007 Tutorial 104 Michael S. Lewicki ! Carnegie Mellon
  • 105. What about image structure? • Bottom-up approaches focus on the non-linearity. • Our aim here is to focus on the computational problem: How do we learn the intrinsic dimensions of natural image structure? • Idea: characterize how the local image distribution changes. NIPS 2007 Tutorial 105 Michael S. Lewicki ! Carnegie Mellon
  • 106. Perceptual organization in natural scenes image of Kyoto, Japan from E. Doi NIPS 2007 Tutorial 106 Michael S. Lewicki ! Carnegie Mellon
  • 107. Palmer and Rock’s theory of perceptual organization Palmer and Rock, 1994 NIPS 2007 Tutorial 107 Michael S. Lewicki ! Carnegie Mellon
  • 108. Gestalt grouping principles from Palmer, 1999 NIPS 2007 Tutorial 108 Michael S. Lewicki ! Carnegie Mellon
  • 109. Malik and Perona’s (1991) model of texture segregation input image output of dark bar filters output of light bar filters Visual Features NIPS 2007 Tutorial 109 Michael S. Lewicki ! Carnegie Mellon
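A sketch in the spirit of the pipeline on this slide (simplified, not the original algorithm): filter the image with even-symmetric oriented bar filters, half-wave rectify into light-bar and dark-bar channels, pool the rectified responses locally, and look for spatial gradients in the pooled texture energy. Filter shapes, pooling, and the synthetic test image are stand-ins.

```python
import numpy as np
from scipy.ndimage import convolve, uniform_filter

def oriented_bar_filter(size=9, theta=0.0, freq=0.25):
    """Even-symmetric (cosine-phase), zero-mean oriented bar filter."""
    r = np.arange(size) - size // 2
    X, Y = np.meshgrid(r, r)
    u = X * np.cos(theta) + Y * np.sin(theta)
    g = np.exp(-(X ** 2 + Y ** 2) / (2 * (size / 4) ** 2)) * np.cos(2 * np.pi * freq * u)
    return g - g.mean()

def texture_energy(img, thetas=(0, np.pi / 4, np.pi / 2, 3 * np.pi / 4), pool=7):
    """Locally pooled light-bar and dark-bar responses for several orientations."""
    maps = []
    for th in thetas:
        resp = convolve(img, oriented_bar_filter(theta=th), mode="reflect")
        maps.append(uniform_filter(np.maximum(resp, 0), pool))   # light bars
        maps.append(uniform_filter(np.maximum(-resp, 0), pool))  # dark bars
    return np.stack(maps)

# Two abutting synthetic textures: vertical stripes on the left, horizontal on the right.
x = np.arange(64)
left = np.sin(2 * np.pi * 0.25 * x)[None, :] * np.ones((64, 1))
img = np.concatenate([left, left.T], axis=1)
E = texture_energy(img)
boundary = np.abs(np.diff(E, axis=2)).max(axis=0).mean(axis=0)
print(int(np.argmax(boundary)))   # largest texture-energy gradient should lie near column 64
```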
  • 110. Malik and Perona algorithm can find texture boundaries Painting by Gustav Klimt texture boundaries from Malik and Perona algorithm NIPS 2007 Tutorial 110 Michael S. Lewicki ! Carnegie Mellon