The 2012 ICSI / Berkeley
                          Location Estimation System
                              Jaeyoung Choi, Venkatesan Ekambaram,
                             Gerald Friedland and Kannan Ramchandran

                                     ICSI / UC Berkeley, USA
                                       October 4th, 2012




Thursday, October 4, 12                                                1
Agenda

                     • Baseline Approach
                      • Drawbacks
                     • Graphical Model Framework
                     • Result

Baseline Approach

                     • Investigate the ‘spatial variance’ of each feature:
                       • spatial variance is small: feature is likely
                         location-indicative
                       • spatial variance is large: feature is likely
                         not location-indicative
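
The slides do not say how spatial variance is computed. As a minimal sketch, assume it is the sum of the population variances of latitude and longitude over the training items that carry the tag. The function name and the coordinates below are hypothetical, chosen only for illustration.

```python
from statistics import pvariance

def spatial_variance(coords):
    """Sum of the population variances of latitude and longitude.

    `coords` is a list of (lat, lon) pairs in degrees. This definition is an
    assumption for illustration; the slides do not give the exact formula.
    Returns None when the tag has no training matches.
    """
    if not coords:
        return None
    lats = [lat for lat, _ in coords]
    lons = [lon for _, lon in coords]
    return pvariance(lats) + pvariance(lons)

# Hypothetical training coordinates, not taken from the dataset.
clustered = [(37.87, -122.26), (37.87, -122.25), (37.88, -122.26)]
scattered = [(37.87, -122.26), (51.51, -0.13), (-33.87, 151.21)]

print(spatial_variance(clustered) < spatial_variance(scattered))
```

Under this definition a tag whose matches cluster in one place scores far lower than a tag whose matches are spread across continents, which is the behaviour the bullets describe.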



Example
                     Tag                Matches in training set    Spatial variance
                     pavement                        2                      5.739
                     ucberkeley                      4                      0.132
                     berkeley                       14                     68.138
                     greek                           0                        N/A
                     greektheatre                    0                        N/A
                     spitonastranger                 0                        N/A
                     live                           91                   6453.109
                     video                        2967                   6735.844
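
One plausible use of these numbers, assumed here rather than stated on the slide, is to trust the matched tag with the smallest spatial variance. The sketch below uses the values from the table above; the function name is hypothetical.

```python
# (tag, matches in training set, spatial variance) copied from the table above;
# None stands for "N/A".
TAG_STATS = [
    ("pavement", 2, 5.739),
    ("ucberkeley", 4, 0.132),
    ("berkeley", 14, 68.138),
    ("greek", 0, None),
    ("greektheatre", 0, None),
    ("spitonastranger", 0, None),
    ("live", 91, 6453.109),
    ("video", 2967, 6735.844),
]

def most_indicative_tag(stats):
    """Return the matched tag with the smallest spatial variance, or None."""
    matched = [(var, tag) for tag, count, var in stats
               if count > 0 and var is not None]
    if not matched:
        return None
    return min(matched)[1]

print(most_indicative_tag(TAG_STATS))  # ucberkeley
```

Tags without any training match ("greek", "greektheatre", "spitonastranger") cannot contribute at all, which is the sparsity problem raised on the next slide.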




Problem:
       Sparsity coming from biased dataset




The effect of sparsity
  [Figure: bar chart. x-axis: distance error (e) between ground truth and estimation [km]; y-axis: percentage [%], 0 to 60; legend: 100, 400, 1600, 6400, >6400]

  * Test video from a dense area has higher chance of being
    estimated with lower error in distance.

Geo-tagging: an estimation-theoretic viewpoint

  Observations:
    Images: [four example images]
    Tags:   {berkeley, sathergate, campanile}, {berkeley, haas}, {campanile}, {campanile, haas}
            $\{t_1^k\}$, $\{t_2^k\}$, $\{t_3^k\}$, $\{t_4^k\}$

  Estimate:
    Geo-locations: $x_1$, $x_2$, $x_3$, $x_4$
Interpreting traditional approaches

  Locations are random variables: $\{x_1, x_2, \ldots, x_N\}$

  Traditional approaches estimate the probability of location given tags:
    $p(x_i \mid \{t_i^k\}) \propto \prod_k p(x_i \mid t_i^k)$
  where $p(x_i \mid t_i^k)$ is obtained from the training set.

  Example: the distribution for the tag “washington” is depicted here [figure].

  Location estimate:
    $\int x_i \, p(x_i \mid \{t_i^k\}) \, dx_i$
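
The traditional estimate can be sketched on a discrete grid: multiply the per-tag distributions, renormalise, and take the mean. This is a minimal illustration, assuming a one-dimensional grid of locations and made-up tag distributions; the function names are hypothetical and nothing here comes from the actual system.

```python
def combine_tag_distributions(per_tag):
    """Multiply per-tag distributions cell by cell and renormalise."""
    product = [1.0] * len(per_tag[0])
    for dist in per_tag:
        product = [p * q for p, q in zip(product, dist)]
    total = sum(product)
    if total == 0:
        raise ValueError("tag distributions have no common support")
    return [p / total for p in product]

def posterior_mean(grid, posterior):
    """Discrete counterpart of the integral of x * p(x | tags) dx."""
    return sum(x * p for x, p in zip(grid, posterior))

# Hypothetical 1-D grid of locations and two made-up tag distributions.
grid = [0.0, 1.0, 2.0, 3.0]
p_tag_a = [0.1, 0.4, 0.4, 0.1]
p_tag_b = [0.0, 0.5, 0.5, 0.0]

posterior = combine_tag_distributions([p_tag_a, p_tag_b])
estimate = posterior_mean(grid, posterior)
print(estimate)
```

The product only uses tags that have a distribution in the training set, so a test item whose tags are all unseen has nothing to multiply.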


Drawbacks
     Data sparsity:
       Not all tags in the test set are available in the training set.
       Hence the estimate of $p(x_i \mid t_i^k)$ can be bad.

     Sub-optimality:
       The approaches are suboptimal given the data.

     What we ideally want:
       $p(x_1, x_2, \ldots, x_N \mid \{t_1^k\}, \{t_2^k\}, \ldots, \{t_N^k\})$
     Mean of the above distribution gives the best estimate of the locations,
     i.e. for each image we want
       $p(x_i \mid \{t_1^k\}, \{t_2^k\}, \ldots, \{t_N^k\})$
     Traditional algorithms only give:
       $p(x_i \mid \{t_i^k\})$

Bayesian graphical framework

  [Figure: graph over four images tagged {berkeley, sathergate, campanile}, {berkeley, haas}, {campanile} and {campanile, haas}]

  Node: Geolocation of the image; nodes carry $p(x_i \mid \{t_i^k\})$ and $p(x_j \mid \{t_j^k\})$
  Edge: Correlated locations (e.g. common tag)
  Edge potential: Strength of an edge (e.g. posterior distribution of locations given common tags),
    $p(x_i, x_j \mid \{t_i^k\} \cap \{t_j^k\})$
Cooperative geo-tagging

  Intuition: Images in the training set having common tags have
  correlated geo-locations, captured by the joint distribution.

  Joint probability modeling:
    $p(x_1, x_2, \ldots, x_N \mid \{t_1^k\}, \{t_2^k\}, \ldots, \{t_N^k\}) \propto \prod_i p(x_i \mid \{t_i^k\}) \prod_{(i,j)} p(x_i, x_j \mid \{t_i^k\} \cap \{t_j^k\})$
  The second product is the pairwise distribution given at least one common tag.

  $p(x_i \mid \{t_i^k\})$ is obtained from the training set as before.

  $p(x_i, x_j \mid \{t_i^k\} \cap \{t_j^k\})$ is modeled as an indicator function $I(x_i = x_j)$
  if the common tag has low spatial variance or occurs infrequently,
  e.g. if the common tag is “haas”, it is very likely the locations are the same.

  Question: How to estimate the optimal marginal distribution
    $p(x_i \mid \{t_1^k\}, \{t_2^k\}, \ldots, \{t_N^k\})$?
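
The rule for when an edge gets the indicator potential can be sketched as a simple test on the common tag. The slide gives no numeric cut-offs, so both thresholds below are hypothetical, as are the function and parameter names.

```python
def use_indicator_edge(spatial_variance, occurrences,
                       max_variance=10.0, max_occurrences=5):
    """Decide whether a common tag justifies the constraint I(x_i = x_j).

    The indicator is used when the tag has low spatial variance or occurs
    infrequently. Both thresholds are assumptions for illustration.
    `spatial_variance` may be None when no variance is available for the tag.
    """
    low_variance = (spatial_variance is not None
                    and spatial_variance <= max_variance)
    infrequent = occurrences <= max_occurrences
    return low_variance or infrequent

print(use_indicator_edge(0.132, 4))        # True
print(use_indicator_edge(6735.844, 2967))  # False
```

With these example thresholds, the statistics listed earlier for "ucberkeley" would tie two images together, while those for "video" would not.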
Belief propagation updates

  Iterative algorithm to approximate the posterior distribution
    $p(x_i \mid \{t_1^k\}, \{t_2^k\}, \ldots, \{t_N^k\})$

  Gaussian modeling: $p(x_i \mid \{t_i^k\}) \sim \mathcal{N}(\mu_i, \sigma_i^2)$

  At iteration 0 each node calculates $(\mu_i, \sigma_i^2)$.

  At iteration t each node updates its location as a weighted mean of its
  previous location and that of its neighbors:
    $\mu_i^{(t)} = \dfrac{\frac{1}{(\sigma_i^{(t)})^2}\,\mu_i^{(t-1)} + \sum_{k \in N(i)} \frac{1}{(\sigma_k^{(t)})^2}\,\mu_k^{(t)}}{(\sigma_i^{(t)})^2}$
    $\dfrac{1}{(\sigma_i^{(t)})^2} = \dfrac{1}{(\sigma_i^{(t-1)})^2} + \sum_{k \in N(i)} \dfrac{1}{(\sigma_k^{(t-1)})^2}$

  The weights reflect the confidence in the measurements,
  i.e. the higher the spatial variance, the lower the weight.
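
The weighted-mean idea can be sketched as one synchronous update over all nodes. This is an illustration under stated assumptions, not the system's implementation: locations are scalars rather than latitude/longitude pairs, every term on the right-hand side uses the previous iteration's values, and the weighted sum is normalised by the total precision so that the weights sum to one. The function name and the example values are hypothetical.

```python
def bp_iteration(means, variances, neighbors):
    """One synchronous update of every node's Gaussian belief.

    Each node combines its previous belief with its neighbours' previous
    beliefs, weighting each mean by its precision (1 / variance).
    """
    new_means, new_variances = [], []
    for i in range(len(means)):
        group = [i] + list(neighbors[i])
        precision = sum(1.0 / variances[k] for k in group)
        weighted = sum(means[k] / variances[k] for k in group)
        new_means.append(weighted / precision)
        new_variances.append(1.0 / precision)
    return new_means, new_variances

# Hypothetical 1-D example: node 0 is confident, node 1 is not.
means = [10.0, 50.0]
variances = [1.0, 100.0]
neighbors = {0: [1], 1: [0]}

updated_means, updated_variances = bp_iteration(means, variances, neighbors)
print(updated_means, updated_variances)
```

In this example the uncertain node is pulled almost all the way to its confident neighbour, which matches the remark that a higher spatial variance means a lower weight.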
Belief propagation

  [Figure: graph of nodes labeled $(\mu_1, \sigma_1^2)$, $(\mu_2, \sigma_2^2)$, $(\mu_3, \sigma_3^2)$: posterior mean and variance assuming Gaussian beliefs]

  Audio-visual features are incorporated in modeling the edge and node potentials.

Incorporating Audio-Visual features
        • GIST features are extracted for the images.
        • MFCC features are extracted for the audio.
        • These are now incorporated into the node and edge potentials as
          exponential distributions.

            $p(x_i, x_j \mid a_i, a_j) \propto \exp\left(-\dfrac{\|x_i - x_j\|}{\|a_i - a_j\|}\right)$

          $a_i$ are the audio features associated with image i.
        The intuition is that the closer the audio features are, the higher the
        probability that the geo-locations are closer.
        Similarly, this can be included in the node potentials, as well as for
        the visual features.
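
The audio edge potential can be sketched directly from the formula. This is a minimal illustration, assuming Euclidean norms for both the locations and the feature vectors; the function name, the `eps` guard and the example vectors are hypothetical and not part of the slide.

```python
import math

def audio_edge_potential(x_i, x_j, a_i, a_j, eps=1e-9):
    """Unnormalised exp(-||x_i - x_j|| / ||a_i - a_j||).

    `eps` guards against division by zero when the audio features coincide;
    it is an addition for this sketch, not part of the slide's formula.
    """
    geo_dist = math.dist(x_i, x_j)
    audio_dist = math.dist(a_i, a_j)
    return math.exp(-geo_dist / (audio_dist + eps))

# Two images one unit apart, with similar and with different audio features.
similar = audio_edge_potential((0.0, 0.0), (1.0, 0.0), (0.2, 0.1), (0.3, 0.1))
different = audio_edge_potential((0.0, 0.0), (1.0, 0.0), (0.2, 0.1), (2.2, 0.1))
print(similar < different)
```

For a fixed geographic separation, similar audio features give a much smaller potential than dissimilar ones, so the model strongly prefers placing similar-sounding items close together.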

Result
        • Percentage of test videos (out of 4182 videos) correctly estimated under
          distances in the top row from the groundtruth location.

          [Table: percentage of test videos per run and distance]

              – run1 - baseline approach without using gazetteer
              – run2 - graphical model based approach with gazetteer
              – run3 - baseline approach with gazetteer
              – run4 - k-NN with gist visual feature

        • Graphical model approach with gazetteer outperforms baseline approaches in
          range above 1km.
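
The evaluation measure can be sketched as the share of videos whose estimate falls within each distance of the ground truth. This is an illustration only: the slide does not say how distance is computed, so the haversine great-circle distance is an assumption, and the thresholds, coordinates and function names are hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, a common approximation

def haversine_km(p, q):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (p[0], p[1], q[0], q[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))

def percent_within(estimates, truths, thresholds_km):
    """Percentage of videos whose error is at most each threshold."""
    errors = [haversine_km(e, t) for e, t in zip(estimates, truths)]
    return {thr: 100.0 * sum(err <= thr for err in errors) / len(errors)
            for thr in thresholds_km}

# Hypothetical estimates and ground truth for three videos.
truths = [(37.87, -122.26), (37.87, -122.26), (37.87, -122.26)]
estimates = [(37.87, -122.26), (37.90, -122.26), (40.71, -74.01)]
scores = percent_within(estimates, truths, [1, 10, 100])
print(scores)
```

Because the thresholds are cumulative, the percentage can only stay equal or grow as the distance increases.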

Conclusion
        • graphical model framework can achieve
          performance improvement over baseline
          approach by incorporating results from test data
        • various issues remain to be explored
              – the modeling of edge potential
                     • text: hard threshold (current) --> soft
                     • visual/audio features
              – assumption of conditional independence of location
                distribution given multiple tags

Thank You!
                           Questions?
                           http://mmle.icsi.berkeley.edu

                                 Work together with:
                           Venkatesan Ekambaram, Kannan
                              Ramchandran, Giulia Fanti
                          Howard Lei, Adam Janin, and Gerald
                                       Friedland

The MediaEval 2012 Affect Task: Violent Scenes DetectioThe MediaEval 2012 Affect Task: Violent Scenes Detectio
The MediaEval 2012 Affect Task: Violent Scenes Detectio
 
NII, Japan at MediaEval 2012 Violent Scenes Detection Affect Task
NII, Japan at MediaEval 2012 Violent Scenes Detection Affect TaskNII, Japan at MediaEval 2012 Violent Scenes Detection Affect Task
NII, Japan at MediaEval 2012 Violent Scenes Detection Affect Task
 
LIG at MediaEval 2012 affect task: use of a generic method
LIG at MediaEval 2012 affect task: use of a generic methodLIG at MediaEval 2012 affect task: use of a generic method
LIG at MediaEval 2012 affect task: use of a generic method
 
Violence Detection in Video by Large Scale Multi-Scale Local Binary Pattern D...
Violence Detection in Video by Large Scale Multi-Scale Local Binary Pattern D...Violence Detection in Video by Large Scale Multi-Scale Local Binary Pattern D...
Violence Detection in Video by Large Scale Multi-Scale Local Binary Pattern D...
 

Último

+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
?#DUbAI#??##{{(☎️+971_581248768%)**%*]'#abortion pills for sale in dubai@
 

Último (20)

2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CV
 
GenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdfGenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdf
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed texts
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
HTML Injection Attacks: Impact and Mitigation Strategies
HTML Injection Attacks: Impact and Mitigation StrategiesHTML Injection Attacks: Impact and Mitigation Strategies
HTML Injection Attacks: Impact and Mitigation Strategies
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
Advantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessAdvantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your Business
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...
 

The 2012 ICSI/Berkeley Video Location Estimation System

  • 1. The 2012 ICSI / Berkeley Location Estimation System. Jaeyoung Choi, Venkatesan Ekambaram, Gerald Friedland and Kannan Ramchandran. ICSI / UC Berkeley, USA. Thursday, October 4th, 2012
  • 2. Agenda • Baseline Approach • Drawbacks • Graphical Model Framework • Result
  • 3. Baseline Approach • Investigate the ‘spatial variance’ of a feature: • if the spatial variance is small, the feature is likely location-indicative • if the spatial variance is large, the feature is likely not location-indicative
  • 4. Example
      Tag               Matches in training set   Spatial variance
      pavement                     2                    5.739
      ucberkeley                   4                    0.132
      berkeley                    14                   68.138
      greek                        0                     N/A
      greektheatre                 0                     N/A
      spitonastranger              0                     N/A
      live                        91                 6453.109
      video                     2967                 6735.844
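
The slides do not say how spatial variance is computed, so the following is only a minimal sketch under an assumed definition: the sum of the population variances of latitude and longitude over all training items that carry a tag. The function name `spatial_variance`, the `min_matches` parameter and the input layout (a list of `(tags, (lat, lon))` pairs) are hypothetical.

```python
from collections import defaultdict
from statistics import pvariance


def spatial_variance(training, min_matches=2):
    """Map each tag to (number of matches, spatial variance or None).

    Assumed definition (not taken from the slides): variance of latitude
    plus variance of longitude over the training items carrying the tag.
    """
    coords = defaultdict(list)
    for tags, (lat, lon) in training:
        for tag in set(tags):  # count each tag once per item
            coords[tag].append((lat, lon))

    result = {}
    for tag, points in coords.items():
        if len(points) < min_matches:
            # Too few matches to say anything about the spread.
            result[tag] = (len(points), None)
            continue
        lats = [p[0] for p in points]
        lons = [p[1] for p in points]
        result[tag] = (len(points), pvariance(lats) + pvariance(lons))
    return result


training = [
    (["ucberkeley", "video"], (37.8719, -122.2585)),
    (["ucberkeley"], (37.8725, -122.2600)),
    (["video"], (48.8566, 2.3522)),
    (["greek"], (37.8737, -122.2546)),
]
stats = spatial_variance(training)
```

Under this definition a tag such as "ucberkeley" that only occurs around one place gets a small value, while a generic tag such as "video" that occurs worldwide gets a large one, which matches the ordering in the example table.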
  • 5. Problem: Sparsity coming from a biased dataset
  • 6. The effect of sparsity
      [Figure: bar chart. y-axis: percentage [%]; x-axis: distance error (e) between ground truth and estimation [km]; legend series: 100, 400, 1600, 6400, >6400]
      * A test video from a dense area has a higher chance of being estimated with a lower distance error.
  • 7. Geo-tagging: an estimation-theoretic viewpoint
      Observations: images with tag sets {berkeley, sathergate, campanile}, {berkeley, haas}, {campanile}, {campanile, haas}, denoted $\{t_1^k\}, \{t_2^k\}, \{t_3^k\}, \{t_4^k\}$
      Estimate: geo-locations $x_1, x_2, x_3, x_4$
  • 11. Interpreting traditional approaches
      Locations are random variables: $\{x_1, x_2, \ldots, x_N\}$
      Traditional approaches estimate the probability of location given tags: $p(x_i \mid \{t_i^k\}) \propto \prod_k p(x_i \mid t_i^k)$, where $p(x_i \mid t_i^k)$ is obtained from the training set
      Example: the distribution for the tag “washington” is depicted here [Figure: distribution for the tag “washington”]
      Location estimate: $\hat{x}_i = \int x_i \, p(x_i \mid \{t_i^k\}) \, dx_i$
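
The per-tag product and the posterior-mean estimate can be illustrated on a discrete set of candidate locations. This is a sketch, not the authors' implementation: the candidate set (here the training locations themselves), the additive `smoothing` constant, and the function names `fit_tag_distributions` and `estimate_location` are all assumptions. Averaging latitude and longitude directly also ignores wrap-around at the date line.

```python
from collections import Counter, defaultdict


def fit_tag_distributions(training):
    """Count how often each tag occurs at each discrete training location."""
    counts = defaultdict(Counter)
    for tags, location in training:
        for tag in set(tags):
            counts[tag][location] += 1
    return counts


def estimate_location(tags, counts, locations, smoothing=1e-3):
    """Posterior mean under p(x | tags) proportional to prod_k p(x | t^k)."""
    weights = {}
    for x in locations:
        w = 1.0
        for tag in tags:
            if tag not in counts:
                continue  # tag unseen in training: contributes no information
            total = sum(counts[tag].values())
            # Smoothed estimate of p(x | tag) from training counts.
            w *= (counts[tag][x] + smoothing) / (total + smoothing * len(locations))
        weights[x] = w

    norm = sum(weights.values())
    lat = sum(x[0] * w for x, w in weights.items()) / norm
    lon = sum(x[1] * w for x, w in weights.items()) / norm
    return lat, lon


training = [
    (["campanile", "berkeley"], (37.872, -122.258)),
    (["berkeley"], (37.870, -122.260)),
    (["washington"], (38.9, -77.0)),
]
locations = sorted({loc for _, loc in training})
counts = fit_tag_distributions(training)
estimate = estimate_location(["campanile"], counts, locations)
```

If none of a test item's tags were seen in training, every candidate keeps weight 1 and the estimate falls back to the plain average of the candidates, which is one way to see the sparsity problem discussed on the next slide.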
  • 12. Drawbacks
      Data sparsity: not all tags in the test set are available in the training set, hence the estimate of $p(x_i \mid t_i^k)$ can be bad.
      Sub-optimality: the approaches are suboptimal given the data.
      What we ideally want: $p(x_1, x_2, \ldots, x_N \mid \{t_1^k\}, \{t_2^k\}, \ldots, \{t_N^k\})$
      The mean of the above distribution gives the best estimate of the locations, i.e. for each image we want $p(x_i \mid \{t_1^k\}, \{t_2^k\}, \ldots, \{t_N^k\})$
      Traditional algorithms only give: $p(x_i \mid \{t_i^k\})$
  • 13. Bayesian graphical framework
      [Figure: graph of four image nodes with tag sets {berkeley, sathergate, campanile}, {berkeley, haas}, {campanile}, {campanile, haas}]
      Node: geo-location of the image, with node potentials $p(x_i \mid \{t_i^k\})$ and $p(x_j \mid \{t_j^k\})$
      Edge: correlated locations (e.g. common tag)
      Edge potential: strength of an edge (e.g. posterior distribution of locations given common tags), $p(x_i, x_j \mid \{t_i^k\} \cap \{t_j^k\})$
  • 17. Cooperative geo-tagging
      Intuition: images in the training set having common tags have correlated geo-locations, captured by the joint distribution.
      Joint probability modeling: $p(x_1, x_2, \ldots, x_N \mid \{t_1^k\}, \{t_2^k\}, \ldots, \{t_N^k\}) \propto \prod_i p(x_i \mid \{t_i^k\}) \prod_{(i,j)} p(x_i, x_j \mid \{t_i^k\} \cap \{t_j^k\})$, where the second product runs over pairwise distributions given at least one common tag.
      $p(x_i \mid \{t_i^k\})$ is obtained from the training set as before.
      $p(x_i, x_j \mid \{t_i^k\} \cap \{t_j^k\})$ is modeled as an indicator function $I(x_i = x_j)$ if the common tag has low spatial variance or occurs infrequently; e.g. if the common tag is “haas”, it is very likely that the locations are the same.
      Question: how to estimate the optimal marginal distribution $p(x_i \mid \{t_1^k\}, \{t_2^k\}, \ldots, \{t_N^k\})$?
  • 18. Belief propagation updates
      Iterative algorithm to approximate the posterior distribution $p(x_i \mid \{t_1^k\}, \{t_2^k\}, \ldots, \{t_N^k\})$
      Gaussian modeling: $p(x_i \mid \{t_i^k\}) \sim \mathcal{N}(\mu_i, \sigma_i^2)$
      At iteration 0 each node calculates $(\mu_i, \sigma_i^2)$
      At iteration t each node updates its location as a weighted mean of its previous location and that of its neighbors:
      $\mu_i^{(t)} = (\sigma_i^{(t)})^2 \left[ \frac{\mu_i^{(t-1)}}{(\sigma_i^{(t-1)})^2} + \sum_{k \in N(i)} \frac{\mu_k^{(t-1)}}{(\sigma_k^{(t-1)})^2} \right]$
      $\frac{1}{(\sigma_i^{(t)})^2} = \frac{1}{(\sigma_i^{(t-1)})^2} + \sum_{k \in N(i)} \frac{1}{(\sigma_k^{(t-1)})^2}$
      The weights reflect the confidence in the measurements, i.e. the higher the spatial variance, the lower the weight.
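
The two update rules can be sketched as a synchronous, precision-weighted averaging loop. This is an illustration under assumptions rather than the system's code: each node holds a `(lat, lon)` mean with a single scalar variance, all nodes are updated at once from the previous iteration's values, and the names `belief_propagation`, `neighbors` and `iterations` are hypothetical.

```python
def belief_propagation(means, variances, neighbors, iterations=5):
    """Precision-weighted mean/variance updates on a graph of Gaussian beliefs.

    means:      list of (lat, lon) tuples, one per node
    variances:  list of scalar variances, one per node
    neighbors:  dict mapping node index -> list of neighbor indices
    """
    mu = [tuple(m) for m in means]
    var = list(variances)
    for _ in range(iterations):
        new_mu, new_var = [], []
        for i in range(len(mu)):
            # 1/sigma_i^2 at step t: own precision plus the neighbors' precisions.
            precision = 1.0 / var[i] + sum(1.0 / var[k] for k in neighbors[i])
            # Precision-weighted sum of the node's own mean and its neighbors' means.
            weighted = tuple(
                mu[i][d] / var[i] + sum(mu[k][d] / var[k] for k in neighbors[i])
                for d in range(2)
            )
            new_var.append(1.0 / precision)
            new_mu.append(tuple(w / precision for w in weighted))
        mu, var = new_mu, new_var  # synchronous update from step t-1 values
    return mu, var


# A confident node (variance 1) linked to an uncertain one (variance 100).
mu, var = belief_propagation(
    means=[(0.0, 0.0), (10.0, 10.0)],
    variances=[1.0, 100.0],
    neighbors={0: [1], 1: [0]},
    iterations=1,
)
```

After one iteration the uncertain node is pulled almost entirely to the confident node's location, which is the behavior the slide describes: a neighbor with high spatial variance receives a low weight.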
  • 19. Belief propagation
      [Figure: graph whose nodes carry the beliefs $(\mu_1, \sigma_1^2)$, $(\mu_2, \sigma_2^2)$, $(\mu_3, \sigma_3^2)$]
      Posterior mean and variance assuming Gaussian beliefs
      Audio-visual features are incorporated in modeling the edge and node potentials
  • 20. Incorporating audio-visual features
      • GIST features are extracted for the images.
      • MFCC features are extracted for the audio.
      • These are now incorporated into the node and edge potentials as exponential distributions: $p(x_i, x_j \mid a_i, a_j) \propto \exp\left(-\frac{\|x_i - x_j\|}{\|a_i - a_j\|}\right)$, where $a_i$ are the audio features associated with image $i$.
      The intuition is that the closer the audio features are, the higher the probability that the geo-locations are closer.
      Similarly, this can be included in the node potentials as well as for the visual features.
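
A small sketch of the exponential edge potential may make the intuition concrete. It assumes plain Euclidean distance for both the locations and the audio feature vectors, and adds a hypothetical `eps` term to avoid division by zero when two feature vectors are identical; neither choice comes from the slides.

```python
import math


def edge_potential(x_i, x_j, a_i, a_j, eps=1e-9):
    """Unnormalized edge potential exp(-||x_i - x_j|| / ||a_i - a_j||)."""
    geo_distance = math.dist(x_i, x_j)    # distance between candidate locations
    audio_distance = math.dist(a_i, a_j)  # distance between audio feature vectors
    return math.exp(-geo_distance / (audio_distance + eps))


# Same pair of locations, scored with similar and with dissimilar audio features.
similar = edge_potential((0.0, 0.0), (1.0, 1.0), (0.1, 0.2), (0.1, 0.25))
dissimilar = edge_potential((0.0, 0.0), (1.0, 1.0), (0.1, 0.2), (0.9, 0.9))
```

When the audio features are close, the potential decays quickly with geographic distance, so far-apart locations are penalized heavily; when the features differ, the potential is flatter and constrains the two locations much less.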
  • 21. Result
      • Percentage of test videos (out of 4182 videos) correctly estimated under the distances in the top row from the ground-truth location.
        – run1: baseline approach without using a gazetteer
        – run2: graphical model based approach with a gazetteer
        – run3: baseline approach with a gazetteer
        – run4: k-NN with GIST visual feature
      • The graphical model approach with a gazetteer outperforms the baseline approaches in the range above 1 km.
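
The evaluation measure described above, the percentage of test videos whose estimate falls within a given distance of the ground truth, could be computed roughly as follows. The great-circle (haversine) distance, the Earth radius of 6371 km, the threshold values and the function names are assumptions for illustration; the slide does not list the actual thresholds or results.

```python
import math


def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))


def accuracy_within(estimates, ground_truth, thresholds_km=(1, 10, 100, 1000, 10000)):
    """Percentage of items whose distance error is at most each threshold."""
    errors = [haversine_km(e, g) for e, g in zip(estimates, ground_truth)]
    return {
        t: 100.0 * sum(err <= t for err in errors) / len(errors)
        for t in thresholds_km
    }


# Two toy test items: one exact hit and one estimate a quarter of the globe away.
report = accuracy_within(
    estimates=[(37.8719, -122.2585), (0.0, 0.0)],
    ground_truth=[(37.8719, -122.2585), (0.0, 90.0)],
)
```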
  • 22. Conclusion
      • The graphical model framework can achieve a performance improvement over the baseline approach by incorporating results from test data
      • Various issues remain to be explored:
        – the modeling of the edge potential
          • text: hard threshold (current) --> soft
          • visual/audio features
        – the assumption of conditional independence of the location distribution given multiple tags
  • 23. Thank You! Questions? http://mmle.icsi.berkeley.edu Work together with: Venkatesan Ekambaram, Kannan Ramchandran, Giulia Fanti, Howard Lei, Adam Janin, and Gerald Friedland