Improving Horn and Schunck’s Optical Flow Algorithm

                                           Sylvain Lobry


                                 Technical Report no 1202, July 2012, revision 2325



Computing the optical flow of a video has many applications, including motion estimation, and it can serve
as a first step towards video inpainting. The optical flow equation cannot be solved as is due to the aperture
problem (two unknowns for a single equation). One set of algorithms tries to solve this equation based on a
global strategy, that is, by minimizing variations in the flow. The first of them is Horn-Schunck’s method.
This algorithm, even though it works in many cases, has several drawbacks, including being slow and unable
to find large displacements. In order to solve those problems, many strategies have been developed. We
present these strategies and analyze the benefits of a multi-layer strategy applied to Horn-Schunck.
Keywords
Optical flow, Horn-Schunck’s algorithm, multi-layer approach, multiscale




                         Laboratoire de Recherche et Développement de l’Epita
                     14-16, rue Voltaire – F-94276 Le Kremlin-Bicêtre cedex – France
                              Tél. +33 1 53 14 59 47 – Fax. +33 1 53 14 59 22
                         lrde@lrde.epita.fr – http://www.lrde.epita.fr/


Copying this document
Copyright © 2012 LRDE.
  Permission is granted to copy, distribute and/or modify this document under the terms of
the GNU Free Documentation License, Version 1.2 or any later version published by the Free
Software Foundation; with the Invariant Sections being just “Copying this document”, no Front-
Cover Texts, and no Back-Cover Texts.
  A copy of the license is provided in the file COPYING.DOC.
Contents

1 Definition of the optical flow
  1.1 Basic equations
  1.2 The aperture problem

2 Overview of existing methods
  2.1 Local methods
  2.2 Global methods
      2.2.1 Basic idea
      2.2.2 Further improvements

3 Implementation of a method
  3.1 Horn-Schunck’s method
  3.2 Coarse-to-fine version
      3.2.1 Objectives
      3.2.2 A generic way to implement pyramidal representation

4 Results
  4.1 Methodology
      4.1.1 Representation of the optical flow
      4.1.2 Measures
  4.2 Results
  4.3 Analysis of the results

5 Bibliography
Acknowledgments

The author thanks Thierry Géraud, Guillaume Lazzara, Coddy Levi and Roland Levillain.
Introduction

The optical flow is defined as the apparent motion of brightness patterns in an image. Computing
it leads to many applications, including video compression and motion estimation. It is also
the first step of many video inpainting algorithms, which is why we want to work on it.

  The LRDE1 has taken part in the “Terra Rush” project, which has been chosen as one of
the 18 projects of the “Investissement d’avenir”2 program sponsored by the Government of
France.
  In this project, one of our goals is to remove subtitles embedded in videos. This can be
done using video inpainting, a subject developed by Levi (2012). Having the optical flow computed
over the image to be inpainted can be useful to an inpainting algorithm if well exploited.

   Even though the first attempt at computing the optical flow was made in 1981 by Lucas
and Kanade (1981), the subject is still the object of many recent studies, leading to several
improvements. This report aims to give an overview of the state of the art in this domain and to
provide a fast implementation of one of these methods.

   In this technical report, we will first give a formal definition of the problem. The existing
methods are then overviewed and classified. We review one of those algorithms in its basic
version and in a coarse-to-fine adaptation. Finally, an evaluation method is proposed and applied
to our implementation.




  1 http://lrde.epita.fr
  2 http://investissement-avenir.gouvernement.fr/
Chapter 1

Definition of the optical flow

The first definition of the optical flow given in the introduction can be easily formalized into
mathematical equations. Yet, we will see in this chapter the reasons that make this problem
ill-posed.


1.1     Basic equations
In the following, we will only handle the 2D case with 1D values (i.e. gray-level images).

   Let I(x, y, t) be the value of the (x, y) pixel at frame t and (u(x, y, t), v(x, y, t)), also denoted
(u, v), the flow at the (x, y) pixel at frame t.
   The main idea of most algorithms for computing optical flow is to use the Brightness
Constancy assumption: from one frame to the next, the brightness of a pixel does not change as it
moves along the flow. This can be written as:

$$I(x, y, t) = I(x + u, y + v, t + 1) \tag{1.1}$$

Equation 1.1 can be approximated with a Taylor series as:

$$I(x, y, t) \approx I(x, y, t) + u\frac{\partial I}{\partial x} + v\frac{\partial I}{\partial y} + \frac{\partial I}{\partial t} \tag{1.2}$$

Introducing the abbreviations $E_x = \frac{\partial I}{\partial x}$, $E_y = \frac{\partial I}{\partial y}$ and $E_t = \frac{\partial I}{\partial t}$, the following holds:

$$uE_x + vE_y + E_t \approx 0 \tag{1.3}$$


1.2 The aperture problem
Equation 1.3 is often denoted as the Optical Flow Constraint. As we can see, we have a single
equation with two unknowns. This leads to an ill-posed problem, often called the Aperture
Problem, as illustrated by Figure 1.1.

   To compute the optical flow from Equation 1.3, we must add additional constraints to obtain
a system with more equations. As illustrated by Figure 1.1, the optical flow cannot always be
determined with certainty by a human being. However, most of the time, humans have a good
intuition of the optical flow in a real scene by using additional information such as the context.

                           (a) First frame.            (b) Second frame.

Figure 1.1: The Aperture Problem: we cannot tell in which way the line is moving with only
those two frames.

   Finding the constraints that are the most likely to replicate this behaviour is the main
challenge addressed by articles about optical flow. Chapter 2 presents some of these methods.
Chapter 2

Overview of existing methods

Methods for computing the optical flow can be classified into two categories. The first one is
known as local methods, whereas the other takes a global approach to the problem.


2.1    Local methods
Local methods are based on the approximation that the flow does not vary inside small regions
of an image. The first method using this assumption is from Lucas and Kanade (1981).
  Following this simple idea, we can obtain the following equations, given a window of size
$n \times n$ and denoting $p_k$ the $k$-th pixel within this window:

$$E_x(p_1)u + E_y(p_1)v + E_t(p_1) = 0$$
$$E_x(p_2)u + E_y(p_2)v + E_t(p_2) = 0$$
$$\vdots$$
$$E_x(p_{n^2})u + E_y(p_{n^2})v + E_t(p_{n^2}) = 0$$


   Thus, this assumption allows us to have $n^2$ equations for 2 unknowns, giving a simple
(overdetermined) system to solve, typically in the least-squares sense.
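
To make this concrete, the following standalone C++ sketch solves the overdetermined system in the least-squares sense by accumulating the 2×2 normal equations over one window. It is only an illustration of the principle (a hypothetical lucas_kanade_window helper, not part of the report's Olena-based code):

```cpp
#include <array>
#include <cmath>
#include <cstddef>
#include <optional>
#include <vector>

// Solve the window's overdetermined system in the least-squares sense.
// Ex, Ey, Et hold the derivatives of the n*n window pixels, flattened.
// Returns the flow (u, v), or nothing when the 2x2 system is singular
// (the aperture problem: the gradient direction is constant).
std::optional<std::array<double, 2>>
lucas_kanade_window(const std::vector<double>& Ex,
                    const std::vector<double>& Ey,
                    const std::vector<double>& Et)
{
  // Accumulate the normal equations A^T A (u, v)^T = -A^T Et,
  // where the rows of A are (Ex(pk), Ey(pk)).
  double sxx = 0, sxy = 0, syy = 0, sxt = 0, syt = 0;
  for (std::size_t k = 0; k < Ex.size(); ++k)
  {
    sxx += Ex[k] * Ex[k];
    sxy += Ex[k] * Ey[k];
    syy += Ey[k] * Ey[k];
    sxt += Ex[k] * Et[k];
    syt += Ey[k] * Et[k];
  }
  const double det = sxx * syy - sxy * sxy;
  if (std::abs(det) < 1e-12)
    return std::nullopt;
  // Cramer's rule on the 2x2 system.
  const double u = (-syy * sxt + sxy * syt) / det;
  const double v = ( sxy * sxt - sxx * syt) / det;
  return std::array<double, 2>{u, v};
}
```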

  Further developments of this method include a coarse-to-fine implementation by pyramidal
representation of the image. This has two advantages:

   • We can take into account larger displacements.
   • It can speed up computations.

   This coarse-to-fine implementation seems to be the most used optical flow computation
algorithm in video inpainting solutions. On the other hand, it has not been the object of many
improvements in recent years for optical flow algorithms. Indeed, global methods seem
to have more potential since they do not rely on such simplistic assumptions.


2.2 Global methods
2.2.1   Basic idea
Global methods generally work by favoring small first-order derivatives of the flow field. The
first method to follow this idea is from Horn and Schunck (1981), who consider the optical flow
computation by defining a measure of departure from smoothness in the flow using an L2
norm:

$$E_s^2 = \left(\frac{\partial u}{\partial x}\right)^2 + \left(\frac{\partial u}{\partial y}\right)^2 + \left(\frac{\partial v}{\partial x}\right)^2 + \left(\frac{\partial v}{\partial y}\right)^2 \tag{2.1}$$
  Using Equation 1.3 (denoted $E_c = uE_x + vE_y + E_t$) and Equation 2.1, and introducing a fixed
constant $\alpha^2$ to balance both terms, we can define the total error to be minimized:

$$Err^2 = \iint (\alpha^2 E_s^2 + E_c^2) \, dx \, dy \tag{2.2}$$

We can note that having a high value for $\alpha^2$ will lead to smoother flows. This factor is
application-dependent.

2.2.2   Further improvements
Other approaches include weighting $E_s$ by the magnitude of the gradient ($|\nabla I|$) to give less
importance to areas having a high value for $|\nabla I|$. This is denoted by a function $w$. Indeed,
it is more likely to find flow discontinuities at an edge than elsewhere. The equation to be
minimized then becomes:

$$Err^2 = \iint (\alpha^2 w(|\nabla I|) E_s^2 + E_c^2) \, dx \, dy \tag{2.3}$$

Equation 2.3 can be improved by adopting an anisotropic approach such as the one proposed
by Werlberger et al. (2009).

  Nir et al. (2008) propose to model the flow field with piecewise affine regions and to penalize
departures from this model. This leads to an over-parametrized approach which better recognizes
affine regions. Trobin et al. (2008) take the same idea and simplify it by using second-order
derivatives ($\frac{\partial^2 u}{\partial x^2}$, $\frac{\partial^2 u}{\partial y^2}$, $\frac{\partial^2 u}{\partial x \partial y}$, $\frac{\partial^2 v}{\partial x^2}$, $\frac{\partial^2 v}{\partial y^2}$, $\frac{\partial^2 v}{\partial x \partial y}$) for the prior term.



  Another method which might have a lot of potential is proposed by Brox and Malik (2011). It
considers several energies, including ones given by high-level descriptors such as SIFT. Therefore,
the energy it tries to minimize is:

$$Err^2 = E_{color} + \gamma E_{gradient} + \alpha E_{smooth} + \beta E_{match} + E_{desc} \tag{2.4}$$

with $\gamma$, $\alpha$ and $\beta$ being application-specific. As we can see, this method takes much more
information into account than the simple pixel value used by a basic method such as the one proposed by
Horn and Schunck (1981), with the drawback of being more expensive in terms of computation.
Chapter 3

Implementation of a method

In this chapter, we will first explain which of the methods seen in Chapter 2 could fit our application.
That is, a method which is:
   • Fast, since we need real-time computation.
   • Robust enough to give good results, even in the presence of “holes” (the inpainting zone)
     in the image.
Because it has to be fast, we chose to try a simple method, that is, one with a low computational
cost. In addition to that criterion, we wanted a method which can evolve quite easily.
The approach which seemed to fit our needs best is the one of Horn and Schunck (1981). Indeed,
since it uses the global approach, it is quite easy and straightforward to add constraints
so it can be more precise. Also, it is simple enough to be computed in real time.


3.1 Horn-Schunck’s method
As we explained in section 2.2, the method proposed by Horn and Schunck (1981) is a
straightforward application of the global-methods principle to compute the optical flow. In addition to
the basic equation of the optical flow (Equation 1.3), it adds the measure of departure from
smoothness (Equation 2.1). Equation 2.2 can be seen as a variational problem leading to the following
Euler-Lagrange equations:

$$\frac{\partial(\alpha^2 E_s^2 + E_c^2)}{\partial u} - \frac{\partial}{\partial x}\,\frac{\partial(\alpha^2 E_s^2 + E_c^2)}{\partial u_x} - \frac{\partial}{\partial y}\,\frac{\partial(\alpha^2 E_s^2 + E_c^2)}{\partial u_y} = 0 \tag{3.1}$$

$$\frac{\partial(\alpha^2 E_s^2 + E_c^2)}{\partial v} - \frac{\partial}{\partial x}\,\frac{\partial(\alpha^2 E_s^2 + E_c^2)}{\partial v_x} - \frac{\partial}{\partial y}\,\frac{\partial(\alpha^2 E_s^2 + E_c^2)}{\partial v_y} = 0 \tag{3.2}$$

This leads to the following equations (with $\Delta = \frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2}$, the Laplacian operator):

$$E_x(E_x u + E_y v + E_t) - \alpha^2 \Delta u = 0 \tag{3.3}$$

$$E_y(E_x u + E_y v + E_t) - \alpha^2 \Delta v = 0 \tag{3.4}$$


  Using a finite-difference approximation, we can approximate $\Delta u$ by $\Delta u = \bar{u} - u$, where $\bar{u}$ is the
local average of $u$.
  Therefore, Equation 3.3 and Equation 3.4 become:

$$(\alpha^2 + E_x^2 + E_y^2)(u - \bar{u}) = -E_x(E_x \bar{u} + E_y \bar{v} + E_t) \tag{3.5}$$

$$(\alpha^2 + E_x^2 + E_y^2)(v - \bar{v}) = -E_y(E_x \bar{u} + E_y \bar{v} + E_t) \tag{3.6}$$

  From two consecutive frames, we can compute $E_x$, $E_y$ and $E_t$. Also, we define $\bar{u}$ and $\bar{v}$ to be the
weighted means of $u$ and $v$ over their 8-connectivity neighborhoods. Having those terms, we can find
an iterative solution to Equation 3.5 and Equation 3.6:

$$u^{n+1} = \bar{u}^n - E_x\,\frac{E_x \bar{u}^n + E_y \bar{v}^n + E_t}{\alpha^2 + E_x^2 + E_y^2} \tag{3.7}$$

$$v^{n+1} = \bar{v}^n - E_y\,\frac{E_x \bar{u}^n + E_y \bar{v}^n + E_t}{\alpha^2 + E_x^2 + E_y^2} \tag{3.8}$$

   This iterative scheme is stopped when the mean evolution of $(u, v)$ between two consecutive
iterations is too small. That is:

$$\frac{1}{N_{sites}} \sum_{i,j} \left( (u_{i,j}^{n+1} - u_{i,j}^n)^2 + (v_{i,j}^{n+1} - v_{i,j}^n)^2 \right) < \varepsilon^2 \tag{3.9}$$

  We also give an upper bound on the number of iterations, in order to ensure that the algorithm
terminates.

  We implemented this algorithm using Milena1, leading to the results described in Chapter 4.

  1 http://lrde.epita.fr/cgi-bin/twiki/view/Olena/WebHome
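
For illustration, here is a minimal, self-contained C++ sketch of this iterative scheme. It is not the Milena-based implementation used for the results: image types are plain std::vectors, and a simple 4-neighbour average stands in for the weighted 8-connectivity mean of $\bar{u}$ and $\bar{v}$:

```cpp
#include <cstddef>
#include <vector>

// A gray-level image (or flow component) stored row-major.
struct Img {
  std::size_t h, w;
  std::vector<double> d;
  Img(std::size_t h_ = 0, std::size_t w_ = 0) : h(h_), w(w_), d(h_ * w_, 0.0) {}
  double& operator()(std::size_t i, std::size_t j)       { return d[i * w + j]; }
  double  operator()(std::size_t i, std::size_t j) const { return d[i * w + j]; }
};

// Horn-Schunck iterations (Equations 3.7 and 3.8) from precomputed
// derivatives Ex, Ey, Et. Stops on the criterion of Equation 3.9 or
// after max_iter iterations.
void horn_schunck(const Img& Ex, const Img& Ey, const Img& Et,
                  Img& u, Img& v,
                  double alpha2, double eps, int max_iter)
{
  const std::size_t h = Ex.h, w = Ex.w;
  // Plain 4-neighbour average, standing in for the weighted
  // 8-connectivity mean used in the report.
  auto avg = [&](const Img& f, std::size_t i, std::size_t j) {
    double s = 0; int n = 0;
    if (i > 0)     { s += f(i - 1, j); ++n; }
    if (i + 1 < h) { s += f(i + 1, j); ++n; }
    if (j > 0)     { s += f(i, j - 1); ++n; }
    if (j + 1 < w) { s += f(i, j + 1); ++n; }
    return n ? s / n : 0.0;
  };
  for (int it = 0; it < max_iter; ++it)
  {
    Img nu(h, w), nv(h, w);
    double change = 0;  // accumulates the squared evolution of (u, v)
    for (std::size_t i = 0; i < h; ++i)
      for (std::size_t j = 0; j < w; ++j)
      {
        const double ub = avg(u, i, j), vb = avg(v, i, j);
        const double t = (Ex(i, j) * ub + Ey(i, j) * vb + Et(i, j))
                       / (alpha2 + Ex(i, j) * Ex(i, j) + Ey(i, j) * Ey(i, j));
        nu(i, j) = ub - Ex(i, j) * t;   // Equation 3.7
        nv(i, j) = vb - Ey(i, j) * t;   // Equation 3.8
        change += (nu(i, j) - u(i, j)) * (nu(i, j) - u(i, j))
                + (nv(i, j) - v(i, j)) * (nv(i, j) - v(i, j));
      }
    u = nu; v = nv;
    if (change / double(h * w) < eps * eps)  // Equation 3.9
      break;
  }
}
```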


3.2 Coarse-to-fine version
In section 3.1, we have seen how to implement the method described by Horn and Schunck (1981)
in its original version. In addition, we also implemented a multi-scale version of the algorithm.

3.2.1     Objectives
One of our main objectives, described at the beginning of this chapter, is to have a fast computation of the optical flow.
The Horn-Schunck method being iterative, a classical way to speed up computations is to use a
coarse-to-fine strategy.
   We repeatedly downsample the original images (the two consecutive frames) a certain
number of times to obtain a pyramidal image, as illustrated by Figure 3.1.
   We then run the algorithm on the lowest-resolution image and find our iterative solution. This
flow is then propagated to the image below it, so we can start from this solution instead of
iterating from scratch.
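
Under the same assumptions as the Horn-Schunck sketch of section 3.1 (whose Img type and horn_schunck function are reused here), this coarse-to-fine driver can be sketched as follows. The block-average downsampling and the forward-difference derivatives are deliberately simplistic stand-ins for the pyramid machinery of section 3.2.2:

```cpp
#include <algorithm>
#include <vector>

// Halve an image by 2x2 block averaging -- a crude stand-in for a
// proper low-pass downsampling filter.
Img downsample_by_2(const Img& f)
{
  Img g(f.h / 2, f.w / 2);
  for (std::size_t i = 0; i < g.h; ++i)
    for (std::size_t j = 0; j < g.w; ++j)
      g(i, j) = (f(2 * i, 2 * j)     + f(2 * i + 1, 2 * j)
               + f(2 * i, 2 * j + 1) + f(2 * i + 1, 2 * j + 1)) / 4.0;
  return g;
}

// Propagate a flow component to the next (finer) scale: upsample the
// grid and double the magnitudes, since one coarse pixel spans two
// fine pixels.
Img upsample_flow(const Img& f, std::size_t h, std::size_t w)
{
  Img g(h, w);
  for (std::size_t i = 0; i < h; ++i)
    for (std::size_t j = 0; j < w; ++j)
      g(i, j) = 2.0 * f(std::min(i / 2, f.h - 1), std::min(j / 2, f.w - 1));
  return g;
}

// Forward-difference estimates of Ex, Ey, Et from two frames.
struct Derivs { Img Ex, Ey, Et; };
Derivs derivatives(const Img& a, const Img& b)
{
  Derivs d{Img(a.h, a.w), Img(a.h, a.w), Img(a.h, a.w)};
  for (std::size_t i = 0; i + 1 < a.h; ++i)
    for (std::size_t j = 0; j + 1 < a.w; ++j)
    {
      d.Ex(i, j) = a(i, j + 1) - a(i, j);
      d.Ey(i, j) = a(i + 1, j) - a(i, j);
      d.Et(i, j) = b(i, j) - a(i, j);
    }
  return d;
}

// Coarse-to-fine Horn-Schunck: solve at the coarsest scale, then
// refine scale by scale, starting from the propagated flow rather
// than from zero.
void horn_schunck_multiscale(const Img& frame1, const Img& frame2,
                             Img& u, Img& v, int n_scales,
                             double alpha2, double eps, int max_iter)
{
  std::vector<Img> p1{frame1}, p2{frame2};
  for (int s = 1; s < n_scales; ++s)
  {
    p1.push_back(downsample_by_2(p1.back()));
    p2.push_back(downsample_by_2(p2.back()));
  }
  u = Img(p1.back().h, p1.back().w);   // zero flow at the coarsest scale
  v = Img(p1.back().h, p1.back().w);
  for (int s = n_scales - 1; s >= 0; --s)
  {
    if (u.h != p1[s].h || u.w != p1[s].w)
    {
      u = upsample_flow(u, p1[s].h, p1[s].w);
      v = upsample_flow(v, p1[s].h, p1[s].w);
    }
    const Derivs d = derivatives(p1[s], p2[s]);
    horn_schunck(d.Ex, d.Ey, d.Et, u, v, alpha2, eps, max_iter);
  }
}
```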

   Another advantage of making this algorithm multiscale is that we also want to recover large
displacements. The original method from Horn and Schunck (1981) is likely to converge to a
local minimum, leading to a wrong solution. With a multiscale approach, according to Fleet and
Weiss (2005), we avoid those local minima by guiding our convergence towards a global minimum
from scale to scale.

Figure 3.1: A pyramid of images. We start by computing the optical flow at the coarsest scale,
then we propagate the results to the next scale before running the algorithm again. We keep
doing so until we have reached the finest scale.

  So there are two main objectives that can be achieved by going multiscale:

      • Our solution is faster to compute.
      • We can recover large displacements by avoiding local minima.

3.2.2    A generic way to implement pyramidal representation
Using the genericity model provided by C++ and Olena, we can implement a pyramidal image
that works for any source or destination value type. Therefore, we provide a simple class
where the images can be accessed and modified before being downsampled or propagated.
  Thanks to this genericity model, coarse-to-fine versions of classical iterative algorithms can
be implemented easily.
  More specifically, having a generic pyramidal image is very useful in our case, since most of
the methods using the global approach provide an iterative solution.
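
To give an idea of the design (this is a hypothetical sketch, not Olena's actual interface), such a pyramidal image could look like the following class template, generic on the value type V, with the downsampling policy supplied by the user so the same class serves both the frames and the flow fields:

```cpp
#include <functional>
#include <utility>
#include <vector>

// A pyramid of 2D images, generic on the value type V only.
template <typename V>
class pyramid
{
public:
  using image = std::vector<std::vector<V>>;

  // Build n_scales levels from the finest image, using the
  // user-supplied downsampling function.
  pyramid(image finest, int n_scales,
          std::function<image(const image&)> downsample)
  {
    levels_.push_back(std::move(finest));
    for (int s = 1; s < n_scales; ++s)
      levels_.push_back(downsample(levels_.back()));
  }

  // Scale 0 is the finest. Levels can be read and modified in place,
  // e.g. to propagate a flow field from a coarse scale to a finer one.
  image&       at(int scale)       { return levels_[scale]; }
  const image& at(int scale) const { return levels_[scale]; }
  int n_scales() const { return static_cast<int>(levels_.size()); }

private:
  std::vector<image> levels_;
};
```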

 Since we only have genericity on value types, this generic model could be improved in two
ways:

      • We could add genericity on structures (for now, it only works for 2D images).
      • We could even embed the algorithm inside the structure, so one would only have to give
        a functor to the class.

These two points were not useful for our application, but implementing them could be an
improvement in future work.
Chapter 4

Results

4.1    Methodology
4.1.1 Representation of the optical flow
In our algorithms, the optical flow is stored as an image of 2D vectors. Such an image cannot
be displayed easily and needs its own storage format. We used the ideas provided by
Baker et al. (2011).

The evaluation database   Baker et al. (2011) provide a database of consecutive frames with
associated ground truth on which we can benchmark the results found with our two versions of
Horn-Schunck. We used the 8 sequences of images with public ground truth.


The .flo format   We also used their format in order to store our results. This has three
advantages:
   • We store our results in the same format as the ground truth.
   • We use the very same format as many other papers, which allows us to compare results.
   • We can submit our results in order to be evaluated by the authors.
This format is described by Baker1 and has been implemented in Olena.
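
As an illustration, a minimal writer for this format could look as follows. The layout (a 202021.25 float sanity tag, which reads “PIEH” in ASCII, then the width and height as 32-bit integers, then row-major interleaved (u, v) float pairs, all little-endian) follows the README referenced above; this sketch is not the Olena implementation:

```cpp
#include <cstdint>
#include <fstream>
#include <string>
#include <vector>

// Write a flow field to the Middlebury .flo format. Assumes a
// little-endian platform, as the format itself is little-endian.
bool write_flo(const std::string& path,
               std::int32_t width, std::int32_t height,
               const std::vector<float>& u, const std::vector<float>& v)
{
  std::ofstream out(path, std::ios::binary);
  if (!out)
    return false;
  const float tag = 202021.25f;  // sanity check, "PIEH" in ASCII
  out.write(reinterpret_cast<const char*>(&tag), sizeof tag);
  out.write(reinterpret_cast<const char*>(&width), sizeof width);
  out.write(reinterpret_cast<const char*>(&height), sizeof height);
  for (std::int32_t i = 0; i < height; ++i)
    for (std::int32_t j = 0; j < width; ++j)
    {
      const float uv[2] = {u[i * width + j], v[i * width + j]};
      out.write(reinterpret_cast<const char*>(uv), sizeof uv);
    }
  return static_cast<bool>(out);
}
```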


Image visualization   Baker et al. (2011) introduce a new way of visualizing the optical flow in
a classical 2D color image. Most of the available optical flow visualization algorithms use the
way colors are encoded (RGB, CIE...) in order to define a mapping from a vector to a color. This
can give a wrong perception of distances, since the natural distance between two colors is not
related to how they are encoded. Based on this observation, they use the work of Savard2 in
order to map the vectors to colors based on actual human perception.
  1 http://vision.middlebury.edu/flow/code/flow-code/README.txt
  2 http://members.shaw.ca/quadibloc/other/colint.htm




             Figure 4.1: The map from vectors to colors.




     (a) First frame                             (b) Second frame

                  Figure 4.2: Two consecutive frames




Figure 4.3: A sample computed optical flow for the frames of Figure 4.2, with flow vectors
mapped to the color wheel.


  Figure 4.1 shows the actual map from vectors to colors, and Figure 4.3 shows a sample
result of this visualization method.

  In Olena, this is a simple fun::v2v that can be applied with data::transform.
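
For comparison, a much simpler (and perceptually less uniform) vector-to-color mapping, often seen in practice, encodes the flow direction as a hue and the magnitude as saturation. The sketch below is only this naive alternative, not the perception-based map of Figure 4.1:

```cpp
#include <algorithm>
#include <cmath>

struct Rgb { unsigned char r, g, b; };

// Naive flow visualization: direction -> hue, magnitude -> saturation
// (clamped by max_norm), value fixed to 1 so zero motion maps to white.
Rgb flow_to_color(double u, double v, double max_norm)
{
  const double pi = std::acos(-1.0);
  const double hue = (std::atan2(v, u) + pi) / (2.0 * pi) * 360.0; // [0, 360]
  const double sat = std::min(std::sqrt(u * u + v * v) / max_norm, 1.0);
  // Standard HSV -> RGB conversion with V = 1.
  const double c = sat;
  const double x = c * (1.0 - std::fabs(std::fmod(hue / 60.0, 2.0) - 1.0));
  double r = 0, g = 0, b = 0;
  if      (hue <  60) { r = c; g = x; }
  else if (hue < 120) { r = x; g = c; }
  else if (hue < 180) { g = c; b = x; }
  else if (hue < 240) { g = x; b = c; }
  else if (hue < 300) { r = x; b = c; }
  else                { r = c; b = x; }
  const double m = 1.0 - c;
  return Rgb{static_cast<unsigned char>((r + m) * 255.0),
             static_cast<unsigned char>((g + m) * 255.0),
             static_cast<unsigned char>((b + m) * 255.0)};
}
```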

4.1.2 Measures
There are two commonly used measures for the evaluation of an optical flow algorithm. The first
of them is the angular error; the second, the error in flow endpoint.

Angular error (AE)   The angular error has been introduced by Fleet and Jepson (1990) and is
defined as the arccosine of the dot product of the two (space-time) vectors over the product of their lengths:

$$AE = \cos^{-1}\left(\frac{1 + u \, u_{GT} + v \, v_{GT}}{\sqrt{1 + u^2 + v^2}\,\sqrt{1 + u_{GT}^2 + v_{GT}^2}}\right) \tag{4.1}$$



Error in flow endpoint (EE)   We also compute the error in flow endpoint defined by Otte and
Nagel (1994) as:

$$EE = \sqrt{(u - u_{GT})^2 + (v - v_{GT})^2} \tag{4.2}$$

Baker et al. (2011) argue that this second measure should be favored, since AE penalizes
errors in regions of zero motion more than errors in smooth non-zero motion regions.

  In the following, we report both.
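
Both measures are straightforward to compute per pixel. Here is a standalone sketch of Equations 4.1 and 4.2, with AE converted to degrees (presumably the unit of the tables below):

```cpp
#include <algorithm>
#include <cmath>

// Angular error (Equation 4.1) between an estimate (u, v) and the
// ground truth (ugt, vgt), in degrees.
double angular_error(double u, double v, double ugt, double vgt)
{
  const double num = 1.0 + u * ugt + v * vgt;
  const double den = std::sqrt(1.0 + u * u + v * v)
                   * std::sqrt(1.0 + ugt * ugt + vgt * vgt);
  // Clamp against rounding before taking the arccosine.
  const double c = std::max(-1.0, std::min(1.0, num / den));
  return std::acos(c) * 180.0 / std::acos(-1.0);
}

// Endpoint error (Equation 4.2).
double endpoint_error(double u, double v, double ugt, double vgt)
{
  return std::sqrt((u - ugt) * (u - ugt) + (v - vgt) * (v - vgt));
}
```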


    (α, ε)        (7, 0.1)     (7, 0.01)   (7, 0.001)    (10, 0.1)   (10, 0.01)   (10, 0.001)   (100, 0.1)   (100, 0.01)   (100, 0.001)
  Dimetrodon     (1.94, 57)   (1.73, 48)   (1.58, 42)   (1.96, 59)   (1.78, 51)    (1.56, 42)   (2.06, 63)    (2.04, 63)   (1.84, 53.6)
    Grove2       (3.02, 66)   (2.98, 63)   (2.95, 62)   (2.98, 66)   (2.91, 63)    (2.88, 61)   (3.08, 71)    (3.02, 68)    (2.78, 57)
    Grove3       (3.78, 59)   (3.70, 55)   (3.68, 53)   (3.72, 59)   (3.61, 54)    (3.57, 52)   (3.90, 69)    (3.77, 63)    (3.48, 50)
  Hydrangea      (3.48, 63)   (3.27, 56)   (3.17, 52)   (3.52, 66)   (3.27, 56)    (3.13, 51)   (3.73, 74)    (3.69, 72)    (3.40, 61)
 RubberWhale     (0.97, 37)   (0.53, 18)   (0.38, 12)   (1.04, 40)   (0.59, 20)    (0.39, 13)   (1.25, 50)    (1.23, 49)    (0.98, 36)
    Urban2       (8.22, 61)   (7.98, 53)   (7.90, 50)   (8.26, 63)   (8.02, 54)    (7.89, 50)   (8.39, 69)    (8.38, 69)    (8.26, 62)
    Urban3       (7.16, 71)   (7.00, 65)   (6.92, 62)   (7.18, 73)   (6.99, 66)    (6.88, 61)   (7.30, 79)    (7.29, 78)    (7.13, 70)
    Venus        (3.70, 63)   (3.57, 57)   (3.51, 55)   (3.69, 65)   (3.54, 58)    (3.45, 54)   (3.80, 71)    (3.74, 68)    (3.52, 58)
 Mean Time (s)      2.10         13.22       47.00         1.90        12.68         53.64         0.40          3.98         70.18


Table 4.1: The results in the form (Endpoint Error, Angular Error) compared to the ground truth
for the original version
    (α, ε)        (7, 0.1)     (7, 0.01)   (7, 0.001)    (10, 0.1)   (10, 0.01)   (10, 0.001)   (100, 0.1)   (100, 0.01)   (100, 0.001)
  Dimetrodon     (1.68, 45)   (1.62, 43)   (1.56, 41)   (1.67, 46)   (1.62, 43)    (1.53, 40)   (1.74, 47)    (1.73, 47)    (1.60, 42)
    Grove2       (2.77, 55)   (2.90, 60)   (2.92, 61)   (2.65, 51)   (2.80, 58)    (2.84, 60)   (2.53, 43)    (2.51, 43)    (2.47, 43)
    Grove3       (3.56, 49)   (3.64, 52)   (3.66, 52)   (3.43, 47)   (3.52, 50)    (3.54, 51)   (3.26, 40)    (3.25, 39)    (3.24, 39)
  Hydrangea      (2.92, 42)   (3.04, 47)   (3.13, 51)   (2.86, 39)   (2.94, 43)    (3.05, 48)   (2.88, 38)    (2.87, 37)    (2.74, 34)
 RubberWhale     (0.82, 29)   (0.51, 17)   (0.37, 12)   (0.91, 33)   (0.55, 19)    (0.38, 12)   (1.08, 41)    (1.08, 41)    (0.84, 29)
    Urban2       (7.86, 49)   (7.89, 50)   (7.89, 50)   (7.83, 47)   (7.86, 49)    (7.86, 49)   (7.78, 44)    (7.80, 43)    (7.85, 44)
    Urban3       (6.73, 54)   (6.81, 58)   (6.84, 58)   (6.69, 53)   (6.74, 56)    (6.76, 57)   (6.68, 50)    (6.68, 50)    (6.67, 49)
    Venus        (3.25, 45)   (3.39, 49)   (3.48, 53)   (3.19, 43)   (3.28, 47)    (3.39, 51)   (3.14, 41)    (3.13, 40)    (3.09, 39)
 Mean Time (s)      2.23         12.23       49.64         2.02        11.28         53.85         0.67          2.93         58.32


Table 4.2: The results in the form (Endpoint Error, Angular Error) compared to the ground truth
for our multi-layer version


4.2 Results
For each of the 8 sequences, we compute 9 outputs by varying two parameters:
      • α takes the values 7, 10 and 100;
      • ε takes the values 0.1, 0.01 and 0.001.


Tables 4.1 and 4.2 show the full results, and Figures 4.4, 4.5 and 4.6 show detailed results for
Dimetrodon.


4.3 Analysis of the results
With the results on the 8 sequences with 9 parameter settings shown in Tables 4.1 and 4.2, we can make
several observations:
      • The results are always better with the multiscale version of Horn and Schunck (1981).
      • Sometimes, the single-layer version is faster.
      • Only one parameter setting among the 9 introduced could be used in our real-time
        application: α = 100 and ε = 0.1.
      • Both error measures give the same ordering and seem equivalent.


As expected in section 3.2, the results are better with the multiscale version of the algorithm.
  On the other hand, the fact that this second version is sometimes slower was not expected. We
can see two reasons for that to happen:




                        Figure 4.4: The Angular Errors for Dimetrodon.




                       Figure 4.5: The Endpoint Errors for Dimetrodon.




                        Figure 4.6: The computing times for Dimetrodon.


      • The time taken for the initialization of the pyramid is long. Indeed, we have to
        downsample the images.

      • The number of scales might be application-dependent and may need to be tuned
        automatically.

   Finally, the proposed setting (α = 100, ε = 0.1) with seven scales gives fairly good results for
a small cost in terms of time. This could be used in an inpainting application.
Conclusion

Optical flow algorithms have been the subject of many studies during the past years, leading
to many improvements, as seen in Chapter 2. We have chosen to implement one of the simplest
methods, Horn and Schunck (1981), and to improve it with a coarse-to-fine version.
   We showed how this method could be suitable for a real-time application with appropriate
parameters, and how a multi-scale version of it could improve the qualitative results by giving
fewer errors than the original version.


Future work   Even though this method could be good enough as a prior step to video inpainting,
we still have some leads to explore, including:

   • Reducing the domain on which we compute the optical flow. Indeed, since we want to
     use it for video inpainting, we do not need to compute it for areas too far from the region
     to inpaint.
   • Improving the performance of the pyramidal algorithm so it could be faster. We should
     also try to be as generic as possible so it would be reusable in other projects.

   • Since the proposed method is fast enough, improving its qualitative performance by
     adopting a more sophisticated approach as described in Chapter 2.
Chapter 5

Bibliography

Baker, S., Scharstein, D., Lewis, J. P., Roth, S., Black, M. J., and Szeliski, R. (2011). A database
and evaluation methodology for optical flow. International Journal of Computer Vision.

Brox, T. and Malik, J. (2011). Large displacement optical flow: Descriptor matching in variational
motion estimation. IEEE Trans. Pattern Anal. Mach. Intell., 33(3):500–513.

Fleet, D. J. and Jepson, A. D. (1990). Computation of component image velocity from local
phase information. International Journal of Computer Vision, 5:77–104. doi:10.1007/BF00056772.

Fleet, D. J. and Weiss, Y. (2005). Optical flow estimation. In Mathematical Models for Computer
Vision: The Handbook.

Horn, B. K. P. and Schunck, B. G. (1981). Determining optical flow. Artificial Intelligence,
17:185–203.

Levi, C. (2012). Fast structure preserving inpainting.

Lucas, B. D. and Kanade, T. (1981). An iterative image registration technique with an application
to stereo vision. In Proceedings of the 7th International Joint Conference on Artificial Intelligence
- Volume 2, pages 674–679, San Francisco, CA, USA. Morgan Kaufmann Publishers Inc.

Nir, T., Bruckstein, A. M., and Kimmel, R. (2008). Over-parameterized variational optical flow.
Int. J. Comput. Vision, 76(2):205–216.

Otte, M. and Nagel, H. (1994). Optical flow estimation: Advances and comparisons. In Eklundh,
J.-O., editor, Computer Vision - ECCV ’94, volume 800 of Lecture Notes in Computer Science,
pages 49–60. Springer Berlin / Heidelberg.

Trobin, W., Pock, T., Cremers, D., and Bischof, H. (2008). An unbiased second-order prior
for high-accuracy motion estimation. In Rigoll, G., editor, Pattern Recognition, volume 5096 of
Lecture Notes in Computer Science, pages 396–405. Springer Berlin / Heidelberg.

Werlberger, M., Trobin, W., Pock, T., Wedel, A., Cremers, D., and Bischof, H. (2009). Anisotropic
Huber-L1 optical flow. In Proceedings of the British Machine Vision Conference (BMVC), London,
UK.

More Related Content

What's hot

Difference between Vector Quantization and Scalar Quantization
Difference between Vector Quantization and Scalar QuantizationDifference between Vector Quantization and Scalar Quantization
Difference between Vector Quantization and Scalar Quantization
HimanshuSirohi6
 
Face Detection techniques
Face Detection techniquesFace Detection techniques
Face Detection techniques
Abhineet Bhamra
 

What's hot (20)

I. Hill climbing algorithm II. Steepest hill climbing algorithm
I. Hill climbing algorithm II. Steepest hill climbing algorithmI. Hill climbing algorithm II. Steepest hill climbing algorithm
I. Hill climbing algorithm II. Steepest hill climbing algorithm
 
Deep Reinforcement Learning and Its Applications
Deep Reinforcement Learning and Its ApplicationsDeep Reinforcement Learning and Its Applications
Deep Reinforcement Learning and Its Applications
 
Deep Learning
Deep LearningDeep Learning
Deep Learning
 
Training Neural Networks
Training Neural NetworksTraining Neural Networks
Training Neural Networks
 
YOLOv4: optimal speed and accuracy of object detection review
YOLOv4: optimal speed and accuracy of object detection reviewYOLOv4: optimal speed and accuracy of object detection review
YOLOv4: optimal speed and accuracy of object detection review
 
Handwritten Character Recognition: A Comprehensive Review on Geometrical Anal...
Handwritten Character Recognition: A Comprehensive Review on Geometrical Anal...Handwritten Character Recognition: A Comprehensive Review on Geometrical Anal...
Handwritten Character Recognition: A Comprehensive Review on Geometrical Anal...
 
Introduction to OpenCV
Introduction to OpenCVIntroduction to OpenCV
Introduction to OpenCV
 
Activation functions and Training Algorithms for Deep Neural network
Activation functions and Training Algorithms for Deep Neural networkActivation functions and Training Algorithms for Deep Neural network
Activation functions and Training Algorithms for Deep Neural network
 
Difference between Vector Quantization and Scalar Quantization
Difference between Vector Quantization and Scalar QuantizationDifference between Vector Quantization and Scalar Quantization
Difference between Vector Quantization and Scalar Quantization
 
Local search algorithm
Local search algorithmLocal search algorithm
Local search algorithm
 
6 games
6 games6 games
6 games
 
Deep learning based object detection basics
Deep learning based object detection basicsDeep learning based object detection basics
Deep learning based object detection basics
 
Genetic algorithms in Data Mining
Genetic algorithms in Data MiningGenetic algorithms in Data Mining
Genetic algorithms in Data Mining
 
Image Quantization
Image QuantizationImage Quantization
Image Quantization
 
Long Short Term Memory
Long Short Term MemoryLong Short Term Memory
Long Short Term Memory
 
Introduction to Deep learning
Introduction to Deep learningIntroduction to Deep learning
Introduction to Deep learning
 
Face Detection techniques
Face Detection techniquesFace Detection techniques
Face Detection techniques
 
Image segmentation with deep learning
Image segmentation with deep learningImage segmentation with deep learning
Image segmentation with deep learning
 
Understanding RNN and LSTM
Understanding RNN and LSTMUnderstanding RNN and LSTM
Understanding RNN and LSTM
 
Deep Reinforcement Learning
Deep Reinforcement LearningDeep Reinforcement Learning
Deep Reinforcement Learning
 

Viewers also liked

Video Inpainting detection using inconsistencies in optical Flow
Video Inpainting detection using inconsistencies in optical FlowVideo Inpainting detection using inconsistencies in optical Flow
Video Inpainting detection using inconsistencies in optical Flow
Cybersecurity Education and Research Centre
 
Action Recognition (Thesis presentation)
Action Recognition (Thesis presentation)Action Recognition (Thesis presentation)
Action Recognition (Thesis presentation)
nikhilus85
 

Viewers also liked (8)

Optical Flow на GPU
Optical Flow на GPUOptical Flow на GPU
Optical Flow на GPU
 
"Embedded Lucas-Kanade Tracking: How it Works, How to Implement It, and How t...
"Embedded Lucas-Kanade Tracking: How it Works, How to Implement It, and How t..."Embedded Lucas-Kanade Tracking: How it Works, How to Implement It, and How t...
"Embedded Lucas-Kanade Tracking: How it Works, How to Implement It, and How t...
 
Introduction to Optial Flow
Introduction to Optial FlowIntroduction to Optial Flow
Introduction to Optial Flow
 
Video Inpainting detection using inconsistencies in optical Flow
Video Inpainting detection using inconsistencies in optical FlowVideo Inpainting detection using inconsistencies in optical Flow
Video Inpainting detection using inconsistencies in optical Flow
 
MOTION FLOW
MOTION FLOWMOTION FLOW
MOTION FLOW
 
Action Recognition (Thesis presentation)
Action Recognition (Thesis presentation)Action Recognition (Thesis presentation)
Action Recognition (Thesis presentation)
 
Moving object detection
Moving object detectionMoving object detection
Moving object detection
 
Object tracking
Object trackingObject tracking
Object tracking
 

Similar to Improving Horn and Schunck’s Optical Flow Algorithm

The Dissertation
The DissertationThe Dissertation
The Dissertation
phooji
 
Master thesis xavier pererz sala
Master thesis  xavier pererz salaMaster thesis  xavier pererz sala
Master thesis xavier pererz sala
pansuriya
 
Hub location models in public transport planning
Hub location models in public transport planningHub location models in public transport planning
Hub location models in public transport planning
sanazshn
 
Climb – Chaining Operators - Report
Climb – Chaining Operators - ReportClimb – Chaining Operators - Report
Climb – Chaining Operators - Report
Christopher Chedeau
 
Location In Wsn
Location In WsnLocation In Wsn
Location In Wsn
netfet
 
Introduction to the Finite Element Method
Introduction to the Finite Element MethodIntroduction to the Finite Element Method
Introduction to the Finite Element Method
Mohammad Tawfik
 
Robust link adaptation in HSPA Evolved
Robust link adaptation in HSPA EvolvedRobust link adaptation in HSPA Evolved
Robust link adaptation in HSPA Evolved
Daniel Göker
 

Similar to Improving Horn and Schunck’s Optical Flow Algorithm (20)

The Dissertation
The DissertationThe Dissertation
The Dissertation
 
Master thesis xavier pererz sala
Master thesis  xavier pererz salaMaster thesis  xavier pererz sala
Master thesis xavier pererz sala
 
Inkscape industrial project report
Inkscape industrial project reportInkscape industrial project report
Inkscape industrial project report
 
Climb - A Generic and Dynamic Approach to Image Processing
Climb - A Generic and Dynamic Approach to Image ProcessingClimb - A Generic and Dynamic Approach to Image Processing
Climb - A Generic and Dynamic Approach to Image Processing
 
Hub location models in public transport planning
Hub location models in public transport planningHub location models in public transport planning
Hub location models in public transport planning
 
Uml (grasp)
Uml (grasp)Uml (grasp)
Uml (grasp)
 
Notes econometricswithr
Notes econometricswithrNotes econometricswithr
Notes econometricswithr
 
Climb – Chaining Operators - Report
Climb – Chaining Operators - ReportClimb – Chaining Operators - Report
Climb – Chaining Operators - Report
 
Location In Wsn
Location In WsnLocation In Wsn
Location In Wsn
 
Home automation
Home automationHome automation
Home automation
 
Introduction to the Finite Element Method
Introduction to the Finite Element MethodIntroduction to the Finite Element Method
Introduction to the Finite Element Method
 
Solving the XP Legacy Problem with (Extreme) Meta-Programming
Solving the XP Legacy Problem with (Extreme) Meta-ProgrammingSolving the XP Legacy Problem with (Extreme) Meta-Programming
Solving the XP Legacy Problem with (Extreme) Meta-Programming
 
Robust link adaptation in HSPA Evolved
Robust link adaptation in HSPA EvolvedRobust link adaptation in HSPA Evolved
Robust link adaptation in HSPA Evolved
 
2010 RDF credit Risk
2010 RDF credit Risk2010 RDF credit Risk
2010 RDF credit Risk
 
test5
test5test5
test5
 
Sdd 2
Sdd 2Sdd 2
Sdd 2
 
test6
test6test6
test6
 
test4
test4test4
test4
 
test5
test5test5
test5
 
test6
test6test6
test6
 

Improving Horn and Schunck’s Optical Flow Algorithm

  • 1. Improving Horn and Schunck’s Optical Flow Algorithm Sylvain Lobry Technical Report no 1202, JULY 2012 revision 2325 Computing the optical flow of a video has many applications including motion estimation, or as a first step towards video inpainting. The optical flow equation can not be solved as is due to the aperture problem (two unknowns for one single equation). One set of algorithms tries to solve this equation based on a global strategy, that is, minimizing variations in the flow. The first of them is Horn-Schunck’s method. This algorithm, even though it globally works, has several drawbacks, including being slow and unable to find large displacements. In order to solve those problems, many strategies have been developed. We present these strategies and analyze the benefits of a multi-layer strategy applied to Horn-Schunck. Calculer le flux optique a de nombreuses applications telles que l’estimation de mouvement ou un pre- mier pas vers l’inpainting vidéo. L’équation du flux optique ne peut pas être résolue telle quelle à cause du problème de fenêtrage (deux inconnues pour une seule équation). Un ensemble d’algorithmes essaye de résoudre cette équation en se basant sur une stratégie globale, c’est-à-dire en essayant d’avoir des pe- tites variations dans le flux. Le premier d’entre eux est l’algorithme d’Horn-Schunck. Celui-ci, même si il marche dans de nombreux cas, a plusieurs inconvénients, notamment celui d’être lent et de ne pas pou- voir trouver les grands déplacements. Dans le but de résoudre ces problèmes, plusieurs stratégies ont été développées. Dans ce rapport, nous présenterons les méthodes et analyserons les bénéfices apportés par une stratégie multi-échelle à l’algorithme d’Horn-Schunck. Keywords Optical flow, Horn-Schunck’s algorithm, multi-layer approach, multiscale Laboratoire de Recherche et Développement de l’Epita 14-16, rue Voltaire – F-94276 Le Kremlin-Bicêtre cedex – France Tél. +33 1 53 14 59 47 – Fax. +33 1 53 14 59 22 lrde@lrde.epita.fr – http://www.lrde.epita.fr/
  • 2. 2 Copying this document Copyright c 2012 LRDE. Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation; with the Invariant Sections being just “Copying this document”, no Front- Cover Texts, and no Back-Cover Texts. A copy of the license is provided in the file COPYING.DOC.
  • 3. Contents 1 Definition of the optical flow 6 1.1 Basic equations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6 1.2 The aperture problem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6 2 Overview of existing methods 8 2.1 Local methods . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8 2.2 Global methods . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9 2.2.1 Basic idea . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9 2.2.2 Further improvements . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9 3 Implementation of a method 10 3.1 Horn-Schunck’s method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10 3.2 Coarse-to-fine version . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11 3.2.1 Objectives . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11 3.2.2 A generic way to implement pyramidal representation . . . . . . . . . . . 13 4 Results 14 4.1 Methodology . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14 4.1.1 Representation of the optical flow . . . . . . . . . . . . . . . . . . . . . . . 14 4.1.2 Measures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16 4.2 Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17 4.3 Analysis of the results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17 5 Bibliography 21
  • 4. Acknowledgments The author thanks Thierry Géraud, Guillaume Lazzara, Coddy Levi and Roland Levillain.
  • 5. Introduction The optical flow is defined as the apparent motion of brightness patterns in an image. Comput- ing it leads to many applications, including video compression and motion estimation. It is also the first step of many video inpainting algorithms which is the reason why we want to work on it. The LRDE1 has taken part into the “Terra Rush” project which has been chosen to be one of the 18 projects from the “Investissement d’avenir”2 program sponsored by the Government of France. In this project, one of our goal is the to remove subtitles embeded in videos. This can be done using video inpainting, a subject developed by Levi (2012). Having the optical flow com- puted over the image to be inpainted can be useful for an inpainting algorithm if well exploited. Even though the first attempts of computing the optical flow has been made in 1981 by Lucas and Kanade (1981), the subject is the object of many recent studies leading to several improve- ments. This report aims to give an overview of the state of the art in this domain and to provide a fast implementation of one of these methods. In this technical report, we will first give a formal definition of the problem. The existing methods are then overviewed and classified. We review one of those algorithms in its basic ver- sion and in a coarse-to-fine adaptation. Finally an evaluation method is proposed and applied to our implementation. 1 http://lrde.epita.fr 2 http://investissement-avenir.gouvernement.fr/
  • 6. Chapter 1 Definition of the optical flow The first definition of the optical flow given in the introduction can be easily formalized into mathematical equations. Yet, we will see in this chapter the reasons that make this problem ill-posed. 1.1 Basic equations In the following, we will only handle the 2D case with 1D values (i.e. gray-level images). Let I(x, y, t) be the value of the (x, y) pixel at frame t and (u(x, y, t), v(x, y, t)), also denoted (u, v), the flow at the (x, y) pixel at frame t. The main idea of most algorithms for computing optical flow is to use the Brightness Con- stancy: from a frame to an other, the brightness of a pixel does not change according to the flow. This can be written as : I(x, y, t) = I(x + u, y + v, t + 1) (1.1) Equation 1.1 can be approximated with a Taylor series as: ∂I ∂I ∂I I(x, y, t) ≈ I(x, y, t) + u +v + (1.2) ∂x ∂y ∂t ∂I ∂I ∂I Introducing the abbreviations Ex = ∂x , Ey = ∂y and Et = ∂t , the following holds : uEx + vEy + Et ≈ 0 (1.3) 1.2 The aperture problem Equation 1.3 is often denoted as the Optical Flow Contraint. As we can see, we have a single equation with two unknowns. This leads to an ill-posed problem which is often called the Aper- ture Problem as illustrated by Figure 1.1. To compute the optical flow from Equation 1.3, we must add additional constraints to obtain a system with more equations. As illustrated by Figure 1.1 , the optical flow cannot always be
  • 7. 7 Definition of the optical flow (a) First frame. (b) Second frame. Figure 1.1: The Aperture Problem : we can not tell in which way the line is moving with only those 2 frames. determined by a human being for certain. However, most of the time, humans may have a good intuition of the optical flow in a real scene by using additional information such as the context. Finding the constraints that are the most likely replicating this behaviour is the main chal- lenge handled by articles about optical flow. chapter 2 presents some of these methods.
  • 8. Chapter 2 Overview of existing methods Methods for computing the optical flow can be classified in two categories. The first one is known as local methods whereas the other takes a global approach to the problem. 2.1 Local methods Local methods are based on the approximation that the flow does not vary inside small regions of an image. The first method using this assumption is from Lucas and Kanade (1981). Following this simple idea, we can obtain the following equations, given a window of size n × n and by denoting pk the k th pixel within this window: Ex (p1 )u + Ey (p1 )v + Et (p1 ) = 0 Ex (p2 )u + Ey (p2 )v + Et (p2 ) = 0 ... Ex (pn2 )u + Ey (pn2 )v + Et (pn2 ) = 0 Thus, this assumption allows us to have n2 equations for 2 unknowns, giving a simple system to solve. Further developments of this method includes a coarse-to-fine implementation by pyramidal representation of the image. This have two advantages : • We can take into account larger displacements. • It can speed-up computations. This coarse-to-fine implementations seem to be the most used optical flow computation al- gorithm into video inpainting solutions. On the opposite, it has not been the object of many improvements during the recent years for optical flow algorithms. Indeed global methods seem to have more potential since they do not rely on simplistic assumptions.
  • 9. 9 Overview of existing methods 2.2 Global methods 2.2.1 Basic idea Global methods generally work by favoring small first-order derivates of the flow field. The first method to follow this idea is from Horn and Schunck (1981) who consider the optical flow computation by defining the measure of departure from smoothness in the flow using an L2 norm : 2 2 2 2 2 ∂u ∂u ∂v ∂v Es = + + + (2.1) ∂x ∂y ∂x ∂y Using Equation 1.3 (denoted Ec = uEx + vEy + Et ) and Equation 2.1 and introducing a fixed constant α2 to balance both terms, we can define the total error to be minimized : Err2 = (α2 Es + Ec ) dx dy 2 2 (2.2) We can note that having a high value for α2 will lead to smoother flows. This factor is application- dependent. 2.2.2 Further improvements Other approaches include weighting Es along the magnitude of the gradient (| I|) to give less importance to areas having a high value for | I|. This is denoted by a function w. Indeed, it is more likely to find flow discontinuities at an edge than elsewhere. The equation to be minimized then becomes : Err2 = (α2 w(| I|)Es + Ec ) dx dy 2 2 (2.3) Equation 2.3 can be improved by adopting an anisotropic approach such as the one proposed by Werlberger et al. (2009). Nir et al. (2008) proposes to model the flow field with piecewise affine regions and to penal- ize departure from it. This leads us to an over-parametrized approach which recognizes better affine regions. Trobin et al. (2008) takes the same idea and simplify it by using second order 2 2 ∂2u ∂2 ∂2 ∂2v derivates ( ∂ u , ∂ u , ∂x∂y , ∂xv , ∂yv , ∂x∂y ) for the prior term. ∂x2 ∂y 2 2 2 Another method which might have a lot of potential is proposed by Brox and Malik (2011). It considers several energies including ones given by high-level descriptors such as SIFT. There- fore, the energy it tries to minimize is : Err2 = Ecolor + γEgradient + αEsmooth + βEmatch + Edesc (2.4) with γ, α and β being application-specific. As we can see, this method takes into account many information in addition to the simple pixel value for a basic method such as the one proposed by Horn and Schunck (1981), with the drawback of being more expensive in terms of computation.
  • 10. Chapter 3 Implementation of a method In this chapter, we will first explain which methods seen in chapter 2 could fit our application. That is a method which is : • Fast, since we need real-time computation. • Robust enough to give good results, even in the presence of “holes” (the inpainting zone) in the image. Because it has to be fast, we made the choice to try a simple method, that is, with a low compu- tational effort. In addition to that criterion, we wanted a method which can evolve quite easily. The approach which seemed to fit the best our needs is the one Horn and Schunck (1981). In- deed, since it uses the global approach, it is quite easy and straightforward to add constraints so it can be more precise. Also, it is simple enough to be computed in real time. 3.1 Horn-Schunck’s method As we explained in section 2.2, the method proposed by Horn and Schunck (1981) is a straight forward application of global methods principle to compute the optical flow. In addition to the basic equation of the optical flow (Equation 1.3), it adds the measure of departure from smoothness (Equation 2.1). Equation 2.2 can be seen as a problem leading to the following Euler-Lagrange equations : ∂(α2 Es + Ec ) 2 2 ∂ ∂(α2 Es + Ec ) 2 2 ∂ ∂(α2 Es + Ec ) 2 2 − − =0 (3.1) ∂u ∂x ∂ux ∂y ∂uy ∂(α2 Es + Ec ) 2 2 ∂ ∂(α2 Es + Ec ) 2 2 ∂ ∂(α2 Es + Ec ) 2 2 − − =0 (3.2) ∂v ∂x ∂vx ∂y ∂vy ∂2 ∂2 Which leads to the following equations (with ∆ = ∂x2 + ∂y 2 , the Laplacian operator) : Ex (Ex u + Ey v + Et ) − α2 ∆u = 0 (3.3) Ey (Ex u + Ey v + Et ) − α2 ∆v = 0 (3.4)
  • 11. 11 Implementation of a method Using a finite distance, we can approximate ∆u by ∆u = u − u where u is the local average ¯ ¯ of u. Therefore, Equation 3.3 and Equation 3.4 become : (α2 + Ex + Ey )(u − u) = −Ex (Ex u + Ey v + Et ) 2 2 ¯ ¯ ¯ (3.5) (α2 + Ex + Ey )(v − v ) = −Ey (Ex u + Ey v + Et ) 2 2 ¯ ¯ ¯ (3.6) From two consecutive frames, we can find Ex , Ey and Et . Also, we define u and v to be the ¯ ¯ weighted mean from u and v its 8-connectivity neighborhood. Having those terms, we can find an iterative solution to Equation 3.5 and Equation 3.6 : Ex un + Ey v n + Et ¯ ¯ un+1 = un − Ex ¯ (3.7) α2 + Ex + Ey 2 2 Ex un + Ey v n + Et ¯ ¯ v n+1 = v n − Ey ¯ 2 + E2 + E2 (3.8) α x y This iterative scheme is stopped when the mean evolution of (u, v) between two consecutive iteration is too small. That is : 1 (un+1 − un )2 + (vi,j − vi,j )2 < ε2 i,j i,j n+1 n (3.9) Nsites i,j We also give an upper bound to the number of iterations, in order to ensure that the algorithm finishes. We implemented this algorithm using Milena1 , leading to the results described in chapter 4. 3.2 Coarse-to-fine version In chapter 3, we have seen how to implement the method described by Horn and Schunck (1981) in its original version. In addition, we also implemented a multi-scale version of the algorithm. 3.2.1 Objectives One of our main objectives, described in 3 is to have a fast computation of the optical flow. Horn-Schunck method being iterative, a classical way to speed-up computations is to use a coarse-to-fine strategy. We downsample the original images (the two consecutive frames) repetitively a certain num- ber of time to obtain a pyramidal image illustrated by figure 3.1. We then run the algorithm to the lowest resolution image and find our iterative solution. This flow is then propagated to the image below it so we can start from this solution and not having to iterate from scratch. Another advantage of adapting this algorithm so it can be multiscale is the fact that we also want to recover large displacements. The original method from Horn and Schunck (1981) is likely to converge to a local minimum, leading to a wrong solution. With a multiscale approach, 1 http://lrde.epita.fr/cgi-bin/twiki/view/Olena/WebHome
3.2 Coarse-to-fine version

In section 3.1, we have seen how to implement the method described by Horn and Schunck (1981) in its original form. In addition, we also implemented a multi-scale version of the algorithm.

3.2.1 Objectives

One of our main objectives, stated at the beginning of this chapter, is a fast computation of the optical flow. The Horn-Schunck method being iterative, a classical way to speed up computation is to use a coarse-to-fine strategy. We repeatedly downsample the original images (the two consecutive frames) a certain number of times to obtain the pyramid of images illustrated by figure 3.1. We then run the algorithm on the lowest-resolution image and find our iterative solution. This flow is then propagated to the image below it, so we can start from that solution instead of iterating from scratch.

Figure 3.1: A pyramid of images. We start by computing the optical flow at the coarsest scale, then we propagate the results to the next scale before running the algorithm again. We keep doing so until we have reached the finest scale.

Another advantage of making this algorithm multiscale is that we also want to recover large displacements. The original method of Horn and Schunck (1981) is likely to converge to a local minimum, leading to a wrong solution.
With a multiscale approach, according to Fleet and Yair (2005), we avoid those local minima by guiding the convergence toward a global minimum from scale to scale. There are thus two main objectives that can be achieved by going multiscale:

• Our solution is faster to compute.
• We can recover large displacements by avoiding local minima.

3.2.2 A generic way to implement pyramidal representation

Using the genericity model provided by C++ and Olena, we can implement a pyramidal image that works for any source or destination value type. We therefore provide a simple class in which the images can be accessed and modified before being downsampled or propagated. Thanks to this genericity model, coarse-to-fine versions of classical iterative algorithms can be implemented easily; a sketch of such a class is given below. Having a generic pyramidal image is especially useful in our case, since most of the methods using the global approach provide an iterative solution. Since we only have genericity on value types, this generic model could be improved in two ways:

• We could add genericity on structures (for now, it only works for 2D images).
• We could even embed the algorithm inside the structure, so that one would only have to give a functor to the class.

These two points were not needed for our application, but implementing them could be an improvement in future work.
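As an illustration only (this is not the actual Olena class; the names, the 2x2 block-averaging downsampler and the nearest-neighbor propagation are assumptions), a minimal sketch of a value-type-generic pyramid could look like this:

#include <algorithm>
#include <vector>

// Minimal row-major image, generic on the value type V; a stand-in
// for the Olena 2D image types. V is assumed to support the usual
// arithmetic (e.g. float, or a small vector type with operators).
template <typename V>
struct image2d {
  int w = 0, h = 0;
  std::vector<V> data;
  image2d() = default;
  image2d(int w_, int h_) : w(w_), h(h_), data(w_ * h_) {}
  V& at(int x, int y) { return data[y * w + x]; }
  const V& at(int x, int y) const { return data[y * w + x]; }
};

// A pyramid of images: level 0 is the finest scale. Each coarser level
// is built by 2x2 block averaging (even dimensions assumed for brevity).
template <typename V>
class pyramid {
public:
  pyramid(const image2d<V>& finest, int n_levels) {
    levels_.push_back(finest);
    for (int l = 1; l < n_levels; ++l) {
      const image2d<V>& prev = levels_.back();
      image2d<V> next(prev.w / 2, prev.h / 2);
      for (int y = 0; y < next.h; ++y)
        for (int x = 0; x < next.w; ++x)
          next.at(x, y) = (prev.at(2 * x, 2 * y) + prev.at(2 * x + 1, 2 * y) +
                           prev.at(2 * x, 2 * y + 1) + prev.at(2 * x + 1, 2 * y + 1)) / 4;
      levels_.push_back(next);
    }
  }

  int n_levels() const { return static_cast<int>(levels_.size()); }
  image2d<V>& level(int l) { return levels_[l]; }

  // Propagate a flow image from a coarse level to the next finer one:
  // upsample by nearest neighbor and double the vector magnitudes so
  // that the finer iteration starts from the coarse solution.
  static image2d<V> propagate(const image2d<V>& coarse, int fine_w, int fine_h) {
    image2d<V> fine(fine_w, fine_h);
    for (int y = 0; y < fine_h; ++y)
      for (int x = 0; x < fine_w; ++x)
        fine.at(x, y) = coarse.at(std::min(x / 2, coarse.w - 1),
                                  std::min(y / 2, coarse.h - 1)) * 2;
    return fine;
  }

private:
  std::vector<image2d<V>> levels_;
};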
Chapter 4

Results

4.1 Methodology

4.1.1 Representation of the optical flow

In our algorithms, the optical flow is stored as an image of 2D vectors. Such an image cannot be displayed easily and needs its own storage format. We used the ideas provided by Baker et al. (2011).

The evaluation database. Baker et al. (2011) provide a database of consecutive frames with associated ground truth, on which we can benchmark the results found with our two versions of Horn-Schunck. We used the 8 sequences of images with public ground truth.

The .flo format. We also used their format to store our results. This has three advantages:

• We store our results in the same format as the ground truth.
• We use the very same format as many other papers, which allows direct comparison.
• We can submit our results to be evaluated by the authors.

This format is described by Baker¹ and has been implemented in Olena; a sketch of a writer for it is given below.

Image visualization. Baker et al. (2011) introduce a new way of visualizing the optical flow as a classical 2D color image. Most available optical flow visualization algorithms use the way colors are encoded (RGB, CIE...) to define a mapping from a vector to a color. This can give a wrong perception of distance, since the natural distance between two colors is not related to how they are encoded. Based on this observation, they use the work of Savard² to map the vectors to colors based on actual human perception.

¹ http://vision.middlebury.edu/flow/code/flow-code/README.txt
² http://members.shaw.ca/quadibloc/other/colint.htm
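For reference, a minimal sketch of a .flo writer following the layout described in Baker's README (a float sanity tag 202021.25, the 32-bit width and height, then row-major interleaved (u, v) float pairs, little-endian). This is a simplified stand-in, not the Olena implementation.

#include <cstdint>
#include <cstdio>
#include <vector>

// Write a flow field to the Middlebury .flo format, as described in
// http://vision.middlebury.edu/flow/code/flow-code/README.txt:
// a float sanity tag (202021.25, which reads as "PIEH" in ASCII),
// the width and height as 32-bit ints, then row-major interleaved
// (u, v) float pairs. Assumes a little-endian host.
bool write_flo(const char* filename, int width, int height,
               const std::vector<float>& u, const std::vector<float>& v) {
  std::FILE* f = std::fopen(filename, "wb");
  if (!f) return false;
  const float tag = 202021.25f;
  std::int32_t w = width, h = height;
  std::fwrite(&tag, sizeof tag, 1, f);
  std::fwrite(&w, sizeof w, 1, f);
  std::fwrite(&h, sizeof h, 1, f);
  for (int y = 0; y < height; ++y)
    for (int x = 0; x < width; ++x) {
      float uv[2] = { u[y * width + x], v[y * width + x] };
      std::fwrite(uv, sizeof(float), 2, f);
    }
  std::fclose(f);
  return true;
}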
Figure 4.1: The map from vectors to colors.

Figure 4.2: Two consecutive frames: (a) first frame, (b) second frame.
Figure 4.3: A sample computed optical flow for the frames of figure 4.2, with flow vectors mapped to the color wheel.

Figure 4.1 shows the actual map from vectors to colors, and figure 4.3 shows a sample result of this visualization method. In Olena, this is a simple fun::v2v that can be applied with data::transform.

4.1.2 Measures

There are two commonly used measures for the evaluation of an optical flow algorithm: the first is the angular error, the second the error in flow endpoint.

Angular error (AE). The angular error was introduced by Fleet and Jepson (1990) and is defined as the arccosine of the dot product of the two (space-time) vectors over the product of their norms:

AE = \cos^{-1}\left( \frac{1 + u \, u_{GT} + v \, v_{GT}}{\sqrt{1 + u^2 + v^2}\,\sqrt{1 + u_{GT}^2 + v_{GT}^2}} \right)    (4.1)

Error in flow endpoint (EE). We also compute the error in flow endpoint, defined by Otte and Nagel (1994) as:

EE = \sqrt{(u - u_{GT})^2 + (v - v_{GT})^2}    (4.2)

Baker et al. (2011) argue that this second measure should be favored, since AE penalizes errors in regions of zero motion more than errors in smooth non-zero motion regions. In the following, we report both.
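A direct transcription of (4.1) and (4.2) as a small self-contained helper (a sketch; the names are ours, and we report AE in degrees, a common convention):

#include <algorithm>
#include <cmath>

// Per-pixel error measures between a computed flow vector (u, v)
// and the ground truth (ugt, vgt), following (4.1) and (4.2).
struct FlowErrors {
  double ae;  // angular error, in degrees
  double ee;  // endpoint error, in pixels
};

FlowErrors flow_errors(double u, double v, double ugt, double vgt) {
  // Angular error: angle between the space-time vectors (u, v, 1)
  // and (ugt, vgt, 1); the clamp guards against rounding outside [-1, 1].
  double cos_angle = (1.0 + u * ugt + v * vgt) /
      (std::sqrt(1.0 + u * u + v * v) *
       std::sqrt(1.0 + ugt * ugt + vgt * vgt));
  cos_angle = std::max(-1.0, std::min(1.0, cos_angle));
  const double pi = std::acos(-1.0);
  double ae = std::acos(cos_angle) * 180.0 / pi;

  // Endpoint error: Euclidean distance between the two flow vectors.
  double ee = std::hypot(u - ugt, v - vgt);
  return { ae, ee };
}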
Table 4.1: Results in the form (Endpoint Error, Angular Error) against the ground truth for the original version, for each (α, ε) setting.

               (7, 0.1)     (7, 0.01)    (7, 0.001)   (10, 0.1)    (10, 0.01)   (10, 0.001)  (100, 0.1)   (100, 0.01)  (100, 0.001)
Dimetrodon     (1.94, 57)   (1.73, 48)   (1.58, 42)   (1.96, 59)   (1.78, 51)   (1.56, 42)   (2.06, 63)   (2.04, 63)   (1.84, 53.6)
Grove2         (3.02, 66)   (2.98, 63)   (2.95, 62)   (2.98, 66)   (2.91, 63)   (2.88, 61)   (3.08, 71)   (3.02, 68)   (2.78, 57)
Grove3         (3.78, 59)   (3.70, 55)   (3.68, 53)   (3.72, 59)   (3.61, 54)   (3.57, 52)   (3.90, 69)   (3.77, 63)   (3.48, 50)
Hydrangea      (3.48, 63)   (3.27, 56)   (3.17, 52)   (3.52, 66)   (3.27, 56)   (3.13, 51)   (3.73, 74)   (3.69, 72)   (3.40, 61)
RubberWhale    (0.97, 37)   (0.53, 18)   (0.38, 12)   (1.04, 40)   (0.59, 20)   (0.39, 13)   (1.25, 50)   (1.23, 49)   (0.98, 36)
Urban2         (8.22, 61)   (7.98, 53)   (7.90, 50)   (8.26, 63)   (8.02, 54)   (7.89, 50)   (8.39, 69)   (8.38, 69)   (8.26, 62)
Urban3         (7.16, 71)   (7.00, 65)   (6.92, 62)   (7.18, 73)   (6.99, 66)   (6.88, 61)   (7.30, 79)   (7.29, 78)   (7.13, 70)
Venus          (3.70, 63)   (3.57, 57)   (3.51, 55)   (3.69, 65)   (3.54, 58)   (3.45, 54)   (3.80, 71)   (3.74, 68)   (3.52, 58)
Mean time (s)  2.10         13.22        47.00        1.90         12.68        53.64        0.40         3.98         70.18

Table 4.2: Results in the form (Endpoint Error, Angular Error) against the ground truth for our multi-layer version, for each (α, ε) setting.

               (7, 0.1)     (7, 0.01)    (7, 0.001)   (10, 0.1)    (10, 0.01)   (10, 0.001)  (100, 0.1)   (100, 0.01)  (100, 0.001)
Dimetrodon     (1.68, 45)   (1.62, 43)   (1.56, 41)   (1.67, 46)   (1.62, 43)   (1.53, 40)   (1.74, 47)   (1.73, 47)   (1.60, 42)
Grove2         (2.77, 55)   (2.90, 60)   (2.92, 61)   (2.65, 51)   (2.80, 58)   (2.84, 60)   (2.53, 43)   (2.51, 43)   (2.47, 43)
Grove3         (3.56, 49)   (3.64, 52)   (3.66, 52)   (3.43, 47)   (3.52, 50)   (3.54, 51)   (3.26, 40)   (3.25, 39)   (3.24, 39)
Hydrangea      (2.92, 42)   (3.04, 47)   (3.13, 51)   (2.86, 39)   (2.94, 43)   (3.05, 48)   (2.88, 38)   (2.87, 37)   (2.74, 34)
RubberWhale    (0.82, 29)   (0.51, 17)   (0.37, 12)   (0.91, 33)   (0.55, 19)   (0.38, 12)   (1.08, 41)   (1.08, 41)   (0.84, 29)
Urban2         (7.86, 49)   (7.89, 50)   (7.89, 50)   (7.83, 47)   (7.86, 49)   (7.86, 49)   (7.78, 44)   (7.80, 43)   (7.85, 44)
Urban3         (6.73, 54)   (6.81, 58)   (6.84, 58)   (6.69, 53)   (6.74, 56)   (6.76, 57)   (6.68, 50)   (6.68, 50)   (6.67, 49)
Venus          (3.25, 45)   (3.39, 49)   (3.48, 53)   (3.19, 43)   (3.28, 47)   (3.39, 51)   (3.14, 41)   (3.13, 40)   (3.09, 39)
Mean time (s)  2.23         12.23        49.64        2.02         11.28        53.85        0.67         2.93         58.32

4.2 Results

For each of the 8 sequences, we compute 9 outputs by varying two parameters:

• α takes the values 7, 10 and 100;
• ε takes the values 0.1, 0.01 and 0.001.

Tables 4.1 and 4.2 show the full results, and figures 4.4, 4.5 and 4.6 show detailed results for Dimetrodon.

4.3 Analysis of the results

From the results on the 8 sequences with 9 parameter settings shown in tables 4.1 and 4.2, we can make several observations:

• The results are always better with the multiscale version of Horn and Schunck (1981).
• Sometimes, the single-layer version is faster.
• Only one parameter setting among the 9 tested could be used in our real-time application: α = 100 and ε = 0.1.
• Both error measures give the same ordering and seem equivalent.

As expected from section 3.2, the results are better with the multiscale version of the algorithm. Conversely, the fact that this second version is sometimes slower was not expected. We see two reasons for this:
Figure 4.4: The angular errors for Dimetrodon.

Figure 4.5: The endpoint errors for Dimetrodon.
Figure 4.6: The computing times for Dimetrodon.

• The time taken for the initialization of the pyramid is long. Indeed, we have to downsample the images.
• The number of scales might be application-dependent and needs to be tuned automatically.

Finally, the proposed setting (α = 100, ε = 0.1) with seven scales gives good results for a small cost in terms of time. This could be used in an inpainting application.
Conclusion

Optical flow algorithms have been the subject of many studies during the past years, leading to many improvements, as seen in chapter 2. We have chosen to implement one of the simplest methods, that of Horn and Schunck (1981), and to improve it with a coarse-to-fine version. We showed how this method could be suitable for a real-time application with appropriate parameters, and how a multi-scale version could improve the qualitative results by producing fewer errors than the original version.

Future work

Even though this method could be good enough as a prior for video inpainting, we still have some leads to explore, including:

• Reducing the domain on which we compute the optical flow. Indeed, since we want to use it for video inpainting, we do not need to compute it for areas too far from the region to inpaint.
• Improving the performance of the pyramidal algorithm so it could be faster. We should also try to be as generic as possible so that it would be reusable in other projects.
• Since the proposed method is fast enough, improving its qualitative performance by adopting a more sophisticated approach, as described in chapter 2.
Chapter 5

Bibliography

Baker, S., Scharstein, D., Lewis, J. P., Roth, S., Black, M. J., and Szeliski, R. (2011). A database and evaluation methodology for optical flow. International Journal of Computer Vision.

Brox, T. and Malik, J. (2011). Large displacement optical flow: Descriptor matching in variational motion estimation. IEEE Trans. Pattern Anal. Mach. Intell., 33(3):500-513.

Fleet, D. J. and Jepson, A. D. (1990). Computation of component image velocity from local phase information. International Journal of Computer Vision, 5:77-104. doi:10.1007/BF00056772.

Fleet, D. J. and Yair, W. (2005). Optical flow estimation. In Mathematical Models for Computer Vision: The Handbook.

Horn, B. K. P. and Schunck, B. G. (1981). Determining optical flow. Artificial Intelligence, 17:185-203.

Levi (2012). Fast structure preserving inpainting.

Lucas, B. D. and Kanade, T. (1981). An iterative image registration technique with an application to stereo vision. In Proceedings of the 7th International Joint Conference on Artificial Intelligence - Volume 2, pages 674-679, San Francisco, CA, USA. Morgan Kaufmann Publishers Inc.

Nir, T., Bruckstein, A. M., and Kimmel, R. (2008). Over-parameterized variational optical flow. Int. J. Comput. Vision, 76(2):205-216.

Otte, M. and Nagel, H. (1994). Optical flow estimation: Advances and comparisons. In Eklundh, J.-O., editor, Computer Vision ECCV '94, volume 800 of Lecture Notes in Computer Science, pages 49-60. Springer Berlin / Heidelberg.

Trobin, W., Pock, T., Cremers, D., and Bischof, H. (2008). An unbiased second-order prior for high-accuracy motion estimation. In Rigoll, G., editor, Pattern Recognition, volume 5096 of Lecture Notes in Computer Science, pages 396-405. Springer Berlin / Heidelberg.

Werlberger, M., Trobin, W., Pock, T., Wedel, A., Cremers, D., and Bischof, H. (2009). Anisotropic Huber-L1 optical flow. In Proceedings of the British Machine Vision Conference (BMVC), London, UK.