2. Definition of Tracking
• Tracking:
– Generate some conclusions about the motion of
the scene, objects, or the camera, given a
sequence of images.
– Knowing this motion, predict where things are
going to project in the next image, so that we
don’t have so much work looking for them.
3. Why Track?
• Detection and
recognition are expensive
• If we get an idea of
where an object is in
the image because we
have an idea of the
motion from previous
images, we need less
work detecting or
recognizing the object.
Scene
Image Sequence
Detection
+ recognition
Tracking
(less detection,
no recognition) New
person
A B
C
A B
C
4. Tracking a Silhouette by
Measuring Edge Positions
• Observations are positions of edges along normals to tracked contour
5. Why not Wait and Process the
Set of Images as a Batch?
• In a car system, detecting and tracking
pedestrians in real time is important.
• Recursive methods require less computing
6. Implicit Assumptions of
Tracking
• Physical cameras do not move instantly
from a viewpoint to another
• Object do not teleport between places
around the scene
• Relative position between camera and scene
changes incrementally
• We can model motion
8. The Problem: Signal Estimation
• We have a system with parameters
– Scene structure, camera motion, automatic zoom
– System state is unknown (“hidden”)
• We have measurements
– Components of stable “feature points” in the
images.
– “Observations”, projections of the state.
• We want to recover the state components from
the observations
9. Necessary Models
System Dynamics
Model
Measurement
Model (projection)
(u, v)
Previous State
Next State
State
Measurement
• We use models to describe a priori knowledge about
• the world (including external parameters of camera)
• the imaging projection process
10. State variable
a
A Simple Example of Estimation
by Least Square Method
• Goal: Find estimate of state
such that the least square
error between measurements
and the state is minimum
aˆ
∑=
−=
n
i
i axC
1
2
)(
2
1
∑∑ ==
−=−==
∂
∂ n
i
i
n
i
i anxax
a
C
11
ˆ)ˆ(0
∑=
=
n
i
ix
n
a
1
1
ˆ
a
Measurement
x
x
t
xi
a xi-a
ti
11. Recursive Least Square Estimation
• We don’t want to wait until
all data have been collected
to get an estimate of the
depth
• We don’t want to reprocess
old data when we make a
new measurement
• Recursive method: data at
step i are obtained from
data at step i-1
aˆ
State variable
a
Measurement
x
x
t
a
xi
1
ˆ −ia iaˆ
12. Recursive Least Square Estimation 2
• Recursive method: data at
step i are obtained from
data at step i-1
x
t
a
xi
)ˆ(
1
ˆˆ 11 −− −+= iiii ax
i
aa
i
i
k
k
i
k
ki x
i
x
i
x
i
a
111
ˆ
1
11
∑∑
−
==
+==
∑
−
=
−
−
=
1
1
1
1
1
ˆ
i
k
ki x
i
a
iii x
i
a
i
i
a
1
ˆ
1
ˆ 1 +
−
= −
13. Recursive Least Square Estimation 3
)ˆ(
1
ˆˆ 11 −− −+= iiii ax
i
aa
Estimate at step i
Predicted
measure
Innovation
Gain
Actual
measure
Gain specifies how much
do we pay attention
to the difference
between what we expected
and what we actually get
14. Least Square Estimation of the
State Vector of a Static System
1. Batch method
a
H1 Hi
H2
x1
xi
x2
aH ⋅= iix a
H
H
H
=
⇒
nnx
x
x
......
2
1
2
1
aHx =⇒
measurement equation
XHH)(Ha T1T −
=ˆ
Find estimate that minimizesaˆ
a)H(Xa)H(X T
−−=
2
1
C
We find
15. Least Square Estimation
of the State Vector
of a Static System 2
2. Recursive method
a
H1 Hi
H2
x1
xi
x2
)ˆˆˆ -1iiii-1ii aH(XKaa −+=
with
Calculation is similar to calculation of running average
Now we find:
We had: )ˆ(
1
ˆˆ 11 −− −+= iiii ax
i
aa
1
)( −
=
=
T
iii
T
iii
HHP
HPK
Gain matrix
Innovation
Predicted measure
17. )ˆˆˆ -1iiiii-1iii aH(xKaa Φ−+Φ=
Recursive Least Square
Estimation for a Dynamic System
(Kalman Filter)
-1i-1iii waa +Φ=
iiii vaHx +=
),0(~ ii Qw N
),0(~ ii Rv N
-1i-1i-1i-1i
-1i
T
i-1iii
-1
i
T
iii
T
iii
)P'HK(IP
QPP'
RHP'HHP'K
−=
+ΦΦ=
+= )(
State equation
Measurement equation
Tweak factor for model
Measurement noise
Prediction for xi
Prediction for ai
Gain
Covariance matrix for
prediction error
Covariance for estimation error
19. • Predict next state as using
previous step and dynamic model
• Predict regions
of next measurements using
measurement model and
uncertainties
• Make new measurements xi in
predicted regions
• Compute best estimate of next state
Tracking Steps
(u, v)Measurement
Prediction
region
-1iiaˆΦ
),ˆ( i-1iii P'aH ΦN
)ˆˆˆ -1iiiii-1iii aH(xKaa Φ−+Φ=
“Correction” of predicted state
20. Recursive Least Square Estimation for
a Dynamic System (Kalman Filter)
x
t
Measurement xi
State vector ai
Estimation iaˆ
21. Tracking as a Probabilistic
Inference Problem
• Find distributions for state vector ai and for
measurement vector xi. Then we are able to
compute the expectations and
• Simplifying assumptions (same as for HMM)
ixˆ
)|(),,,|( -1ii-1i21i aaaaaa PP =L
)|()|()|()|,,,( ikijiiiji axaxaxaxx PPPP KK =
(Conditional independence of
measurements given a state)
(Only immediate past matters)
iaˆ
22. Tracking as Inference
• Prediction
• Correction
• Produces same results as least square approach if
distributions are Gaussians: Kalman filter
• See Forsyth and Ponce, Ch. 19
∫= -1i-1i1-1i-1ii-1i1i axxaaaxxa dPPP ),,|()|(),,|( LL
∫
=
i-1i1iii
-1i1iii
i1i
axxaax
xxaax
xxa
dPP
PP
P
),,|()|(
),,|()|(
),,|(
L
L
L
23. )ˆ(ˆˆ 11 i-iii-i afhxKafa −+=
Kalman Filter for 1D Signals
11 i-i-i wafa +=
iii vahx +=
)0(~ ,qNwi
)0(~ ,rNvi
111
1
2
12
')1(
)(
i-ii-
ii
-
iii
phKp
qpfp'
rp'hhp'K
−
−
−=
+=
+=
State equation
Measurement equation
Tweak factor for model
Measurement noise
Prediction for xi
Prediction for ai (a priori estimate)
Gain
Standard deviation for
prediction error
St.d. for estimation error
24. Applications: Structure from Motion
• Measurement vector components:
– Coordinates of corners, “salient
points”
• State vector components:
– Camera motion parameters
– Scene structure
• Is there enough equations?
– N corners, 2N measurements
– N unknown state components from structure
(distances from first center of projection to
3D points)
– 6 unknown state components from motion
(translation and rotation)
– More measurements than unknowns for every
frame if N>6 (2N > N + 6)
xi
• Batch methods
• Recursive methods
(Kalman filter)
25. Problems with Tracking
• Initial detection
– If it is too slow we will never catch up
– If it is fast, why not do detection at every frame?
• Even if raw detection can be done in real time, tracking
saves processing cycles compared to raw detection.
The CPU has other things to do.
• Detection is needed again if you lose tracking
• Most vision tracking prototypes use initial
detection done by hand
(see Forsyth and Ponce for discussion)
26. References
• Kalman, R.E., “A New Approach to Linear Prediction
Problems”, Transactions of the ASME--Journal of Basic
Engineering, pp. 35-45, March 1960.
• Sorenson, H.W., “Least Squares Estimation: from Gauss to
Kalman”, IEEE Spectrum, vol. 7, pp. 63-68, July 1970.
• http://www.cs.unc.edu/~welch/kalmanLinks.html
• D. Forsyth and J. Ponce. Computer Vision: A Modern
Approach, Chapter 19.
http://www.cs.berkeley.edu/~daf/book3chaps.html
• O. Faugeras.. Three-Dimensional Computer Vision. MIT
Press. Ch. 8, “Tracking Tokens over Time”.