4. The Problem
Aim: Given a frame sequence from a fixed
camera, detect all the foreground objects.
Approach: detect the foreground as the
difference between the current frame and an
image of the scene’s static background.
|frame_i – background_i| > Th
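The thresholding rule above can be sketched in a few lines. This is a minimal illustration, assuming 8-bit grayscale frames held as NumPy arrays; the function name foreground_mask and the toy values are hypothetical and not taken from the slides.

```python
import numpy as np

def foreground_mask(frame, background, threshold):
    """Return a boolean mask that is True where |frame - background| > threshold."""
    # Cast to a signed type so subtracting uint8 images cannot wrap around.
    diff = np.abs(frame.astype(np.int16) - background.astype(np.int16))
    return diff > threshold

# Toy example (assumed values): a flat background with a brighter 2x2 object.
background = np.full((4, 4), 100, dtype=np.uint8)
frame = background.copy()
frame[1:3, 1:3] = 180
mask = foreground_mask(frame, background, threshold=25)
```

Here mask is True only at the four object pixels, where the difference exceeds the threshold Th.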
6. Various Techniques
Running Gaussian Average
Temporal Median Filter
Mixture of Gaussians (MoG)
Kernel Density Estimation (KDE)
Sequential Kernel Density Approximation (SKDA)
Co-occurrence of Image Variance
Eigenbackgrounds
9. Mixture of Gaussians
Problems with other techniques [3]
1. Averaging images over time
-Not robust to scenes with slow-moving objects
-Single threshold for the entire scene
2. Modelling each pixel using a Kalman filter
-Not robust to backgrounds with repetitive change
-Takes significant time to re-establish the background
3. Modelling each pixel using a single Gaussian
-Good indoor performance
-Not good enough for outdoor scenes with repetitive
change
10. Mixture of Gaussians
Hypothesis:
If we model each pixel as a mixture of Gaussians to
determine whether or not a pixel is part of the background, then
we will arrive at an effective approach to separating the
background and foreground, which can be used for real-time
tracking. [3]
Used in tracking of moving objects:
1. Separating Foreground from Background (our agenda)
2. Tracking Objects in Foreground (not in the scope)
12. Background Subtraction -MoG
Pixel process: the history of the values of pixel ‘X’
from frame 1 to ‘t’. [4]
Part 1: For the above pixel process of ‘X’, the
probability of observing the value X(t) at frame ‘t’ is as
follows,
$$P(X_t) = \sum_{i=1}^{K} \omega_{i,t} \, \eta(X_t, \mu_{i,t}, \Sigma_{i,t})$$
where η is the Gaussian probability density function and
K = no. of Gaussians in the mixture (generally 3-5). [4]
weight(i,t), ω = estimate of the weight of the i-th Gaussian in the mixture at time ‘t’
mean(i,t), μ = mean value of the i-th Gaussian at time ‘t’
Covariance(i,t), Σ = covariance matrix of the i-th Gaussian in the mixture
= Variance * Identity matrix
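As a rough illustration of Part 1, the sketch below evaluates the mixture probability for one pixel value as a weighted sum of K Gaussian densities, each with covariance Variance * Identity as defined above. The function names and the sample weights, means and variances are assumptions made for the example.

```python
import numpy as np

def gaussian_pdf(x, mean, var):
    """Density at x of a Gaussian with covariance var * I (isotropic)."""
    x = np.atleast_1d(np.asarray(x, dtype=float))
    mean = np.atleast_1d(np.asarray(mean, dtype=float))
    d = x.size  # dimensionality: 1 for grayscale, 3 for colour
    sq_dist = np.sum((x - mean) ** 2)
    return np.exp(-0.5 * sq_dist / var) / ((2.0 * np.pi * var) ** (d / 2.0))

def mixture_probability(x, weights, means, variances):
    """Weighted sum of the K Gaussian densities evaluated at pixel value x."""
    return sum(w * gaussian_pdf(x, m, v)
               for w, m, v in zip(weights, means, variances))

# Assumed example: K = 3 grayscale components for a single pixel.
weights = [0.6, 0.3, 0.1]
means = [100.0, 150.0, 200.0]
variances = [25.0, 36.0, 400.0]
p = mixture_probability(100.0, weights, means, variances)
```

A value close to a heavily weighted mean (here 100) gives a higher probability than a value far from every component.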
13. Background Subtraction -MoG
Decision making: For each Gaussian in the mixture
of pixel X
if pixel X is within 2.5 standard deviations of the mean (about 98.8% of the
probability mass of a one-dimensional Gaussian), then the Gaussian is said to be ‘matched’
- Increase the weight
- Adjust the mean closer to X(t)
- Decrease the Variance
Else the Gaussian is ‘unmatched’
- Decrease the weight
If all the Gaussians in the mixture for pixel X(t) are unmatched
- Mark X(t) as foreground pixel
- Find the least probable Gaussian in the mixture and replace it with a new
Gaussian with the following parameters:
- Mean = X(t), i.e. the present value of X
- Variance = a high value
- Weight = a low value
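The decision step above can be sketched for a single grayscale pixel as follows. Several details are assumptions made for illustration: the "least probable" Gaussian is taken to be the one with the lowest weight, the initial variance (900) and weight (0.05) are arbitrary "high" and "low" values, and the weights are renormalised after a replacement.

```python
import numpy as np

def match_components(x, means, variances, k_sigma=2.5):
    """True for each Gaussian whose mean lies within k_sigma std devs of x."""
    return np.abs(x - means) <= k_sigma * np.sqrt(variances)

def replace_if_unmatched(x, weights, means, variances,
                         init_var=900.0, init_weight=0.05):
    """Mark x as foreground if no Gaussian matches, replacing the weakest one.

    The float arrays weights, means and variances are modified in place.
    """
    matched = match_components(x, means, variances)
    is_foreground = not matched.any()
    if is_foreground:
        worst = int(np.argmin(weights))  # assumption: least probable = lowest weight
        means[worst] = x                 # mean = present value of X
        variances[worst] = init_var      # high variance
        weights[worst] = init_weight     # low weight
        weights /= weights.sum()         # assumption: keep weights summing to 1
    return matched, is_foreground

# Assumed example mixture for one pixel.
weights = np.array([0.7, 0.2, 0.1])
means = np.array([100.0, 150.0, 200.0])
variances = np.array([25.0, 25.0, 25.0])

matched_a, fg_a = replace_if_unmatched(105.0, weights, means, variances)
matched_b, fg_b = replace_if_unmatched(30.0, weights, means, variances)
```

The value 105 matches the first Gaussian, so the pixel is not foreground. The value 30 matches none, so it is marked as foreground and the lowest-weight Gaussian is re-initialised around it.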
14. Background Subtraction -MoG
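Part 2: Updating the parameters and using a suitable
heuristic to select the distributions that represent background
pixels
Based on the decision made, change the following parameters using the
equations given below (as given by Stauffer and Grimson [4]):
$$\omega_{i,t} = (1-\alpha)\,\omega_{i,t-1} + \alpha\,M_{i,t}$$
$$\mu_{i,t} = (1-\rho)\,\mu_{i,t-1} + \rho\,X_t$$
$$\sigma_{i,t}^2 = (1-\rho)\,\sigma_{i,t-1}^2 + \rho\,(X_t-\mu_{i,t})^T (X_t-\mu_{i,t})$$
$$\rho = \alpha\,\eta(X_t, \mu_{i,t-1}, \Sigma_{i,t-1})$$
Where
- α (alpha) is the learning parameter.
- M(i,t) is set to 1 for the model that matched and 0 for the rest
- µ (mean) and σ (std deviation) remain the same for unmatched distributions
and change for the matched distributions
- ρ is the updating parameter
- η is the Gaussian probability density function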
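A minimal sketch of the parameter update for one grayscale pixel is given below, following the update rules of Stauffer and Grimson [4] as summarised in the comments. The function name, the value alpha = 0.1 and the example mixture are assumptions for illustration only.

```python
import numpy as np

def update_mixture(x, weights, means, variances, matched, alpha=0.01):
    """Update one pixel's mixture in place for the new value x.

    weights:   w_t  = (1 - alpha) * w_{t-1} + alpha * M   (M = 1 if matched, else 0)
    means:     mu_t = (1 - rho) * mu_{t-1} + rho * x       (matched only)
    variances: s2_t = (1 - rho) * s2_{t-1} + rho * (x - mu_t)**2   (matched only)
    with rho = alpha * N(x | mu_{t-1}, s2_{t-1}).
    """
    weights[:] = (1.0 - alpha) * weights + alpha * matched.astype(float)
    for i in np.flatnonzero(matched):
        density = (np.exp(-0.5 * (x - means[i]) ** 2 / variances[i])
                   / np.sqrt(2.0 * np.pi * variances[i]))
        rho = alpha * density
        means[i] = (1.0 - rho) * means[i] + rho * x
        # The variance update uses the freshly updated mean.
        variances[i] = (1.0 - rho) * variances[i] + rho * (x - means[i]) ** 2
    return weights, means, variances

# Assumed example: two components, only the first one matches x = 110.
weights = np.array([0.5, 0.5])
means = np.array([100.0, 200.0])
variances = np.array([100.0, 100.0])
matched = np.array([True, False])
update_mixture(110.0, weights, means, variances, matched, alpha=0.1)
```

After the call the matched component has a larger weight and a mean slightly closer to 110, while the unmatched component keeps its mean and variance and loses weight.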
15. Background Subtraction -MoG
Advantages:
Robust against movements that are part of the
background, e.g. moving branches of a tree
Robust against rain, snow, etc.
Disadvantages:
Poor subtraction when shadows are present
Difficulty with overlapping objects
Fast lighting changes are also an issue
Gives false positives
17. Implementation
Frame Difference: the easiest method
o Misses the interior of objects with uniformly
distributed intensity
o Objects must be moving all the
time!
Computationally cheap
Highly adaptive background
model
Threshold value must be tuned (= 25 in
our example)
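The frame-difference method and its two weaknesses can be reproduced on a toy example. This is a sketch under assumed data: a single row of 10 pixels, a uniform object of intensity 200 on a background of 50, and the threshold of 25 from the slide. The function name is hypothetical.

```python
import numpy as np

def frame_difference(prev_frame, curr_frame, threshold=25):
    """Foreground mask from the absolute difference of consecutive frames."""
    diff = np.abs(curr_frame.astype(np.int16) - prev_frame.astype(np.int16))
    return diff > threshold

def make_frame(start, stop, width=10):
    """One-row frame with a uniform object occupying columns start..stop-1."""
    frame = np.full((1, width), 50, dtype=np.uint8)
    frame[0, start:stop] = 200
    return frame

prev_frame = make_frame(2, 6)     # object at columns 2-5
moving_frame = make_frame(3, 7)   # object shifted right by one pixel
stopped_frame = make_frame(2, 6)  # object did not move

moving_mask = frame_difference(prev_frame, moving_frame)
stopped_mask = frame_difference(prev_frame, stopped_frame)
```

For the moving object only the trailing and leading edges (columns 2 and 6) are flagged, so the uniform interior is missed. For the stopped object nothing is flagged, so it disappears into the background.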
18. Implementation
MoG: a complex method
o Parametric model
o Mixture of Gaussian
components
o Comparing the pixel value with
the Gaussian components tracking it
Very good at separating objects
Suppresses background noise
Requires parameter optimisation
Background model does not adapt
quickly enough
20. References
1. Tarun Baloch, MSc Thesis ‘Background Subtraction in Highly Illuminated Indoor Environment’, Indian Institute of Technology, 2010
2. M. Piccardi. Background Subtraction techniques: A Review. In IEEE International Conference on Systems, Man and Cybernetics, 2004, Volume 4, pages 3099–3104, 2005. URL http://dx.doi.org/10.1109/ICSMC.2004.1400815.
3. Eric Thul, ECSE-626 Final Project: An evaluation of Chris Stauffer and W. E. L. Grimson’s method for background subtraction, 2008, www.cs.mcgill.ca/~ethul/pub/course/ecse626/project-report.pdf
4. C. Stauffer and W. E. L. Grimson. Adaptive background mixture models for real-time tracking. IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 1999, 2:252, 1999. URL http://doi.ieeecomputersociety.org/10.1109/CVPR.1999.784637.
5. A A Mazeed, Mark Nixon and Steve Gunn, Classifiers Combination for Improved Motion Segmentation, 2004, eprints.ecs.soton.ac.uk › ECS › Research › Publications
6. Omar Javed, Khurram Shafique and Mubarak Shah, A hierarchical approach to robust background subtraction using colour and gradient information, 2002, visionnas2.cs.ucf.edu/papers/javed_wmvc_2002.pdf
7. Sen-Ching S. Cheung and Chandrika Kamath, Robust techniques for background subtraction in urban traffic video, 2004, www.llnl.gov/casc/sapphire/pubs/UCRL-CONF-200706.pdf
Editor's notes
Background subtraction is a computer vision process for extracting foreground objects in a particular scene. It identifies moving objects from the portion of the video frame that differs from the background model. It is basically a class of techniques for segmenting out objects of interest in a scene for applications such as surveillance.
Aim – given a frame sequence from a fixed camera, detect all the foreground objects. Approach – detect the foreground objects as the difference between the current frame and an image of the scene’s static background. The value of the threshold is adjusted according to the conditions.
RGA – proposed by Wren et al., who modelled the background independently at each pixel location. The model was based on fitting a Gaussian pdf to the last n values.
TMF – uses the median value of the last n frames as the background model.
MoG – in 1999 Stauffer & Grimson proposed a multi-valued background model able to cope with multiple background objects.
KDE – Elgammal et al. proposed this technique, a non-parametric model based on KDE that uses a buffer of the last n background values.
SKDA – uses the mean-shift vector to detect the main modes of the true pdf directly from sample data with a minimum set of assumptions.
COIV – here, neighbouring blocks of pixels belonging to the background experience similar variations over time.
EBG – based on eigenvalue decomposition applied to the whole image.
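As a small illustration of the temporal median filter described in these notes, the sketch below builds a background model as the per-pixel median of the last n frames. The function name and the toy frames are assumptions, not material from the presentation.

```python
import numpy as np

def temporal_median_background(frames):
    """Per-pixel median over a buffer of the last n frames."""
    return np.median(np.stack(frames, axis=0), axis=0)

# Assumed example: 5 frames of a flat scene; an object passes through
# pixel (0, 0) in two of them.
frames = [np.full((2, 2), 100, dtype=np.uint8) for _ in range(5)]
frames[1][0, 0] = 250
frames[3][0, 0] = 240
background = temporal_median_background(frames)
```

Because the object covers the pixel in fewer than half of the buffered frames, the median still recovers the background value of 100 there.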
Speed
MoG – m = no. of Gaussian distributions used (3-5)
KDE – n is typically as high as 100
SKDA – m = no. of modes of the approximated pdf
COIV – n = nearest neighbours, N2 = spreads the cost over pixels
EBG – M = no. of eigenvectors
Memory
COIV – n = no. of training models, k = dimension of training models
Running Gaussian average and the median filter offer acceptable accuracy while achieving a high frame rate and having limited memory requirements. MoG and KDE show very good model accuracy. KDE has a high memory requirement (in the order of 100 frames), which might prevent easy implementation on low-memory devices. SKDA is an approximation of KDE that proves almost as accurate, but reduces the memory requirement by an order of magnitude and has lower time complexity. COIV and EBG explicitly address spatial correlation. They both offer good accuracy at reasonable time and memory complexity.
A good background model should also react quickly to changes in the background and adapt itself to accommodate them.
- i.e. for every pixel we find the sum of the probabilities of it lying in each of the ‘k’ Gaussians, and based on this probability we make the decision ‘matched’ or ‘unmatched’. - Background subtraction is a fundamental process and the first step for any tracking, matching or recognition problem in computer vision.
- These images are taken from Eric Thul’s final project report [3], which uses Stauffer’s algorithm based on MoG. - For the general appreciation of the audience: the original scene is broken down into ‘background’ and ‘foreground’, and then the object in the foreground is tracked.
Defining the pixel process: the history of pixel X. The background subtraction is further divided into two parts. Part 1 is calculating the probability of the pixel value being observed in frame ‘t’, which is equal to the weighted sum of the probabilities of it appearing in each of the ‘k’ Gaussians.
- Based on the probability, a decision is made: matched or not matched. If it is matched, we increase the weight, adjust the mean closer to the value of X at frame ‘t’ and decrease the variance. If it doesn’t match, we decrease the weight for the next frame. Finally, if none of the ‘k’ distributions match the current pixel value, the least probable distribution is replaced with a distribution with the new mean and variance.
Most of the time, the examples these methods are tested on fall into two major categories, indoor and outdoor scenes, and outdoor scenes are clearly harder to handle. Outdoor scenes contain different moving objects, shades and shadows of different objects, and objects that move, stop and move again, such as cars and pedestrians. Objects cover each other, such as trees and buildings, and may shift position in the wind. Stationary objects can show high-frequency movements within a small range, such as the leaves of a tree, and there are many other problems of this sort. We should keep in mind that the method should be chosen according to the application.
In frame difference, the previous frame is simply subtracted from the current frame, and if the difference in pixel values for a given pixel is greater than a threshold, the pixel is considered part of the foreground. A big problem with this method, as we can see, is that for objects with uniformly distributed intensity values, such as car sides, the interior parts are interpreted as background although they are foreground. Another problem is that, since this method considers only the previous frame, objects must be moving all the time. If a car stops at a junction and stays still for more than one frame period, it becomes background. On the other hand, the computational load is very light. The background model is highly adaptive: since it depends only on the previous frame, it can adapt to changes in the background faster than any other method. This is why frame difference does a very good job of cancelling noise. Please note the leaves of the tree! The more complex Mixture of Gaussians method also does a very good job there. A challenge for this method is the threshold value, which is normally tuned manually and empirically.
In contrast with frame difference, in MoG the background is not a frame of values. It is a parametric model: each pixel location is represented by a number (a mixture) of Gaussian functions that sum together to form a probability distribution function. To determine whether a pixel is part of the background, we compare it to the Gaussian components tracking it. If the pixel value is within a scaling factor of a background component's standard deviation, it is considered a background pixel; otherwise it is foreground. As can be seen, MoG is very good at separating objects and suppressing background noise such as waving trees. However, there are some points where the method breaks down, allowing the background to seep into the foreground. These points correspond to relatively rapid changes in illumination, because the background model does not adapt quickly enough. This shows one of the main challenges of MoG: parameter optimisation. Several parameters must be tuned to get the best result, and adjusting them slightly might have produced a better result.
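The background test described in this note can be sketched as follows. The notes do not spell out how the background components are chosen, so this example assumes the heuristic from Stauffer and Grimson [4]: components are ranked by weight divided by standard deviation, and the first ones whose cumulative weight exceeds a threshold T are treated as background. The function names, T = 0.7 and the example mixture are assumptions.

```python
import numpy as np

def background_components(weights, variances, T=0.7):
    """Indices of the components treated as background.

    Components are ranked by weight / std dev (high weight, low variance
    first); the smallest prefix whose cumulative weight exceeds T is kept.
    """
    order = np.argsort(-(weights / np.sqrt(variances)))
    cumulative = np.cumsum(weights[order])
    # Index of the first cumulative weight strictly greater than T.
    first_over = int(np.searchsorted(cumulative, T, side="right"))
    return order[:first_over + 1]

def is_background(x, weights, means, variances, T=0.7, k_sigma=2.5):
    """True if x lies within k_sigma std devs of any background component."""
    idx = background_components(weights, variances, T)
    return bool(np.any(np.abs(x - means[idx]) <= k_sigma * np.sqrt(variances[idx])))

# Assumed example mixture for one grayscale pixel.
weights = np.array([0.6, 0.3, 0.1])
means = np.array([100.0, 150.0, 200.0])
variances = np.array([25.0, 25.0, 400.0])

near_background = is_background(152.0, weights, means, variances)
weak_component = is_background(200.0, weights, means, variances)
```

The value 152 matches the second component, which belongs to the background set, so it is classed as background. The value 200 matches only the low-weight, high-variance third component, so it is classed as foreground.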
A common approach to identifying moving objects is background subtraction. Identifying moving objects from a video sequence is a fundamental and critical task in video surveillance, traffic monitoring and analysis, human detection and tracking, and gesture recognition in human-machine interfaces.