Background subtraction is an important task for visual surveillance systems. However, the task becomes more complex as the data size grows, since real-world scenarios require larger volumes of data to be processed more efficiently and, in some cases, continuously. Until now, most background subtraction algorithms have been designed for monochromatic or trichromatic cameras operating within the visible spectrum or the near-infrared part. Recent advances in multispectral imaging technologies make it possible to record multispectral videos for video surveillance applications. Due to the specific nature of these data, many of the bands within multispectral images are often strongly correlated. In addition, processing multispectral images with hundreds of bands can be computationally burdensome. To address these major difficulties of multispectral imaging for video surveillance, this paper proposes an online stochastic framework for tensor decomposition of multispectral video sequences (OSTD). First, experimental evaluations on synthetically generated data show the robustness of OSTD compared with other state-of-the-art approaches. Then, we apply the same idea to seven multispectral video bands to show that RGB features alone are not sufficient to handle color saturation, illumination variations, and shadows, whereas the addition of six visible spectral bands together with one near-infrared band provides better background/foreground separation.
Online Stochastic Tensor Decomposition for Background Subtraction in Multispectral Video Sequences
1. Online Stochastic Tensor Decomposition for
Background Subtraction in Multispectral
Video Sequences
Andrews Sobral¹, Sajid Javed², Soon Ki Jung², Thierry Bouwmans¹, and
El-hadi Zahzah¹
¹Laboratoire MIA (Mathématiques, Image et Applications)
Université de La Rochelle, France
²Virtual Reality Laboratory, School of Computer Science and Engineering
Kyungpook National University, Republic of Korea
18 December, 2015
3. • What is a tensor?
– multi-dimensional numerical array
• generalization of conventional arrays
– Matrix
o second-order tensor: rank-2 tensor
– Vector
o first-order tensor: rank-1 tensor
• Higher-order tensors (order ≥ 3): store data in a
multi-dimensional array
– Main operation
• unfolding or matricization
– reformatting tensors into matrices
o frontal, vertical, and horizontal
Introduction
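The unfolding (matricization) operation above can be sketched in NumPy. This is a minimal illustration, not the authors' code: the helper names `unfold` and `fold` are hypothetical, and the column ordering follows NumPy's default row-major layout, which may differ from the ordering used in some tensor textbooks.

```python
import numpy as np

def unfold(tensor, mode):
    """Mode-`mode` unfolding: rows index the chosen mode, columns index all other modes."""
    return np.moveaxis(tensor, mode, 0).reshape(tensor.shape[mode], -1)

def fold(matrix, mode, shape):
    """Inverse of `unfold` for a tensor whose full shape is `shape`."""
    moved_shape = (shape[mode],) + tuple(s for i, s in enumerate(shape) if i != mode)
    return np.moveaxis(matrix.reshape(moved_shape), 0, mode)

# Toy third-order tensor standing in for a (height x width x frames) video.
video = np.arange(4 * 5 * 6, dtype=float).reshape(4, 5, 6)

unfoldings = [unfold(video, m) for m in range(video.ndim)]
shapes = [u.shape for u in unfoldings]
roundtrip_ok = all(
    np.array_equal(fold(u, m, video.shape), video) for m, u in enumerate(unfoldings)
)
```

Each of the three unfoldings rearranges the same 120 entries into a matrix whose row count equals the size of the selected mode, and folding restores the original tensor.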
4. • Video or sequence of images as a tensor
Introduction-cont..
5. • Is it possible to decompose a tensor for background subtraction?
– 2 components
• Multi-dimensional low-rank tensor (corresponds to the
background model)
• Multi-dimensional sparse tensor (corresponds to moving objects)
• Matrix-based decomposition
– a matrix captures only a single channel (i.e., grayscale)
– spatial correlation loss
• erroneous foreground regions
• Tensor-based decomposition
– multi-dimensional data is considered (3rd- or 4th-order tensor)
– multi-aspect generalization of matrices
Tensor Decomposition
6. • Example: background subtraction via tensor decomposition under convex
optimization framework
Tensor Decomposition
[Figure: frontal slices of the input, low-rank, and sparse tensors, with the resulting foreground mask]
7. • Methods
– Tucker/HOSVD
– CANDECOMP-PARAFAC(CP)
– NTF (Non-negative Tensor Factorization)
– NTD (Non-negative Tucker Decomposition)
– NCP (Non-negative CP Decomposition)
• Major Challenges
– Batch optimization
– Higher Order SVD computation
– Computational complexities
– Designed for only monochromatic (i.e., grayscale) or trichromatic (i.e.,
RGB) cameras.
– Real-time processing is not achievable
Tensor Decomposition
8. • Is it possible to design an “Online Tensor Decomposition method for RGB as
well as Multispectral bands for background subtraction”?
– Main contributions
• Online Stochastic framework for Tensor Decomposition (OSTD)
– computationally efficient
– lower memory cost
• OSTD for Multi-Spectral Video Sequences (MSVS)
– RGB is not sufficient for color saturation, shadows, and reflections
– Multi-spectral bands can improve foreground segmentation
Proposed Methodology
10. • Consider an $N$th-order observation tensor
– corrupted by outliers
• Main assumption
– can be reconstructed by the combination of
• a low-rank component,
• a sparse component,
– convex optimization framework
• $\|\cdot\|_*$ represents the nuclear norm of the $i$th mode
• $\|\cdot\|_1$ represents the $\ell_1$ norm
• Stochastic/online optimization proposed by [Feng et al. 2013]
OSTD: The Model
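The objective itself did not survive in this transcript. A plausible form, consistent with the bullets above (a nuclear norm per mode, an $\ell_1$ norm, and the weights $\lambda_1$, $\lambda_2$ that appear later), is sketched here. The symbols $\mathcal{A}$ (observation tensor), $\mathcal{X}$ (low-rank component), $\mathcal{E}$ (sparse component), and the superscript $[i]$ for the mode-$i$ unfolding are assumptions introduced for illustration, not notation confirmed by the slides.

```latex
\min_{\mathcal{X},\,\mathcal{E}} \;
\sum_{i=1}^{N} \left(
  \tfrac{1}{2} \left\| A^{[i]} - X^{[i]} - E^{[i]} \right\|_F^2
  + \lambda_1 \left\| X^{[i]} \right\|_*
  + \lambda_2 \left\| E^{[i]} \right\|_1
\right)
```

The first term ties the two components to the observation, the nuclear norm encourages a low-rank background in every mode, and the $\ell_1$ norm encourages a sparse foreground.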
11. • Main notion
– process only one frame per time instance t
• MSVS: process each band k
• Nuclear norm is reformulated
– Decompose the nuclear norm into
• an explicit product of basis and coefficients
• the reformulated norm proposed by [Feng et al. 2013] is used
– p is the ambient dimension and r is the rank
• Stochastic optimization
OSTD: Online Optimization
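$$\|X^{[i]}\|_* = \inf_{L_i \in \mathbb{R}^{p \times r},\; R_i \in \mathbb{R}^{n \times r}} \left\{ \tfrac{1}{2}\left( \|L_i\|_F^2 + \|R_i\|_F^2 \right) \;:\; X^{[i]} = L_i R_i^T \right\}$$
where $X^{[i]}$ denotes the mode-$i$ unfolding of the low-rank component.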
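The factorized form of the nuclear norm can be checked numerically. The sketch below assumes a small random rank-r matrix and builds one particular factor pair from the SVD, for which the factorized expression equals the sum of singular values. The sizes and variable names are illustrative only.

```python
import numpy as np

rng = np.random.default_rng(0)
p, n, r = 8, 6, 3

# Random p x n matrix of rank r.
X = rng.standard_normal((p, r)) @ rng.standard_normal((r, n))

# Balanced factors from the truncated SVD: L = U sqrt(S), R = V sqrt(S).
U, s, Vt = np.linalg.svd(X, full_matrices=False)
U, s, Vt = U[:, :r], s[:r], Vt[:r]
L = U * np.sqrt(s)      # p x r basis
R = Vt.T * np.sqrt(s)   # n x r coefficients

nuclear_norm = np.linalg.norm(X, ord="nuc")
factored_value = 0.5 * (np.linalg.norm(L, "fro") ** 2 + np.linalg.norm(R, "fro") ** 2)
```

Because the reformulation only involves Frobenius norms of L and R, it can be optimized one sample at a time, which is what makes the online scheme possible.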
12. • Advantages
– no batch processing
– iteratively update the basis
– used for each 𝑖 𝑡ℎ mode
• Major Processing: 3 Steps
– Low-rank approximation
• Initialize the basis, L
– Bilateral Random Projections (BRP) method
o A₁, A₂ are random matrices; Y₁, Y₂ are the corresponding projections of the data
o speeds up low-rank recovery: fast convergence
• useful when the singular values decay slowly
OSTD cont…
$$L = Y_1 \left( A_2^T Y_1 \right)^{-1} Y_2^T$$
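A minimal sketch of a bilateral random projection, assuming the standard construction $Y_1 = X A_1$, $Y_2 = X^T A_2$ with Gaussian $A_1 \in \mathbb{R}^{n \times r}$ and $A_2 \in \mathbb{R}^{m \times r}$, and the approximation $Y_1 (A_2^T Y_1)^{-1} Y_2^T$. These definitions are not spelled out on the slides, and the function name `brp_low_rank` is hypothetical. How the basis is then extracted from the approximation is also not stated in the slides.

```python
import numpy as np

def brp_low_rank(X, r, rng):
    """Rank-r approximation of X from two random projections (no SVD of X needed)."""
    m, n = X.shape
    A1 = rng.standard_normal((n, r))   # right random projection matrix
    A2 = rng.standard_normal((m, r))   # left random projection matrix
    Y1 = X @ A1                        # m x r projection of the column space
    Y2 = X.T @ A2                      # n x r projection of the row space
    # Y1 (A2^T Y1)^{-1} Y2^T, computed with a linear solve instead of an explicit inverse.
    return Y1 @ np.linalg.solve(A2.T @ Y1, Y2.T)

rng = np.random.default_rng(0)
X = rng.standard_normal((50, 3)) @ rng.standard_normal((3, 40))  # exactly rank 3
X_hat = brp_low_rank(X, r=3, rng=rng)
relative_error = np.linalg.norm(X_hat - X) / np.linalg.norm(X)
```

For a matrix whose rank is exactly r the projection recovers it up to floating-point error. For noisy data the result is only an approximation.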
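13. • Find coefficients R as
$$r_t = \left( L^T L + \lambda_1 I \right)^{-1} L^T \left( z_t - e_{t-1} \right)$$
where $z_t$ is the sample observed at time $t$
• Fix R, and update the basis
– use the block-coordinate descent method
– updated incrementally
• Sparse outlier estimation
– $M_t = z_t - L r_t$, thresholded per element $k$:
$$e_t^k = \begin{cases} M_t^k - \lambda_2, & \text{if } M_t^k > \lambda_2 \\ M_t^k + \lambda_2, & \text{if } M_t^k < -\lambda_2 \\ 0, & \text{otherwise} \end{cases}$$
• Background Model:
• Sparse Component:
OSTD cont…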
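The three steps (coefficients, sparse outliers, basis update) can be sketched for a single unfolding as below. This follows the general online robust PCA recipe of [Feng et al. 2013] as I understand it, not the authors' implementation. The accumulators `A` and `B`, the inner iteration count, the synthetic data stream, and all parameter values are assumptions made for illustration.

```python
import numpy as np

def soft_threshold(x, lam):
    """Element-wise shrinkage: moves every entry toward zero by lam."""
    return np.sign(x) * np.maximum(np.abs(x) - lam, 0.0)

def solve_sample(z, L, lam1, lam2, n_iter=50):
    """Alternate between the ridge solution for r and soft-thresholding for e."""
    P = np.linalg.solve(L.T @ L + lam1 * np.eye(L.shape[1]), L.T)
    e = np.zeros_like(z)
    for _ in range(n_iter):
        r = P @ (z - e)                       # coefficients given current outliers
        e = soft_threshold(z - L @ r, lam2)   # outliers given current coefficients
    return r, e

def update_basis(L, A, B, lam1):
    """One pass of block-coordinate descent over the columns of the basis."""
    A_tilde = A + lam1 * np.eye(A.shape[0])
    L = L.copy()
    for j in range(L.shape[1]):
        L[:, j] += (B[:, j] - L @ A_tilde[:, j]) / A_tilde[j, j]
    return L

rng = np.random.default_rng(0)
p, rank, lam1, lam2 = 40, 3, 0.1, 0.5
U_true = rng.standard_normal((p, rank))       # hidden subspace of the toy stream
L = rng.standard_normal((p, rank))            # random initial basis
A = np.zeros((rank, rank))
B = np.zeros((p, rank))

for t in range(100):
    z = U_true @ rng.standard_normal(rank)                 # low-rank part of the sample
    z[rng.choice(p, size=2, replace=False)] += 10.0        # two gross outliers
    r, e = solve_sample(z, L, lam1, lam2)
    last_residual = z - L @ r - e
    A += np.outer(r, r)                                    # running statistics
    B += np.outer(z - e, r)
    L = update_basis(L, A, B, lam1)

shrunk = soft_threshold(np.array([3.0, -3.0, 0.5]), 1.0)
```

Only the small matrices `A` and `B` and the basis are kept between samples, so the memory cost does not grow with the number of frames processed.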
14. • Synthetic Evaluation
– True low-rank tensor of size 30 × 30 × 30 is generated
by rank-3 factor matrices
• $Z_n \in \mathbb{R}^{30 \times 3}$, where $n = 1, 2, 3$
• random entries are corrupted
– Relative Root Square Error (RRSE) measure is computed
• Two different cases are considered
– smaller magnitude of true data
– with a higher magnitude
Experimental Evaluations
RRSE=
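The RRSE formula is missing from this transcript. The sketch below assumes the common definition, the Frobenius norm of the error divided by the Frobenius norm of the true tensor, and assumes the synthetic tensor is built CP-style from the three factor matrices. The corruption fraction and magnitude are illustrative values, not those used in the experiments.

```python
import numpy as np

rng = np.random.default_rng(0)
size, rank = 30, 3

# Three factor matrices Z_n in R^{30 x 3}; the tensor sums rank-one outer products.
Z1, Z2, Z3 = (rng.standard_normal((size, rank)) for _ in range(3))
X_true = np.einsum("ir,jr,kr->ijk", Z1, Z2, Z3)

# Corrupt a random subset of entries (fraction and range are assumed values).
mask = rng.random(X_true.shape) < 0.1
X_obs = X_true.copy()
X_obs[mask] += rng.uniform(-10.0, 10.0, size=int(mask.sum()))

def rrse(X_hat, X):
    """Assumed definition: ||X_hat - X||_F / ||X||_F over all tensor entries."""
    return np.linalg.norm(X_hat - X) / np.linalg.norm(X)

unfolding_rank = np.linalg.matrix_rank(X_true.reshape(size, -1))
rrse_corrupted = rrse(X_obs, X_true)
```

A recovery method would be scored by calling `rrse` on its low-rank estimate. Lower values indicate a reconstruction closer to the true tensor.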
16. • Multispectral Video Sequences (MSVS)
– Acquisition
• commercial camera (FD-1665-MS)
– 7 spectral narrow bands = 6 visible + 1 NIR spectral band
– 5 video sequences
• 1 indoor video sequence
• 4 outdoor scenes
• frame size: [658 × 491 × 3], with 250 to 2300 frames
• frame rate: depends on the overall scene illumination
– 5 fps for dark scenes and 15 fps for brighter ones
– Main Challenges
• gradual illumination changes, shadows, and intermittent
object motion
• camouflage (color similarity between background and objects)
Experimental Evaluations
17. • MSVS dataset
– “integration of MS bands improves the foreground segmentation”
Experimental Evaluations
[Figure panels: Video 1, Video 2, Video 3, Video 4, Video 5]
22. • Quantitative Analysis
– The F-measure score is computed for RGB and MS bands for comparison
Experimental Evaluations
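For reference, the F-measure of a binary foreground mask against ground truth can be computed as below, assuming the usual definition as the harmonic mean of precision and recall. The function name and the toy masks are illustrative only.

```python
import numpy as np

def f_measure(pred_mask, gt_mask):
    """Harmonic mean of precision and recall for binary foreground masks."""
    pred = np.asarray(pred_mask, dtype=bool)
    gt = np.asarray(gt_mask, dtype=bool)
    tp = np.logical_and(pred, gt).sum()     # foreground pixels correctly detected
    fp = np.logical_and(pred, ~gt).sum()    # background pixels marked as foreground
    fn = np.logical_and(~pred, gt).sum()    # foreground pixels that were missed
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return float(2 * precision * recall / (precision + recall))

gt = np.array([[1, 1, 0], [0, 0, 0]])
pred = np.array([[1, 0, 1], [0, 0, 0]])
score = f_measure(pred, gt)  # one hit, one false alarm, one miss
```

In the toy example precision and recall are both 0.5, so the score is 0.5.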
23. • Time Complexity
– Independent of the number of samples
• grows linearly with the image resolution
Experimental Evaluations
24. • Video demo 1
– color saturation issue
Experimental Evaluations
[Video panels: Input, Ground Truth, RGB Mask, MS Mask]
25. • Video demo 2
– color saturation issue
Experimental Evaluations
[Video panels: Input, Ground Truth, RGB Mask, MS Mask]
26. • Video demo 3
– shadows, dynamic backgrounds, intermittent object motion
Experimental Evaluations
[Video panels: Input, Ground Truth, RGB Mask, MS Mask]
27. • Online stochastic optimization framework is proposed
– tensor decomposition into low-rank and sparse tensors
• computationally attractive
• real-time processing achieved
– provides great potential for multi-spectral bands
• Limitation
– the proposed method is not stable for RGB image features
• achieves promising accuracy with the integration of MS
bands
• Future work
– disparity features will be integrated
– can be extended to visual tracking as low-rank sparse tracking
Conclusion