Presentation of paper: Cozzolino, A., Flammini, G., Galli, V., Lamberti, M., Poggi, G., Pragliola, C.: Evaluating the effects of MJPEG compression on Motion Tracking in metro railway surveillance. In: Proc. 14th Intl. Conf. on Advanced Concepts for Intelligent Vision Systems, ACIVS 2012, Sept. 4-7 2012, Brno, Czech Republic, J. Blanc-Talon et al. (Eds.), Springer LNCS 7517, pp. 142–154 (Springer-Verlag Berlin Heidelberg, Germany, ISBN 978-3-642-33139-8)
ACIVS'12: Evaluating the effects of MJPEG compression on Motion Tracking in metro railway surveillance
1. ACIVS’12
Advanced Concepts for Intelligent Vision Systems
Sept. 4-7 2012, Faculty of Information Technology, Brno University of Technology, Brno, Czech Republic
Angelo Cozzolino, Francesco Flammini, Valentina Galli, Mariangela Lamberti, Giovanni Poggi, Concetta Pragliola
Evaluating the effects of MJPEG compression on
Motion Tracking in metro railway surveillance
presented by
Francesco Flammini
Ansaldo STS – Innovation & Security Engineer
francesco.flammini@ansaldo-sts.com
2. Video Content Analytics in transit systems
• Transit systems attractive targets for:
– Thieves
– Vandals
– Terrorists
• Video surveillance essential for:
– Deterrence
– Detection
– Response
– Prosecution
• VCA supports Safety & Security Surveillance, especially
when there are:
– High number of cameras (hundreds to thousands)
– Low number of operators
• VS with VCA integrated in current PSIM (Physical Security
Information Management) systems
– Pros: superior situation awareness
– Cons: possible issues with the number of false alarms
• Frequent requests to upgrade legacy CCTV with modern
VCA systems
• VCA event detection and performance requirements in
recent tenders are increasingly demanding
Ref.: Francesco Flammini: Critical Infrastructure Security: Assessment, Prevention, Detection, Response, 2011 (WIT Press, Southampton, UK, ISBN: 978-1-84564-562-5)
ACIVS’12, Francesco Flammini
3. Performance evaluation of motion tracking
• Ground Truth (GT) generation by annotation tools
• GT includes, for each frame, the top-left and bottom-right coordinates of the so-called ‘bounding boxes’ surrounding the objects detected in the scene
• MT metrics defined in the literature measure the temporal and spatial overlap by comparing the Ground Truth with the Algorithm Result (AR) produced by the Motion Tracker, using appropriate thresholds
[Figure: a bounding box with its top-left and bottom-right corners; examples of a False Negative and a False Positive]
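The spatial part of such a comparison can be illustrated with a minimal sketch. It assumes intersection-over-union as the overlap measure and 0.5 as the threshold; the slides name neither, and the function names below are hypothetical rather than taken from the authors' tool.

```python
from typing import Tuple

# A box is (x_left, y_top, x_right, y_bottom), i.e. the top-left and
# bottom-right corners stored in the GT and AR data.
Box = Tuple[float, float, float, float]


def spatial_overlap(gt: Box, ar: Box) -> float:
    """Intersection-over-union of two boxes (assumed overlap measure)."""
    inter_w = min(gt[2], ar[2]) - max(gt[0], ar[0])
    inter_h = min(gt[3], ar[3]) - max(gt[1], ar[1])
    if inter_w <= 0 or inter_h <= 0:
        return 0.0  # the boxes do not intersect
    inter = inter_w * inter_h
    area_gt = (gt[2] - gt[0]) * (gt[3] - gt[1])
    area_ar = (ar[2] - ar[0]) * (ar[3] - ar[1])
    return inter / (area_gt + area_ar - inter)


def is_match(gt: Box, ar: Box, threshold: float = 0.5) -> bool:
    """True when the AR box overlaps the GT box by at least the threshold.

    The 0.5 default is an illustrative value, not one stated in the slides.
    """
    return spatial_overlap(gt, ar) >= threshold
```

With this convention, a GT box that no AR box matches would count towards False Negatives, and an AR box that matches no GT box towards False Positives.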
4. Evaluation method
[Diagram: evaluation workflow with the stages Video Selection, Video Compression, GT Generation, AR Generation and Metrics Computation]
• Videos have been analyzed by a Motion Tracker identical to the one
installed in the real metro railway, but without the filters used for alarm
generation
• For each compression level, the Motion Tracker has generated an
AR text file listing the detected objects, structured consistently
with the information included in the GT
5. Video selection
• Concourse – 7 objects
• Platform – simulation of object left behind
• Turnstiles – 7 objects
• Tunnel portal – train passing, IR lamp
• Video format: 4 CIF (720x576), 25 FPS, 60 s ➩ 1500 frames
6. MJPEG video compression
C = 1 (Q = 100%) C ≈ 5 (Q = 50%) C ≈ 10 (Q = 20%)
C ≈ 15 (Q = 10%) C ≈ 20 (Q = 5%) C ≈ 25 (Q = 1%)
7. Metrics computation
• For metrics evaluation, we have developed a Matlab program that automatically
computes the FN and FP metrics. The tool organizes its input data (GT and AR) in
arrays with one row per object and 5 columns:
– The list of frames in which the object is present (i.e. the track), a vector whose
length equals the number of frames of the track
– The top-left and bottom-right coordinates of the bounding boxes (4 numbers)
• The tool is being extended to compute other metrics (e.g. ‘ID change’)
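The five-column layout described above can be sketched in Python as follows. This is not the authors' Matlab code: the `Track` class, the per-frame coordinate lists, the 0.5 threshold and the matching rule (temporal overlap only, with the spatial test left out for brevity) are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass
class Track:
    """One row of the GT or AR array: a frame list plus four coordinates.

    Storing one coordinate value per frame is an assumption; the slides
    only say that the four remaining columns hold the box corners.
    """
    frames: List[int]
    x_left: List[int]
    y_top: List[int]
    x_right: List[int]
    y_bottom: List[int]


def temporal_overlap(gt: Track, ar: Track) -> float:
    """Fraction of the GT track's frames that also appear in the AR track."""
    gt_frames = set(gt.frames)
    if not gt_frames:
        return 0.0
    return len(gt_frames & set(ar.frames)) / len(gt_frames)


def count_fn_fp(gt_tracks: Sequence[Track], ar_tracks: Sequence[Track],
                threshold: float = 0.5) -> Tuple[int, int]:
    """Count GT tracks with no matching AR track (FN) and AR tracks that
    match no GT track (FP). The threshold value is illustrative."""
    matched_ar = set()
    false_negatives = 0
    for gt in gt_tracks:
        hits = [i for i, ar in enumerate(ar_tracks)
                if temporal_overlap(gt, ar) >= threshold]
        if hits:
            matched_ar.update(hits)
        else:
            false_negatives += 1
    false_positives = len(ar_tracks) - len(matched_ar)
    return false_negatives, false_positives
```

A fuller version would also require the boxes to overlap spatially on the shared frames before declaring a match.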
8. Evaluation of results
• Fluctuation of results due to the algorithm's adaptive thresholds, which depend on scene characteristics (e.g. object size, ambient light, etc.)
• The ‘filtering’ effect of compression can counterbalance the negative effect of quality degradation by reducing the number of detectable objects
[Figure: panels (a) to (d)]
9. Evaluation of trends
[Figure: panels (a) and (b)]
• As expected, tracking performance generally degrades with quality, and the impact
is much more significant at higher levels of compression, in particular when the image
quality falls below 20%, that is at compression ratios higher than 10
(corresponding to approximately 4 Mbps of bandwidth occupation)
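The bandwidth figures quoted in this deck can be cross-checked with a short calculation. The sketch below assumes, since the slides do not state it, that the compression ratio C is measured relative to the Q = 100% stream (C = 1) and that bandwidth is inversely proportional to C; the function name is hypothetical.

```python
def implied_baseline_mbps(bandwidth_mbps: float, compression_ratio: float) -> float:
    """Bandwidth of the C = 1 stream implied by one (bandwidth, C) pair,
    assuming bandwidth is inversely proportional to C."""
    return bandwidth_mbps * compression_ratio


# Approximate (bandwidth in Mbps, compression ratio) pairs quoted in the slides.
quoted = {"Q = 50%": (7.0, 5.0), "Q = 20%": (4.0, 10.0)}

for label, (mbps, ratio) in quoted.items():
    print(label, implied_baseline_mbps(mbps, ratio), "Mbps at C = 1")
```

Under these assumptions the two quoted points imply a baseline of 35 and 40 Mbps respectively, which agree about as well as the rounded "≈" values allow.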
10. Main causes of False Negatives
• Tiling and occlusions prevent the tracker from ‘hooking’ the objects in the scene, and thus from tracking their trajectory, since their IDs change frequently as if they were different objects
[Figure: examples of tiling (right) and occlusions (below)]
11. Main causes of False Positives
• Glare
• Reflections
• Camouflage
• Large artefacts
12. Relevance of FP sources w.r.t. compression
[Figure: panels (a) to (c)]
• For the Concourse, all FP causes (especially glare) increase considerably with compression, while in Platform
and Turnstiles the effect of artefacts is largely predominant over the other causes, which nevertheless
remain relevant
• Tunnel FPs are not reported: since there is no real object moving in the scene, they show up only at train
passage, due to the light change in the scene; furthermore, the absence of most chromatic components compared with
standard cameras (IR cameras only provide greyscale images) reduces the number of FP causes
that vary with compression level
13. Conclusions and future developments
• Performance degradation becomes critical when moving from a 20% down to a 1% quality level of the
compressed videos, whereas a 50% reduction in image quality represents a very
acceptable trade-off (corresponding to ≈ 7 Mbps bandwidth occupation)
• Whenever it is required to go beyond that ‘conservative’ ratio, it is
necessary to evaluate how the error sources affect the correct detection of
the objects, according to the specific features of each scene (motion density, light
sources, camera shots, type of background, etc.)
• The results achieved can provide guidelines applicable to
similar scenarios (technologies and contexts), e.g. using more efficient codecs
• Using the same evaluation method in any domain, it is possible to:
– support the design of surveillance systems by fine-tuning the video compression level
against scene characteristics or other factors, for each camera (especially useful in
distributed wireless systems)
– quantify the effect on VCA performance of other quality or noise factors like
• sensitivity, resolution, frame rate, etc.
• vibrations, electro-magnetic interference, chromatic distortions, etc.