Video copy detection using visual and semantic fingerprinting.
Presentation based on the following journal papers:
[1] Hyun-seok Min, Jaeyoung Choi, Wesley De Neve, Yong Man Ro. Near-Duplicate Video Clip Detection Using Model-Free Semantic Concept Detection and Adaptive Semantic Distance Measurement. IEEE Transactions on Circuits and Systems for Video Technology. Vol. 22(8). August 2012. pp. 1174-1187. DOI=http://dx.doi.org/10.1109/TCSVT.2012.2197080
[2] Hyun-seok Min, Jaeyoung Choi, Wesley De Neve, Yong Man Ro. Bimodal Fusion of Low-level Visual Features and High-level Semantic Features for Near-duplicate Video Clip Detection. EURASIP Signal Processing – Image Communication. Vol. 26(10). November 2011. pp. 612-627. DOI=http://dx.doi.org/10.1016/j.image.2011.04.001
1. ELIS – Multimedia Lab
Video Copy Detection using
Visual and Semantic Fingerprinting
Presentations aOG MIT
November 17, 2011
Wesley De Neve
Multimedia Lab
Dept. of Electronics & Information Systems
Faculty of Engineering & Architecture
Ghent University – IBBT
Ghent, Belgium
Image and Video Systems Lab
Dept. of Electrical Engineering
College of Information Science & Technology
KAIST
Daejeon, South Korea
2. ELIS – Multimedia Lab
Context of the Research Effort
• Worked in South Korea during the past four years
– at ICU and KAIST in Daejeon
– main focus on advising graduate students
• keeping track of the state-of-the-art
• identifying and solving research questions
• helping to communicate research results
– main research topics
• data-driven image annotation and tag relevance learning
• face recognition using online social network context
• video surveillance and privacy protection
• video copy detection
3. ELIS – Multimedia Lab
Outline
• Introduction
• Video copy detection
– using visual features
– using semantic features
• Experimental results
• Conclusions
5. ELIS – Multimedia Lab
Introduction (1/3)
• Increasing consumption of online video content
– thanks to easy-to-use multimedia devices and online services
– thanks to cheap storage and bandwidth
– thanks to an increasing number of people going online
• Increasing availability of online video content
– digitization of professional video archives
– user-generated video content
6. ELIS – Multimedia Lab
Introduction (2/3)
• Some statistics
– professional video content
• BBC Motion Gallery (as of January 2009)
o contains over 2.5 million hours of video content
o dating back 60 years in time
– user-generated video content
• YouTube (as of May 2011)
o 48 hours of new video content are uploaded each minute
7. ELIS – Multimedia Lab
Introduction (3/3)
• Problem: digital video overload
– our ability to automatically manage video clips does not keep up with
our ability to create and store video clips
– this makes it increasingly difficult, for example, to find video clips of interest
• Part of the solution: techniques for video copy detection
– help in managing vast libraries of video clips
• reduction of visual redundancy in video search results
• detection of copyright infringement
• metadata propagation along visual links
• media usage monitoring
• search by video query
8. ELIS – Multimedia Lab
Duplicates versus Near-Duplicates
• Duplicate video clips
– exact video copies
– can be easily detected using hashing
• Near-duplicate video clips (NDVCs)
– transformed video clips
– detection is challenging
[Figure: an original video clip and three transformed versions: black & white, cropping, flipping]
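The contrast between duplicates and near-duplicates can be sketched in a few lines. The example below is a minimal illustration, assuming clips are available as raw byte strings; the file names and the choice of SHA-256 are hypothetical and not taken from the slides. It shows that a cryptographic hash groups exact copies but treats a clip that differs in a single byte (as any re-encoded or transformed clip would) as unrelated.

```python
import hashlib


def file_digest(data: bytes) -> str:
    """Return the SHA-256 digest of a clip's raw bytes."""
    return hashlib.sha256(data).hexdigest()


def find_exact_duplicates(clips: dict) -> dict:
    """Group clip names by digest and keep only groups with more than one clip."""
    by_digest = {}
    for name, data in clips.items():
        by_digest.setdefault(file_digest(data), []).append(name)
    return {d: names for d, names in by_digest.items() if len(names) > 1}


# Hypothetical data: two byte-identical clips and one that differs in its last byte.
original = bytes(range(256)) * 4
clips = {
    "a.mp4": original,
    "b.mp4": original,
    "c_recompressed.mp4": original[:-1] + b"\x00",
}
groups = find_exact_duplicates(clips)
print(list(groups.values()))  # prints [['a.mp4', 'b.mp4']]
```

The third clip is missed because hashing offers no notion of similarity, which is why near-duplicate detection relies on video signatures instead.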
9. ELIS – Multimedia Lab
Applications: Reduction of Visual Redundancy (1/2)
[Figure: two examples labeled "visual redundancy"]
10. ELIS – Multimedia Lab
Applications: Detection of Copyright Infringement (2/2)
• Missed by YouTube’s Content ID
• Transformations used
o scaling
o recompression
11. ELIS – Multimedia Lab
Outline
• Introduction
• Video copy detection
– using visual features
– using semantic features
• Experimental results
• Conclusions
12. ELIS – Multimedia Lab
System for Video Copy Detection: Conceptual Design
[Figure: a query video clip is matched against a collection of reference video clips; video matching is realized by means of video signatures; result: copy found (query video clip ≈ original video clip)]
13. ELIS – Multimedia Lab
Video Signatures
• Aim at uniquely characterizing a video clip
• Commonly consist of visual features
– e.g., color, texture, shape, and motion
• Are low-dimensional representations
– in order to facilitate more efficient matching
[Figure: dimensionality reduction from 921600-D (1280x720) to 128-D (128 bins)]
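The dimensionality reduction in the figure can be illustrated with a simple histogram. The sketch below assumes a single-channel frame with 256 intensity levels and a plain intensity histogram as the 128-bin feature; the slides do not say which visual feature yields the 128 bins, so this is one possible reading, and the synthetic frame is purely illustrative.

```python
WIDTH, HEIGHT = 1280, 720  # frame size from the slide: 1280 * 720 = 921600 values


def histogram_signature(pixels, bins=128, levels=256):
    """Map a flat list of intensities in [0, levels) to a normalized histogram."""
    if not pixels:
        raise ValueError("frame has no pixels")
    counts = [0] * bins
    for value in pixels:
        # Integer arithmetic assigns each intensity to one of `bins` equal ranges.
        counts[value * bins // levels] += 1
    return [c / len(pixels) for c in counts]


# Synthetic grayscale frame (hypothetical content, used only for its size).
frame = [(x + y) % 256 for y in range(HEIGHT) for x in range(WIDTH)]
signature = histogram_signature(frame)
print(len(frame), "->", len(signature))  # prints: 921600 -> 128
```

Comparing two 128-element signatures is far cheaper than comparing two full frames, which is the efficiency argument made on this slide.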
14. ELIS – Multimedia Lab
Room for Improvement
• Observations
– no single type of visual feature has thus far emerged that is robust
against all possible transformations
– transformations tend to preserve semantic features
[Figure: semantic features / textual descriptions of a frame: helmet, face, wall, clothes]
• Research question
– how about (additionally) making use of semantic features?
15. ELIS – Multimedia Lab
Outline
• Introduction
• Video copy detection
– using visual features
– using semantic features
• Experimental results
• Conclusions
16. ELIS – Multimedia Lab
Extraction of Semantic Features (1/2)
• Question
– how to extract semantic features?
[Figure: frame → ? → semantic features (helmet, face, wall, clothes)]
• Our answer
– by means of binary concept classifiers
17. ELIS – Multimedia Lab
Extraction of Semantic Features (2/2)
• Example of a binary classifier for ‘apple’
[Figure: an apple classifier outputs either ‘apple’ or ‘not apple’ for an input image]
• Concept classifiers
– pieces of logic that know what, e.g., an “apple” image looks like
– more formally: pieces of logic that know the statistical distribution of
the visual features of, e.g., representative “apple” images
18. ELIS – Multimedia Lab
Challenges of Concept Classification (1/2)
• Limited effectiveness
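– false negatives (due to intra-concept variability)
[Figure: the apple classifier outputs ‘not apple’ for an image that shows an apple]
– false positives (due to inter-concept variability)
[Figure: the apple classifier outputs ‘apple’ for an image that does not show an apple]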
19. ELIS – Multimedia Lab
Challenges of Concept Classification (2/2)
• Limited semantic coverage
– only a limited number of concept classifiers can be supported
• due to the high cost of training
• experts need to collect training images for each concept classifier
[Figure: training images for ‘apple’, ‘orange’, and ‘strawberry’]
20. ELIS – Multimedia Lab
Classifier-Based Semantic Feature Extraction
for Video Copy Detection
• The challenges of concept classification affect a semantic approach
towards the task of video copy detection
• How to deal with this?
– limited effectiveness of concept classifiers
• use of semantic features that can be easily and reliably detected
o e.g., ‘people’
– limited semantic coverage of concept classifiers
• use of semantic features that are general in nature
o e.g., ‘people’ versus ‘Barack Obama’
• use of the temporal variation of the semantic features
o extraction of semantic features at the level of video shots
21. ELIS – Multimedia Lab
Outline
• Introduction
• Video copy detection
– using visual features
– using semantic features
• Experimental results
• Conclusions
22. ELIS – Multimedia Lab
Reference and Query Video Clips
• 311 reference video clips with a total duration of 170 h
– 101 video clips from MUSCLE-VCD-2007
• total duration: 80 h
– 210 video clips from TRECVID 2008
• total duration: 90 h
• 500 query video clips
– the result of five transformations applied to 100 video clips randomly
selected from the 311 reference video clips
23. ELIS – Multimedia Lab
Transformations Applied
[Figure: an original frame and the five transformations: blur, pattern insertion, caption insertion, change in brightness, crop]
24. ELIS – Multimedia Lab
Semantic Features
• Use of Support Vector Machines (SVM)
– binary classifiers with state-of-the-art effectiveness
• 32 semantic concepts used
– mean average precision (MAP): 0.51
– ‘gravel’, ‘park’, ‘pavement’, ‘road’, ‘rock’, ‘sand’, ‘sidewalk’, ‘face’,
‘people’, ‘indoor’, ‘field’, ‘peak’, ‘wood’, ‘night’, ‘street’, ‘flowers’,
‘leaves’, ‘trees’, ‘cloudy’, ‘sunny’, ‘sunset’, ‘brick’, ‘arch’, ‘buildings’,
‘wall’, ‘windows’, ‘beach’, ‘high-wave’, ‘low-wave’, ‘still water’,
‘mirrored water’, and ‘snow’
25. ELIS – Multimedia Lab
Video Matching
[Figure: sliding window at position 1 on the reference video clip; d1 is its distance to the query video clip]
26. ELIS – Multimedia Lab
Video Matching
[Figure: sliding window at position 2 on the reference video clip; d2 is its distance to the query video clip]
27. ELIS – Multimedia Lab
Video Matching
[Figure: sliding window at position 3 on the reference video clip; d3 is its distance to the query video clip]
28. ELIS – Multimedia Lab
Video Matching
[Figure: sliding window at position 4 on the reference video clip; d4 is its distance to the query video clip]
29. ELIS – Multimedia Lab
Video Matching
[Figure: sliding window at position 5 on the reference video clip; d5 is its distance to the query video clip]
• d_i: linearly weighted combination of the Manhattan distances between the
– visual features of the query video clip and the part of the reference
video clip in the sliding window
– semantic features of the query video clip and the part of the reference
video clip in the sliding window
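The sliding-window matching described above can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: each shot is represented as a hypothetical (visual vector, semantic vector) pair, the weight `alpha` and its default of 0.5 are made up, and the feature values are toy numbers. The reference clip has six shots and the query has two, which yields five window positions, mirroring d1 to d5 on the slides.

```python
def manhattan(a, b):
    """Manhattan (L1) distance between two equal-length vectors."""
    return sum(abs(x - y) for x, y in zip(a, b))


def window_distance(query, window, alpha):
    """Linearly weighted sum of the visual and semantic L1 distances."""
    visual = sum(manhattan(q[0], w[0]) for q, w in zip(query, window))
    semantic = sum(manhattan(q[1], w[1]) for q, w in zip(query, window))
    return alpha * visual + (1.0 - alpha) * semantic


def best_match(query, reference, alpha=0.5):
    """Slide the query over the reference; return the best offset and all distances."""
    n = len(query)
    if n == 0 or n > len(reference):
        raise ValueError("query must be non-empty and no longer than the reference")
    distances = [
        window_distance(query, reference[i:i + n], alpha)
        for i in range(len(reference) - n + 1)
    ]
    best = min(range(len(distances)), key=distances.__getitem__)
    return best, distances


# Toy shot-level features: (visual vector, semantic vector) per shot.
reference = [
    ([0.1, 0.9], [1, 0, 0]),
    ([0.8, 0.2], [0, 1, 0]),
    ([0.5, 0.5], [1, 1, 0]),
    ([0.3, 0.7], [0, 0, 1]),
    ([0.9, 0.1], [1, 0, 1]),
    ([0.2, 0.8], [0, 1, 1]),
]
# Query: shots 2 and 3 of the reference with slightly perturbed visual features.
query = [([0.45, 0.55], [1, 1, 0]), ([0.35, 0.65], [0, 0, 1])]

offset, distances = best_match(query, reference)
print(offset, len(distances))  # prints: 2 5
```

In this toy case the semantic vectors of the query are unchanged by the perturbation, so the semantic term contributes zero at the correct offset, which is the intuition behind adding semantic features to the distance.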
30. ELIS – Multimedia Lab
Normalized Detection Cost Ratio (NDCR)
• Definition
NDCR = Pmiss + β * RFA
where
Pmiss = NFN / Nqueries (missed detection probability)
RFA = NFP / (Tquery * Trefdata) (false alarm rate, per hour)
• We set β to a value of 2 (“balanced profile”)
– see “CBCD Evaluation Plan TRECVID 2010 v3”
– assigns a higher cost to raising false alarms
• A value of zero indicates perfect detection performance
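As a worked example of the definition above, the sketch below computes NDCR directly from the formula on this slide. The counts of missed detections and false alarms and the 10 hours of query material are hypothetical; only the 500 queries, the 170 hours of reference data and β = 2 come from the slides.

```python
def ndcr(n_false_negatives, n_queries, n_false_positives,
         t_query_hours, t_ref_hours, beta=2.0):
    """NDCR = Pmiss + beta * RFA, following the definition on the slide."""
    p_miss = n_false_negatives / n_queries
    r_fa = n_false_positives / (t_query_hours * t_ref_hours)
    return p_miss + beta * r_fa


# Hypothetical outcome: 25 of 500 queries missed, 17 false alarms,
# 10 h of query video matched against 170 h of reference video.
score = ndcr(n_false_negatives=25, n_queries=500, n_false_positives=17,
             t_query_hours=10, t_ref_hours=170)
print(round(score, 4))  # prints 0.07

# Perfect detection: no misses and no false alarms.
perfect = ndcr(0, 500, 0, 10, 170)
print(perfect)  # prints 0.0
```

Here Pmiss = 25 / 500 = 0.05 and RFA = 17 / (10 * 170) = 0.01, so NDCR = 0.05 + 2 * 0.01 = 0.07.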
31. ELIS – Multimedia Lab
Semantic Concept Models Used
• AP (Average Precision)
– precision: #true positives / (#true positives + #false positives)
– averaged over 100 query video clips
• MAP (Mean AP) of the 32 semantic concept models: 0.52
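The AP and MAP figures can be reproduced in form (not in value) with the sketch below. It follows the literal wording of this slide: the ratio #true positives / (#true positives + #false positives) is computed per query, averaged over the queries to give the AP of one concept model, and then averaged over the concept models to give the MAP. The ranked-retrieval definition of average precision that is common elsewhere differs from this reading, and all counts below are hypothetical.

```python
def precision(true_positives, false_positives):
    """#true positives / (#true positives + #false positives); 0.0 if nothing was detected."""
    detected = true_positives + false_positives
    return true_positives / detected if detected else 0.0


def average_precision(per_query_counts):
    """Average the per-query precision of one concept model over all queries."""
    return sum(precision(tp, fp) for tp, fp in per_query_counts) / len(per_query_counts)


def mean_average_precision(per_concept_counts):
    """Average the AP values over all concept models."""
    aps = [average_precision(counts) for counts in per_concept_counts.values()]
    return sum(aps) / len(aps)


# Hypothetical (true positives, false positives) per query for two concept models.
counts = {
    "face": [(3, 1), (1, 1)],  # precision 0.75 and 0.5, AP 0.625
    "road": [(1, 3), (2, 2)],  # precision 0.25 and 0.5, AP 0.375
}
map_value = mean_average_precision(counts)
print(map_value)  # prints 0.5
```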
32. ELIS – Multimedia Lab
Effectiveness of Bimodal Fusion
• Bimodal fusion of visual and semantic features outperforms the
separate use of either type of features for all transformations
33. ELIS – Multimedia Lab
Comparison of Effectiveness of Video Copy Detection
• In general, bimodal fusion of visual and semantic features
outperforms the other techniques for video copy detection
34. ELIS – Multimedia Lab
Robustness Against Variation in Semantic Coverage
• The more concepts used, the more effective NDVC detection
35. ELIS – Multimedia Lab
Influence of Effectiveness of Semantic Concept Detection
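[Figure: effectiveness of NDVC detection versus MAP of the concept detectors; the values 0.925 and 0.711 are marked in the plot]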
• The effectiveness of NDVC detection starts to stabilize once
the MAP of the concept detectors is higher than 0.3
36. ELIS – Multimedia Lab
Time Complexity of Creating Video Signatures
• Measurements expressed in seconds
– include the time to perform
o shot segmentation
o keyframe selection
o feature extraction
37. ELIS – Multimedia Lab
Time Complexity of Matching
• Measurements expressed in seconds
– include the time to
o compute the temporal entropy for the proposed method
o perform matching using a sliding window approach
38. ELIS – Multimedia Lab
Storage Complexity
• Measurements expressed in Mbytes
– storing the 32 semantic features requires 4 bytes per shot
– storing the MPEG-7 visual features requires about 0.4 kbytes per shot
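The figure of 4 bytes per shot for 32 semantic features is consistent with storing one bit per concept. The slides do not describe the actual encoding, so the sketch below is an assumption: it packs 32 binary concept decisions into a single little-endian 32-bit word, with a hypothetical bit order in which concept i occupies bit i.

```python
import struct


def pack_concepts(flags):
    """Pack 32 binary concept decisions into 4 bytes (one bit per concept)."""
    if len(flags) != 32:
        raise ValueError("expected exactly 32 concept flags")
    word = 0
    for i, present in enumerate(flags):
        if present:
            word |= 1 << i
    return struct.pack("<I", word)


def unpack_concepts(blob):
    """Recover the 32 binary concept decisions from a 4-byte signature."""
    (word,) = struct.unpack("<I", blob)
    return [bool((word >> i) & 1) for i in range(32)]


# Hypothetical shot in which concepts 0, 7 and 31 were detected.
flags = [i in (0, 7, 31) for i in range(32)]
blob = pack_concepts(flags)
print(len(blob))  # prints 4
```

Under this assumption, the semantic part of a signature is about 100 times smaller than the roughly 0.4 kbytes per shot reported for the MPEG-7 visual features.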
39. ELIS – Multimedia Lab
Outline
• Introduction
• Video copy detection
– using visual features
– using semantic features
• Experimental results
• Conclusions
40. ELIS – Multimedia Lab
Conclusions (1/3)
• Discussed the novel idea of using semantic features for the
purpose of video copy detection
– given the observation that no single type of visual feature exists that
is robust against all possible transformations
– given the observation that transformations tend to preserve
semantic information
– (given the observation that the semantic features extracted can be
reused for annotation purposes)
• Experimental results
– fusion of visual and semantic features outperforms
• the separate use of either type of features
• temporal ordinal measurement, PCA-SIFT, and BoVW
41. ELIS – Multimedia Lab
Conclusions (2/3)
• Current and future extensions
– use of the temporal variation of concept confidence values
• studied by the National University of Singapore
– classifier-free semantic feature extraction
• takes advantage of collective knowledge available on Flickr
o unrestricted semantic concept vocabulary (higher coverage)
• accepted for publication in IEEE Trans. on CSVT
– improved semantic distance measurement
– indexing of semantic features
42. ELIS – Multimedia Lab
Conclusions (3/3)
• Publications of interest
– “Bimodal fusion of low-level visual features and high-level semantic
features for near-duplicate video clip detection”
o published in Signal Processing – Image Communication
– “Near-Duplicate Video Clip Detection Using Model-Free Semantic
Concept Detection and Adaptive Semantic Distance Measurement”
o published in IEEE Trans. on Circuits and Systems for Video Technology