ELIS – Multimedia Lab



                Video Copy Detection using
            Visual and Semantic Fingerprinting

                           Presentation at MIT
                                   November 17, 2011


                                   Wesley De Neve

            Multimedia Lab                        Image and Video Systems Lab

Dept. of Electronics & Information Systems          Dept. of Electrical Engineering
  Faculty of Engineering & Architecture      College of Information Science & Technology
          Ghent University – IBBT                               KAIST
              Ghent, Belgium                             Daejeon, South Korea

                   Context of the Research Effort

• Worked in South Korea during the past four years
   – at ICU and KAIST in Daejeon

   – main focus on advising graduate students
      • keeping track of the state-of-the-art
      • identifying and solving research questions
      • helping to communicate research results

   – main research topics
      • data-driven image annotation and tag relevance learning
      • face recognition using online social network context
      • video surveillance and privacy protection
      • video copy detection

                                     Outline



• Introduction
• Video copy detection
   – using visual features
   – using semantic features
• Experimental results
• Conclusions





                           Introduction (1/3)

• Increasing consumption of online video content
   – thanks to easy-to-use multimedia devices and online services
   – thanks to cheap storage and bandwidth
   – thanks to an increasing number of people going online

• Increasing availability of online video content
   – digitization of professional video archives
   – user-generated video content





                          Introduction (2/3)


• Some statistics
   – professional video content
      • BBC Motion Gallery (as of January 2009)
           o contains over 2.5 million hours of video content
           o dating back 60 years

   – user-generated video content
      • YouTube (as of May 2011)
           o 48 hours of new video content are uploaded each minute





                          Introduction (3/3)

• Problem: digital video overload
   – our ability to automatically manage video clips does not keep up with
     our ability to create and store video clips
   – makes it, e.g., more and more difficult to find video clips of interest


• Part of the solution: techniques for video copy detection
   – help in managing vast libraries of video clips
      • reduction of visual redundancy in video search results
      • detection of copyright infringement
      • metadata propagation along visual links
      • media usage monitoring
      • search by video query


                      Duplicates versus Near-Duplicates

• Duplicate video clips
      – exact video copies
      – can be easily detected using hashing

• Near-duplicate video clips (NDVCs)
      – transformed video clips
      – detection is challenging
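
A minimal sketch of why exact duplicates are easy to detect with hashing while near-duplicates are not. The clip names and byte strings are hypothetical stand-ins for real video files, and SHA-256 is only one possible choice of hash function.

```python
import hashlib
from collections import defaultdict


def content_digest(data):
    """Return the SHA-256 digest of a clip's raw bytes."""
    return hashlib.sha256(data).hexdigest()


def group_exact_duplicates(clips):
    """Group clip names whose byte content is identical."""
    by_digest = defaultdict(list)
    for name, data in clips.items():
        by_digest[content_digest(data)].append(name)
    # Only digests shared by more than one clip indicate duplicates.
    return [sorted(names) for names in by_digest.values() if len(names) > 1]


# Hypothetical in-memory "clips"; real clips would be read from files.
clips = {
    "a.mp4": b"\x00\x01\x02\x03",
    "b.mp4": b"\x00\x01\x02\x03",  # exact copy of a.mp4
    "c.mp4": b"\x00\x01\x02\x04",  # a single byte differs
}
duplicate_groups = group_exact_duplicates(clips)
```

A transformed copy (here simulated by a single changed byte) produces a completely different digest, which is why near-duplicate detection needs content-based signatures instead.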




[Figure: an original video clip and three transformed versions (black & white, cropping, flipping)]

Applications: Reduction of Visual Redundancy (1/2)




[Figure: examples of visual redundancy]





Applications: Detection of Copyright Infringement (2/2)




• Missed by YouTube’s ContentID

• Transformations used
   o scaling
   o recompression





   System for Video Copy Detection: Conceptual Design

[Figure: a query video clip is matched against a collection of reference video clips; video matching is realized by means of video signatures. A copy is found when the query video clip ≈ an original video clip in the collection.]


                            Video Signatures

• Aim at uniquely characterizing a video clip

• Commonly consist of visual features
   – e.g., color, texture, shape, and motion

• Are low-dimensional representations
   – in order to facilitate more efficient matching



[Figure: dimensionality reduction of a video frame, from 921600-D (1280x720) to 128-D (128 bins)]
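
The reduction from a 921600-D frame to a 128-D signature can be sketched as follows. The slide does not state which feature fills the 128 bins, so the grayscale intensity histogram, the function name, and the synthetic frame below are all assumptions made for illustration.

```python
def intensity_histogram(pixels, bins=128, levels=256):
    """Map a flat sequence of 8-bit intensities to a normalized histogram."""
    counts = [0] * bins
    for value in pixels:
        # Two adjacent intensity levels share one bin when bins=128.
        counts[value * bins // levels] += 1
    total = len(pixels) or 1
    return [count / total for count in counts]


# Synthetic 1280x720 grayscale frame (921600 values), for illustration only.
frame = [(x + y) % 256 for y in range(720) for x in range(1280)]
signature = intensity_histogram(frame)
```

The signature has a fixed length regardless of the frame resolution, which is what makes matching against a large reference collection more efficient.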


                     Room for Improvement
• Observations
   – no single type of visual feature has thus far emerged that is robust
     against all possible transformations
   – transformations tend to preserve semantic features




[Figure: a video frame with its semantic features / textual descriptions: helmet, face, wall, clothes]


• Research question
   – how about (additionally) making use of semantic features?


          Extraction of Semantic Features (1/2)

• Question
   – how to extract semantic features?

[Figure: a video frame mapped (?) to the semantic features helmet, face, wall, clothes]

• Our answer

   – by means of binary concept classifiers


          Extraction of Semantic Features (2/2)

• Example of a binary classifier for ‘apple’

[Figure: an apple classifier maps one input image to ‘apple’ and another input image to ‘not apple’]



• Concept classifiers
   – pieces of logic that know what, e.g., an “apple” image looks like
   – more formally: pieces of logic that know the statistical distribution of
     the visual features of, e.g., representative “apple” images


         Challenges of Concept Classification (1/2)

• Limited effectiveness

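   – false negatives (due to intra-concept variability)

[Figure: an apple classifier outputs ‘not apple’ for an image of an apple]

   – false positives (due to inter-concept similarity)

[Figure: an apple classifier outputs ‘apple’ for an image that does not show an apple]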





           Challenges of Concept Classification (2/2)

• Limited semantic coverage
   – only a limited number of concept classifiers can be supported
      • due to the high cost of training
      • experts need to collect training images for each concept classifier




[Figure: training images for ‘apple’, ‘orange’, and ‘strawberry’]



     Classifier-Based Semantic Feature Extraction
                for Video Copy Detection
• The challenges of concept classification also affect a semantic
  approach to the task of video copy detection

• How to deal with this?
   – limited effectiveness of concept classifiers
       • use of semantic features that can be easily and reliably detected
           o e.g., ‘people’

   – limited semantic coverage of concept classifiers
       • use of semantic features that are general in nature
           o e.g., ‘people’ versus ‘Barack Obama’
       • use of the temporal variation of the semantic features
           o extraction of semantic features at the level of video shots

             Reference and Query Video Clips


• 311 reference video clips with a total duration of 170 h
   – 101 video clips from MUSCLE-VCD-2007
      • total duration: 80 h
   – 210 video clips from TRECVID 2008
      • total duration: 90 h


• 500 query video clips
   – the result of five transformations applied to 100 video clips randomly
     selected from the 311 reference video clips




               Transformations Applied




[Figure: an original video frame and its five transformed versions: blur, pattern insertion, caption insertion, change in brightness, crop]


                           Semantic Features


• Use of Support Vector Machines (SVM)
   – binary classifiers with state-of-the-art effectiveness


• 32 semantic concepts used
   – mean average precision (MAP): 0.51
   – ‘gravel’, ‘park’, ‘pavement’, ‘road’, ‘rock’, ‘sand’, ‘sidewalk’, ‘face’,
     ‘people’, ‘indoor’, ‘field’, ‘peak’, ‘wood’, ‘night’, ‘street’, ‘flowers’,
     ‘leaves’, ‘trees’, ‘cloudy’, ‘sunny’, ‘sunset’, ‘brick’, ‘arch’, ‘buildings’,
     ‘wall’, ‘windows’, ‘beach’, ‘high-wave’, ‘low-wave’, ‘still water’,
     ‘mirrored water’, and ‘snow’




                     Video Matching

[Figure, shown in five steps: a sliding window moves over the reference video clip; at each window position i, a distance d_i (d1 to d5) is computed between the query video clip and the part of the reference video clip in the window]

 • d_i: linearly weighted combination of the Manhattan distances between the
    – visual features of the query video clip and those of the part of the
         reference video clip in the sliding window
    – semantic features of the query video clip and those of the part of the
         reference video clip in the sliding window
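
The sliding-window matching described above can be sketched as follows. The per-shot dictionary layout, the weight value of 0.5, the summation over shots, and all function names are assumptions made for illustration; the slides only state that each distance is a linearly weighted combination of Manhattan distances.

```python
def manhattan(a, b):
    """Manhattan (L1) distance between two equal-length vectors."""
    return sum(abs(x - y) for x, y in zip(a, b))


def window_distance(query, window, weight):
    """Linearly weighted combination of visual and semantic L1 distances."""
    visual = sum(manhattan(q["visual"], r["visual"]) for q, r in zip(query, window))
    semantic = sum(manhattan(q["semantic"], r["semantic"]) for q, r in zip(query, window))
    return weight * visual + (1.0 - weight) * semantic


def sliding_window_match(query, reference, weight=0.5):
    """Return (best_offset, distances) over all window positions."""
    n = len(query)
    if n == 0 or n > len(reference):
        raise ValueError("query must be non-empty and no longer than the reference")
    distances = [
        window_distance(query, reference[i:i + n], weight)
        for i in range(len(reference) - n + 1)
    ]
    best_offset = min(range(len(distances)), key=distances.__getitem__)
    return best_offset, distances


# Hypothetical per-shot signatures: a short visual vector and binary concepts.
reference = [
    {"visual": [0.9, 0.1], "semantic": [1, 0, 0]},
    {"visual": [0.2, 0.8], "semantic": [0, 1, 0]},
    {"visual": [0.5, 0.5], "semantic": [0, 1, 1]},
    {"visual": [0.1, 0.9], "semantic": [1, 1, 0]},
]
# Query: shots 1 and 2 of the reference with slightly perturbed visual features.
query = [
    {"visual": [0.25, 0.75], "semantic": [0, 1, 0]},
    {"visual": [0.45, 0.55], "semantic": [0, 1, 1]},
]
best_offset, distances = sliding_window_match(query, reference)
```

In this toy example the smallest distance is found at window offset 1, where the semantic features match exactly and only the visual features differ slightly. In practice, a threshold on the smallest distance would decide whether a copy is reported.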

        Normalized Detection Cost Ratio (NDCR)

• Definition
                            NDCR = P_miss + β * R_FA
       where
                P_miss = N_FN / N_queries                  (missed detection probability)

                R_FA   = N_FP / (T_query * T_refdata)      (false alarm rate, per hour)



• We set β to a value of 2 (“balanced profile”)
   – see “CBCD Evaluation Plan TRECVID 2010 v3”
   – assigns a higher cost to raising false alarms


• A value of zero indicates perfect detection performance
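
The NDCR definition above translates directly into code. This is a minimal sketch: the function and parameter names are chosen for illustration, and the counts and the query duration in the example are hypothetical values, not results from the experiments.

```python
def ndcr(n_fn, n_queries, n_fp, t_query, t_refdata, beta=2.0):
    """Normalized Detection Cost Ratio: P_miss + beta * R_FA."""
    p_miss = n_fn / n_queries            # missed detection probability
    r_fa = n_fp / (t_query * t_refdata)  # false alarm rate
    return p_miss + beta * r_fa


# Hypothetical counts: 25 missed copies out of 500 queries and 17 false
# alarms, with an assumed 1 h of query video against 170 h of reference data.
example_cost = ndcr(n_fn=25, n_queries=500, n_fp=17, t_query=1.0, t_refdata=170.0)

# No misses and no false alarms yields a cost of zero.
perfect_cost = ndcr(n_fn=0, n_queries=500, n_fp=0, t_query=1.0, t_refdata=170.0)
```

With β = 2, each unit of false alarm rate adds twice as much to the cost as a unit of miss probability.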

               Semantic Concept Models Used




• AP (Average Precision)
   – precision: #true positives / (#true positives + #false positives)
   – averaged over 100 query video clips
• MAP (Mean AP) of the 32 semantic concept models: 0.52
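
The slide does not spell out how AP was computed. The sketch below assumes the common ranked-retrieval definition, in which AP is the mean of the precision values measured at the rank of each relevant result, and MAP is the mean of the AP values. The relevance lists are hypothetical.

```python
def average_precision(ranked_relevance):
    """AP of a ranked result list given as booleans (True = relevant)."""
    hits = 0
    precisions = []
    for rank, relevant in enumerate(ranked_relevance, start=1):
        if relevant:
            hits += 1
            # Precision at this rank: true positives / results seen so far.
            precisions.append(hits / rank)
    return sum(precisions) / len(precisions) if precisions else 0.0


def mean_average_precision(ranked_lists):
    """MAP: the mean of the AP values of several ranked lists."""
    return sum(average_precision(r) for r in ranked_lists) / len(ranked_lists)


# Hypothetical ranked results for two concept models.
ap_first = average_precision([True, False, True])
ap_second = average_precision([False, True])
map_score = mean_average_precision([[True, False, True], [False, True]])
```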

             Effectiveness of Bimodal Fusion




• Bimodal fusion of visual and semantic features outperforms the
  separate use of either type of features for all transformations

Comparison of Effectiveness of Video Copy Detection




• In general, bimodal fusion of visual and semantic features
  outperforms the other techniques for video copy detection

 Robustness Against Variation in Semantic Coverage




• The more concepts are used, the more effective NDVC detection becomes

Influence of Effectiveness of Semantic Concept Detection



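[Figure: effectiveness of NDVC detection versus the MAP of the concept detectors; annotated values: 0.925 and 0.711]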




 • The effectiveness of NDVC detection starts to stabilize once
   the MAP of the concept detectors is higher than 0.3

    Time Complexity of Creating Video Signatures




• Measurements expressed in seconds
   – include the time to perform
       o shot segmentation
       o keyframe selection
       o feature extraction

                Time Complexity of Matching




• Measurements expressed in seconds
   – include the time to
       o compute the temporal entropy for the proposed method
       o perform matching using a sliding window approach


                        Storage Complexity




• Measurements expressed in Mbytes
   – storing the 32 semantic features requires 4 bytes per shot
   – storing the MPEG-7 visual features requires about 0.4 kbytes per shot
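
The figure of 4 bytes per shot is consistent with storing one bit per concept (32 concepts = 32 bits). The sketch below illustrates that encoding; the one-bit-per-concept layout, the concept indices, and the function names are assumptions, not details confirmed by the slides.

```python
import struct


def pack_concepts(flags):
    """Pack 32 binary concept decisions into 4 bytes (one bit per concept)."""
    if len(flags) != 32:
        raise ValueError("expected exactly 32 concept flags")
    bits = 0
    for index, present in enumerate(flags):
        if present:
            bits |= 1 << index
    return struct.pack("<I", bits)  # unsigned 32-bit integer, little-endian


def unpack_concepts(blob):
    """Recover the 32 binary concept decisions from 4 bytes."""
    (bits,) = struct.unpack("<I", blob)
    return [bool((bits >> index) & 1) for index in range(32)]


def hamming_distance(a, b):
    """Number of differing concept bits between two packed signatures."""
    (x,) = struct.unpack("<I", a)
    (y,) = struct.unpack("<I", b)
    return bin(x ^ y).count("1")


# Hypothetical shots: indices 7 and 8 stand for two detected concepts.
flags = [False] * 32
flags[7] = True
flags[8] = True
other = list(flags)
other[8] = False
other[31] = True

packed = pack_concepts(flags)
packed_other = pack_concepts(other)
distance = hamming_distance(packed, packed_other)
```

For binary feature vectors, the Manhattan distance equals the number of differing bits, so the packed form can be compared without unpacking.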



                           Conclusions (1/3)

• Discussed the novel idea of using semantic features for the
  purpose of video copy detection
   – given the observation that no single type of visual feature exists that
     is robust against all possible transformations
   – given the observation that transformations tend to preserve
     semantic information
   – (given the observation that the semantic features extracted can be
     reused for annotation purposes)


• Experimental results
   – fusion of visual and semantic features outperforms
       • the separate use of either type of features
       • temporal ordinal measurement, PCA-SIFT, and BoVW

                           Conclusions (2/3)
• Current and future extensions

   – use of the temporal variation of concept confidence values
      • studied by the National University of Singapore

   – classifier-free semantic feature extraction
       • takes advantage of collective knowledge available on Flickr
           o unrestricted semantic concept vocabulary (higher coverage)
       • accepted for publication in IEEE Trans. on CSVT

   – improved semantic distance measurement

   – indexing of semantic features

                            Conclusions (3/3)


• Publications of interest

   – “Bimodal fusion of low-level visual features and high-level semantic
     features for near-duplicate video clip detection”
       o published in Signal Processing – Image Communication


   – “Near-Duplicate Video Clip Detection Using Model-Free Semantic
     Concept Detection and Adaptive Semantic Distance Measurement”
       o published in IEEE Trans. on Circuits and Systems for Video Technology




Towards using multimedia technology for biological data processing
 
Multimedia Lab @ Ghent University - iMinds - Organizational Overview & Outlin...
Multimedia Lab @ Ghent University - iMinds - Organizational Overview & Outlin...Multimedia Lab @ Ghent University - iMinds - Organizational Overview & Outlin...
Multimedia Lab @ Ghent University - iMinds - Organizational Overview & Outlin...
 

Último

Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
vu2urc
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
Enterprise Knowledge
 

Último (20)

[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 
What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
 
GenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdfGenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdf
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreter
 
Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men
 

Video Copy Detection Using Visual and Semantic Fingerprinting

  • 1. ELIS – Multimedia Lab. Video Copy Detection using Visual and Semantic Fingerprinting. Presentations aOG MIT, November 17, 2011. Wesley De Neve. Multimedia Lab, Dept. of Electronics & Information Systems, Faculty of Engineering & Architecture, Ghent University – IBBT, Ghent, Belgium; Image and Video Systems Lab, Dept. of Electrical Engineering, College of Information Science & Technology, KAIST, Daejeon, South Korea
  • 2. Context Research Effort • Worked in South Korea during the past four years – at ICU and KAIST in Daejeon – main focus on advising graduate students • keeping track of the state-of-the-art • identifying and solving research questions • help communicate research results – main research topics • data-driven image annotation and tag relevance learning • face recognition using online social network context • video surveillance and privacy protection • video copy detection
  • 3. Outline • Introduction • Video copy detection – using visual features – using semantic features • Experimental results • Conclusions
  • 4. Outline • Introduction • Video copy detection – using visual features – using semantic features • Experimental results • Conclusions
  • 5. Introduction (1/3) • Increasing consumption of online video content – thanks to easy-to-use multimedia devices and online services – thanks to cheap storage and bandwidth – thanks to an increasing number of people going online • Increasing availability of online video content – digitization of professional video archives – user-generated video content
  • 6. Introduction (2/3) • Some statistics – professional video content • BBC Motion Gallery (as of January 2009) o contains over 2.5 million hours of video content o dating back 60 years in time – user-generated video content • YouTube (as of May 2011) o 48 hours of new video content are uploaded each minute
  • 7. Introduction (3/3) • Problem: digital video overload – our ability to automatically manage video clips does not keep up with our ability to create and store video clips – makes it, e.g., more and more difficult to find video clips of interest • Part of the solution: techniques for video copy detection – help in managing vast libraries of video clips • reduction of visual redundancy in video search results • detection of copyright infringement • metadata propagation along visual links • media usage monitoring • search by video query
  • 8. Duplicates versus Near-Duplicates • Duplicate video clips – exact video copies – can be easily detected using hashing • Near-duplicate video clips (NDVCs) – transformed video clips – detection is challenging [figure: original video clip and transformations: black & white, cropping, flipping]
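The point of slide 8 that exact copies "can be easily detected using hashing" can be illustrated with a minimal sketch. The choice of SHA-256 and the helper names `file_digest` and `group_exact_duplicates` are assumptions made for illustration; the slides do not name a hash function or an implementation.

```python
import hashlib
from collections import defaultdict


def file_digest(data: bytes) -> str:
    """Return a hex digest of the raw clip bytes (SHA-256 is an assumed choice)."""
    return hashlib.sha256(data).hexdigest()


def group_exact_duplicates(clips):
    """Group clip names whose byte content is identical.

    `clips` maps a clip name to its raw bytes (hypothetical in-memory stand-in
    for files on disk). Only groups with more than one member are returned.
    """
    groups = defaultdict(list)
    for name, data in clips.items():
        groups[file_digest(data)].append(name)
    return [sorted(names) for names in groups.values() if len(names) > 1]


# Toy data: "b.mp4" is a byte-exact copy of "a.mp4"; "c.mp4" differs in one byte.
clips = {
    "a.mp4": b"\x00\x01\x02",
    "b.mp4": b"\x00\x01\x02",
    "c.mp4": b"\x00\x01\x03",
}
duplicate_groups = group_exact_duplicates(clips)
```

A single changed byte already yields a different digest, which is why this approach does not carry over to near-duplicates produced by transformations such as cropping or recompression.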
  • 9. Applications: Reduction of Visual Redundancy (1/2) [figure labels: visual redundancy]
  • 10. Applications: Detection of Copyright Infringement (2/2) • Missed by YouTube’s ContentID • Transformations used o scaling o recompression
  • 11. Outline • Introduction • Video copy detection – using visual features – using semantic features • Experimental results • Conclusions
  • 12. System for Video Copy Detection: Conceptual Design [diagram: a query video clip is matched against a collection of reference video clips; video matching is realized by means of video signatures; query video clip ≈ original video clip: copy found]
  • 13. Video Signatures • Aim at uniquely characterizing a video clip • Commonly consist of visual features – e.g., color, texture, shape, and motion • Are low-dimensional representations – in order to facilitate more efficient matching [figure: dimensionality reduction from 921600-D (1280x720) to 128-D (128 bins)]
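The dimensionality reduction on slide 13 (a 1280x720 frame, i.e. 921600 values, reduced to 128 bins) can be sketched as follows. This is only an illustration under assumptions: the slide does not say which feature fills the 128 bins, so a normalized grayscale intensity histogram is used here, and the synthetic frame and the name `intensity_histogram` are hypothetical.

```python
def intensity_histogram(pixels, bins=128, levels=256):
    """Reduce a flat sequence of intensities in [0, levels) to a normalized histogram.

    The grayscale histogram is an assumed stand-in for the 128-D signature
    shown on the slide; it is not claimed to be the feature used in the talk.
    """
    counts = [0] * bins
    width = levels / bins  # 2 intensity levels per bin for 256 levels and 128 bins
    total = 0
    for value in pixels:
        index = min(int(value / width), bins - 1)
        counts[index] += 1
        total += 1
    return [count / total for count in counts]


# Synthetic 1280x720 grayscale frame (921600 values), used only as toy input.
frame = [(x + y) % 256 for y in range(720) for x in range(1280)]
signature = intensity_histogram(frame)
```

Comparing two 128-D signatures costs 128 operations instead of 921600, which is the efficiency argument the slide makes for low-dimensional representations.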
  • 14. Room for Improvement • Observations – no single type of visual feature has thus far emerged that is robust against all possible transformations – transformations tend to preserve semantic features [figure: semantic features / textual descriptions: helmet, face, wall, clothes] • Research question – how about (additionally) making use of semantic features?
  • 15. Outline • Introduction • Video copy detection – using visual features – using semantic features • Experimental results • Conclusions
  • 16. Extraction of Semantic Features (1/2) • Question – how to extract semantic features? [figure: semantic features helmet, face, wall, clothes] • Our answer – by means of binary concept classifiers
  • 17. Extraction of Semantic Features (2/2) • Example of a binary classifier for ‘apple’ [figure: the classifier labels an input image as ‘apple’ or ‘not apple’] • Concept classifiers – pieces of logic that know what, e.g., an “apple” image looks like – more formally: pieces of logic that know the statistical distribution of the visual features of, e.g., representative “apple” images
  • 18. Challenges Concept Classification (1/2) • Limited effectiveness – false negatives (due to intra-concept variability) [figure: the classifier outputs ‘not apple’ for an apple image] – false positives (due to inter-concept variability) [figure: the classifier outputs ‘apple’]
  • 19. Challenges Concept Classification (2/2) • Limited semantic coverage – only a limited number of concept classifiers can be supported • due to the high cost of training • experts need to collect training images for each concept classifier [figure: training images for ‘apple’, ‘orange’, and ‘strawberry’]
  • 20. Classifier-Based Semantic Feature Extraction for Video Copy Detection • The challenges of concept classification affect a semantic approach towards the task of video copy detection • How to deal with this? – limited effectiveness of concept classifiers • use of semantic features that can be easily and reliably detected o e.g., ‘people’ – limited semantic coverage of concept classifiers • use of semantic features that are general in nature o e.g., ‘people’ versus ‘Barack Obama’ • use of the temporal variation of the semantic features o extraction of semantic features at the level of video shots
  • 21. Outline • Introduction • Video copy detection – using visual features – using semantic features • Experimental results • Conclusions
  • 22. Reference and Query Video Clips • 311 reference video clips with a total duration of 170 h – 101 video clips from MUSCLE-VCD-2007 • total duration: 80 h – 210 video clips from TRECVID 2008 • total duration: 90 h • 500 query video clips – the result of five transformations applied to 100 video clips randomly selected from the 311 reference video clips
  • 23. Transformations Applied [figure: original, blur, pattern insertion, caption insertion, change in brightness, crop]
  • 24. Semantic Features • Use of Support Vector Machines (SVM) – binary classifiers with state-of-the-art effectiveness • 32 semantic concepts used – mean average precision (MAP): 0.51 – ‘gravel’, ‘park’, ‘pavement’, ‘road’, ‘rock’, ‘sand’, ‘sidewalk’, ‘face’, ‘people’, ‘indoor’, ‘field’, ‘peak’, ‘wood’, ‘night’, ‘street’, ‘flowers’, ‘leaves’, ‘trees’, ‘cloudy’, ‘sunny’, ‘sunset’, ‘brick’, ‘arch’, ‘buildings’, ‘wall’, ‘windows’, ‘beach’, ‘high-wave’, ‘low-wave’, ‘still water’, ‘mirrored water’, and ‘snow’
  • 25. Video Matching [figure: query video clip compared against the reference video clip, giving distance d1]
  • 26. Video Matching [figure: query video clip compared against the reference video clip, giving distance d2]
  • 27. Video Matching [figure: query video clip compared against the reference video clip, giving distance d3]
  • 28. Video Matching [figure: query video clip compared against the reference video clip, giving distance d4]
  • 29. Video Matching [figure: query video clip compared against the reference video clip, giving distance d5] • di: linearly weighted combination of the Manhattan distance between the – visual features of the query video clip and the part of the reference video clip in the sliding window – semantic features of the query video clip and the part of the reference video clip in the sliding window
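The sliding-window matching of slides 25 to 29 can be sketched roughly as below. Several details are assumptions, not taken from the slides: each shot is represented as a pair of (visual vector, semantic vector), the window advances one shot at a time, and the linear weighting uses a single weight `w` for the visual part and `1 - w` for the semantic part. All function names are hypothetical.

```python
def manhattan(a, b):
    """Manhattan (L1) distance between two equal-length vectors."""
    return sum(abs(x - y) for x, y in zip(a, b))


def window_distance(query, window, w):
    """Weighted sum of visual and semantic L1 distances over aligned shots.

    Each shot is a (visual_vector, semantic_vector) pair; the w / (1 - w)
    weighting is an assumed form of the linear combination.
    """
    visual = sum(manhattan(q[0], r[0]) for q, r in zip(query, window))
    semantic = sum(manhattan(q[1], r[1]) for q, r in zip(query, window))
    return w * visual + (1 - w) * semantic


def sliding_window_distances(query, reference, w=0.5):
    """Return d_i for every window position i of the query over the reference."""
    n = len(query)
    return [
        window_distance(query, reference[i:i + n], w)
        for i in range(len(reference) - n + 1)
    ]


# Toy reference clip of five shots; the query is an exact copy of shots 2 and 3.
reference = [
    ([0.1, 0.9], [1, 0]),
    ([0.5, 0.5], [0, 1]),
    ([0.8, 0.2], [1, 1]),
    ([0.3, 0.7], [0, 0]),
    ([0.6, 0.4], [1, 0]),
]
query = reference[2:4]
distances = sliding_window_distances(query, reference)
best_position = min(range(len(distances)), key=distances.__getitem__)
```

In a detector built along these lines, the smallest d_i would be compared against a threshold to decide whether a copy is reported; the slides do not state how that threshold is chosen.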
  • 30. Normalized Detection Cost Ratio (NDCR) • Definition: NDCR = P_miss + β · R_FA, where P_miss = N_FN / N_queries is the missed detection probability and R_FA = N_FP / (T_query · T_refdata) is the false alarm rate (per hour) • We set β to a value of 2 (“balanced profile”) – see “CBCD Evaluation Plan TRECVID 2010 v3” – assigns a higher cost to raising false alarms • A value of zero indicates perfect detection performance
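The NDCR definition on slide 30 translates directly into a small function. The sketch below follows the formula as given on the slide; the 500 queries and the 170 h of reference data come from slide 22, whereas the error counts and the 10 h total query duration in the example are made-up values used only to exercise the function.

```python
def ndcr(n_false_negatives, n_queries, n_false_positives,
         t_query_hours, t_refdata_hours, beta=2.0):
    """Normalized Detection Cost Ratio: P_miss + beta * R_FA (lower is better)."""
    p_miss = n_false_negatives / n_queries
    r_fa = n_false_positives / (t_query_hours * t_refdata_hours)
    return p_miss + beta * r_fa


# Perfect detection: no misses and no false alarms.
perfect = ndcr(0, 500, 0, t_query_hours=10.0, t_refdata_hours=170.0)

# Hypothetical run: 25 missed copies and 17 false alarms (illustrative counts only).
example = ndcr(25, 500, 17, t_query_hours=10.0, t_refdata_hours=170.0)
```

With β = 2 as on the slide, each unit of false alarm rate weighs twice as much as a unit of miss probability.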
  • 31. Semantic Concept Models Used • AP (Average Precision) – precision: #true positives / (#true positives + #false positives) – averaged over 100 query video clips • MAP (Mean AP) of the 32 semantic concept models: 0.52
  • 32. Effectiveness of Bimodal Fusion • Bimodal fusion of visual and semantic features outperforms the separate use of either type of features for all transformations
  • 33. Comparison of Effectiveness of Video Copy Detection • In general, bimodal fusion of visual and semantic features outperforms the other techniques for video copy detection
  • 34. Robustness Against Variation in Semantic Coverage • The more concepts used, the more effective NDVC detection
  • 35. Influence of Effectiveness of Semantic Concept Detection [figure values: 0.925, 0.711] • The effectiveness of NDVC detection starts to stabilize once the MAP of the concept detectors is higher than 0.3
  • 36. Time Complexity of Creating Video Signatures • Measurements expressed in seconds – include the time to perform o shot segmentation o keyframe selection o feature extraction
  • 37. Time Complexity of Matching • Measurements expressed in seconds – include the time to o compute the temporal entropy for the proposed method o perform matching using a sliding window approach
  • 38. Storage Complexity • Measurements expressed in Mbytes – storing the 32 semantic features requires 4 bytes per shot – storing the MPEG-7 visual features requires about 0.4 kbytes per shot
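The figure of 4 bytes per shot for 32 semantic features is consistent with storing one bit per concept (32 bits = 4 bytes). The sketch below assumes exactly that, namely that each concept classifier contributes a binary present/absent decision per shot; the slides do not spell out the encoding, and the function names and little-endian layout are hypothetical.

```python
import struct

NUM_CONCEPTS = 32


def pack_concepts(flags):
    """Pack 32 binary concept decisions into 4 bytes (one bit per concept)."""
    if len(flags) != NUM_CONCEPTS:
        raise ValueError("expected exactly 32 concept flags")
    value = 0
    for index, present in enumerate(flags):
        if present:
            value |= 1 << index
    return struct.pack("<I", value)  # unsigned 32-bit integer, little-endian


def unpack_concepts(blob):
    """Recover the 32 binary concept decisions from their 4-byte form."""
    (value,) = struct.unpack("<I", blob)
    return [bool((value >> index) & 1) for index in range(NUM_CONCEPTS)]


# Toy shot in which only concepts 0, 8, and 31 are detected.
shot_flags = [index in (0, 8, 31) for index in range(NUM_CONCEPTS)]
packed = pack_concepts(shot_flags)
restored = unpack_concepts(packed)
```

Under this assumed encoding the semantic part of a signature is about 100 times smaller than the roughly 0.4 kbytes per shot quoted for the MPEG-7 visual features.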
  • 39. Outline • Introduction • Video copy detection – using visual features – using semantic features • Experimental results • Conclusions
  • 40. Conclusions (1/3) • Discussed the novel idea of using semantic features for the purpose of video copy detection – given the observation that no single type of visual feature exists that is robust against all possible transformations – given the observation that transformations tend to preserve semantic information – (given the observation that the semantic features extracted can be reused for annotation purposes) • Experimental results – fusion of visual and semantic features outperforms • the separate use of either type of features • temporal ordinal measurement, PCA-SIFT, and BoVW
  • 41. Conclusions (2/3) • Current and future extensions – use of the temporal variation of concept confidence values • studied by the National University of Singapore – classifier-free semantic feature extraction • takes advantage of collective knowledge available on Flickr o unrestricted semantic concept vocabulary (higher coverage) • accepted for publication in IEEE Trans. on CSVT – improved semantic distance measurement – indexing of semantic features
  • 42. Conclusions (3/3) • Publications of interest – “Bimodal fusion of low-level visual features and high-level semantic features for near-duplicate video clip detection” o published in Signal Processing – Image Communication – “Near-Duplicate Video Clip Detection Using Model-Free Semantic Concept Detection and Adaptive Semantic Distance Measurement” o published in IEEE Trans. on Circuits and Systems for Video Technology