Oge Marques
Florida Atlantic University
     Boca Raton, FL - USA
    “Image search and retrieval” is not a problem,
     but rather a collection of related problems that
     look like one.

    10 years after “the end of the early years”,
     research in image search and retrieval still has
     many open problems, challenges, and
     opportunities.
    This is a highly interdisciplinary field, but …

     [Diagram: Visual Information Retrieval at the center, surrounded by related fields]
     ◦  Image and Video Processing
     ◦  (Multimedia) Database Systems
     ◦  Information Retrieval
     ◦  Machine Learning
     ◦  Computer Vision
     ◦  Data Mining
     ◦  Visual data modeling and representation
     ◦  Human Visual Perception
    There are many things that I believe…




    … but cannot prove
The “big mismatch”
    It’s been 10 years since the “end of the early
     years” [Smeulders et al., 2000]




     ◦  Are the challenges from 2000 still relevant?
     ◦  Are the directions and guidelines from 2000 still
        appropriate?
    Revisiting the ‘Concluding Remarks’ from
     [Smeulders et al., 2000]:

     ◦  Driving forces
        “[…] content-based image retrieval (CBIR) will continue
         to grow in every direction: new audiences, new
         purposes, new styles of use, new modes of interaction,
         larger data sets, and new methods to solve the
         problems.”
    Yes, we have seen many new audiences, new
     purposes, new styles of use, and new modes
     of interaction emerge.

    Each of these usually requires new methods
     to solve the problems that they bring.

    However, not too many researchers see them
     as a driving force (as they should).
    Revisiting the ‘Concluding Remarks’ from
     [Smeulders et al., 2000]:

     ◦  Heritage of computer vision
        “An important obstacle to overcome […] is to realize
         that image retrieval does not entail solving the general
         image understanding problem.”
    I’m afraid I have bad news…
     ◦  Computer vision hasn’t made much progress
        over the past 10 years.

     ◦  Some classical problems (including image
        understanding) remain unresolved.

     ◦  Similarly, CBIR from a pure computer vision
        perspective didn’t work too well either.
    Revisiting the ‘Concluding Remarks’ from
     [Smeulders et al., 2000]:

     ◦  Influence on computer vision
        “[…] CBIR offers a different look at traditional computer
         vision problems: large data sets, no reliance on strong
         segmentation, and revitalized interest in color image
         processing and invariance.”
    The adoption of large data sets became standard
     practice in computer vision (see Torralba’s work).
    No reliance on strong segmentation (still
     unresolved) → new areas of research, e.g.,
     automatic region-of-interest (ROI) extraction and
     region-based image retrieval (RBIR).
    Color image processing and color descriptors
     became incredibly popular, useful, and (to some
     degree) effective.
    Invariance still a huge problem
     ◦  But it’s cheaper than ever to have multiple views.
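
As an illustration of what a simple color descriptor can look like, here is a minimal sketch of a quantized RGB histogram compared with histogram intersection. This is one common textbook choice, not a method taken from the talk; the function names, the bin count, and the toy pixel lists are all assumptions made for the example.

```python
from collections import Counter

def color_histogram(pixels, bins=4):
    """Quantize (r, g, b) pixels (0-255 per channel) into a normalized histogram.

    Assumes `pixels` is a non-empty list of RGB tuples (hypothetical input format).
    """
    counts = Counter(
        (r * bins // 256, g * bins // 256, b * bins // 256) for r, g, b in pixels
    )
    total = sum(counts.values())
    return {bin_id: n / total for bin_id, n in counts.items()}

def histogram_intersection(h1, h2):
    """Similarity in [0, 1]: sum of bin-wise minima of two normalized histograms."""
    return sum(min(weight, h2.get(bin_id, 0.0)) for bin_id, weight in h1.items())

# Toy "images" given as flat lists of RGB pixels (made up for illustration).
mostly_red = [(250, 10, 10)] * 3 + [(10, 10, 250)]
also_red = [(240, 20, 5)] * 2 + [(10, 250, 10)] * 2
all_blue = [(5, 5, 245)] * 4

red_vs_red = histogram_intersection(color_histogram(mostly_red), color_histogram(also_red))
red_vs_blue = histogram_intersection(color_histogram(mostly_red), color_histogram(all_blue))
```

A descriptor like this ignores where colors occur in the image, which is one reason color alone is only effective "to some degree".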
    Revisiting the ‘Concluding Remarks’ from
     [Smeulders et al., 2000]:

     ◦  Similarity and learning
        “We make a pledge for the importance of human-based
         similarity rather than general similarity. Also,
         the connection between image semantics, image data,
         and query context will have to be made clearer in the
         future.”
        “[…] in order to bring semantics to the user, learning is
         inevitable.”
    Similarity is a tough problem to crack and
     model.

    See for yourself…
    Are these two images similar?
    Are these two images similar?
    Is the second or the third image more similar
     to the first?
    Which image is a better fit with the first two: the
     third or the fourth?
    Is learning really inevitable?

    Maybe, maybe not, but it sure comes in handy
     in some specific cases…
     ◦  SVM anyone?
    Revisiting the ‘Concluding Remarks’ from
     [Smeulders et al., 2000]:

     ◦  Interaction
        Better visualization options, more control to the user,
         ability to provide feedback […]
    Significant progress on visualization
     interfaces and devices.

    Relevance Feedback: still a very tricky
     tradeoff (effort vs. perceived benefit), but
     more popular than ever (rating, thumbs up/down,
     etc.)
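
One classical way to turn thumbs-up/thumbs-down feedback into a better query is the Rocchio update from text retrieval, sketched below for image feature vectors. The talk does not name a specific algorithm, so this is a swapped-in illustration; the weights `alpha`, `beta`, and `gamma` and the toy vectors are assumptions.

```python
def centroid(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]

def rocchio_update(query, relevant, non_relevant, alpha=1.0, beta=0.75, gamma=0.25):
    """Move the query vector toward images marked relevant and away from the rest."""
    dims = len(query)
    pos = centroid(relevant) if relevant else [0.0] * dims
    neg = centroid(non_relevant) if non_relevant else [0.0] * dims
    return [alpha * q + beta * p - gamma * n for q, p, n in zip(query, pos, neg)]

# Hypothetical 2-D feature vectors: the user liked two images and disliked one.
new_query = rocchio_update(
    query=[1.0, 0.0],
    relevant=[[0.0, 1.0], [0.0, 3.0]],
    non_relevant=[[4.0, 0.0]],
)
```

The effort vs. benefit tradeoff shows up directly here: every extra rating is user effort, and the update only helps if the rated images are representative of what the user wants.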
    Revisiting the ‘Concluding Remarks’ from
     [Smeulders et al., 2000]:

     ◦  Need for databases
        “The connection between CBIR and database research
         is likely to increase in the future. […] problems like the
         definition of suitable query languages, efficient search
         in high dimensional feature space, search in the
         presence of changing similarity measures are largely
         unsolved […]”
    Very little progress
     ◦  Image search and retrieval has benefited much
        more from document information retrieval than
        from database research.
    Revisiting the ‘Concluding Remarks’ from
     [Smeulders et al., 2000]:

     ◦  The problem of evaluation
        CBIR could use a reference standard against which new
         algorithms could be evaluated (similar to TREC in the
         field of text retrieval).
        “A comprehensive and publicly available collection of
         images, sorted by class and retrieval purposes,
         together with a protocol to standardize experimental
         practices, will be instrumental in the next phase of
         CBIR.”
    Significant progress on benchmarks,
     standardized datasets, etc.
     ◦  ImageCLEF
     ◦  Pascal VOC Challenge
     ◦  MSRA dataset
     ◦  SIMPLIcity dataset
     ◦  UCID dataset and ground truth (GT)
     ◦  Accio / SIVAL dataset and GT
     ◦  Caltech 101, Caltech 256
     ◦  LabelMe
    Revisiting the ‘Concluding Remarks’ from
     [Smeulders et al., 2000]:

     ◦  Semantic gap and other sources
        “A critical point in the advancement of CBIR is the
         semantic gap, where the meaning of an image is rarely
         self-evident. […] One way to resolve the semantic gap
         comes from sources outside the image by integrating
         other sources of information about the image in the
         query.”
    The semantic gap problem has not been
     solved (and maybe will never be…)

    What are the alternatives?
     1.  Treat visual similarity and semantic relatedness
         differently
        Examples: Alipr, Google similarity search, etc.
     2.  Improve both (text-based and visual) search
         methods independently
     3.  Trust the user
        CFIR, collaborative filtering, crowdsourcing, games.
    I postulate that image search and retrieval is
     not a problem (but, instead, a collection of
     related problems that look like one)

    There are many potential opportunities for
     good solutions to specific problems

    One promising avenue: think about image
     retrieval as added value (e.g., like.com, SPE,
     etc.)
    Google Similarity Search (VisualRank) [Jing &
     Baluja, 2008]
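
The core idea behind VisualRank, applying a PageRank-style random walk to a graph whose edge weights are visual similarities, can be sketched in a few lines. This is a simplified illustration, not the authors' implementation; the function name, the damping value, the iteration count, and the toy similarity matrix are assumptions.

```python
def visual_rank(similarity, damping=0.85, iterations=50):
    """Power iteration over a column-normalized image similarity matrix.

    `similarity[i][j]` is a non-negative visual similarity between images i and j.
    Assumes every image has at least one non-zero similarity to another image.
    """
    n = len(similarity)
    col_sums = [sum(similarity[i][j] for i in range(n)) for j in range(n)]
    rank = [1.0 / n] * n
    for _ in range(iterations):
        rank = [
            damping * sum(similarity[i][j] / col_sums[j] * rank[j] for j in range(n))
            + (1.0 - damping) / n
            for i in range(n)
        ]
    return rank

# Toy data: images 0-2 look alike, image 3 is a visual outlier.
toy_similarity = [
    [0.0, 0.9, 0.8, 0.1],
    [0.9, 0.0, 0.7, 0.1],
    [0.8, 0.7, 0.0, 0.1],
    [0.1, 0.1, 0.1, 0.0],
]
scores = visual_rank(toy_similarity)
```

Images that are similar to many other well-connected images accumulate score, so the outlier ends up ranked last.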



    Google Goggles (mobile visual search)
    Google Goggles understands narrow-domain
     search and retrieval




    Several other apps for iPhone, iPad, and
     Android (e.g., kooaba and Fetch!)
    The Web 2.0 has brought about:
     ◦  New data sources
     ◦  New usage patterns
     ◦  New understanding about the users, their needs,
        habits, preferences
     ◦  New opportunities
     ◦  Lots of metadata!

     ◦  A chance to experience a true paradigm shift
        Before: image annotation is tedious, labor-intensive,
         expensive
        After: image annotation is fun!
    Games!
     ◦  Google Image Labeler
     ◦  Games with a purpose (GWAP):
        The ESP Game
        Squigl
        Matchin
    New devices and services…

     ◦  Flickr (b. 2004)
     ◦  YouTube (b. 2005)
     ◦  Flip video cameras (b. 2006)
     ◦  iPhone (b. 2007)
     ◦  iPad (b. 2010)
    New opportunities for narrowing the semantic
     gap
     ◦  From bottom up: (semi-)automatic image
        annotation
     ◦  From top down: using (content / context)
        ontologies
     ◦  Combining top-down and bottom-up

    New fields of research, including:
     ◦  Tag recommendation systems
     ◦  User intentions in image search
    Many opportunities await…
    I believe (but cannot prove…) that successful
     Image Search & Retrieval solutions will:
     •  combine content-based image retrieval (CBIR) with
        metadata (high-level semantic-based image
        retrieval)
     •  only be truly successful in narrow domains
     •  include the user in the loop
      –  Relevance Feedback (RF)
      –  Collaborative efforts (tagging, rating, annotating)
     •  provide friendly, intuitive interfaces
     •  incorporate results and insights from cognitive
        science, particularly human visual attention,
        perception, and memory
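
A minimal sketch of the first point, combining a content-based score with metadata, is a weighted late fusion of a visual similarity value and a tag-overlap value. The talk does not prescribe a fusion method; the Jaccard tag overlap, the equal default weighting, and all names below are assumptions for illustration.

```python
def jaccard(tags_a, tags_b):
    """Tag overlap in [0, 1]; returns 0.0 when both tag sets are empty."""
    a, b = set(tags_a), set(tags_b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def fused_score(visual_similarity, query_tags, image_tags, weight_visual=0.5):
    """Weighted sum of a visual similarity in [0, 1] and a metadata (tag) similarity."""
    metadata_similarity = jaccard(query_tags, image_tags)
    return weight_visual * visual_similarity + (1.0 - weight_visual) * metadata_similarity

# Hypothetical candidates: (name, visual similarity to the query image, tags).
query_tags = ["beach", "sunset"]
candidates = [
    ("img_a", 0.8, ["sunset", "sea"]),
    ("img_b", 0.9, ["office", "desk"]),
    ("img_c", 0.4, ["beach", "sunset"]),
]
ranking = sorted(
    candidates,
    key=lambda c: fused_score(c[1], query_tags, c[2]),
    reverse=True,
)
ranked_names = [name for name, _, _ in ranking]
```

With equal weights, an image that matches the query tags exactly can outrank one that is visually closer but semantically unrelated, which is the point of bringing metadata into the loop.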
Questions?




             omarques@fau.edu

Oge Marques (FAU) - invited talk at WISMA 2010 (Barcelona, May 2010)

  • 1. Oge Marques Florida Atlantic University Boca Raton, FL - USA
  • 2.   “Image search and retrieval” is not a problem, but rather a collection of related problems that look like one.   10 years after “the end of the early years”, research in image search and retrieval still has many open problems, challenges, and opportunities.
  • 3.   This is a highly interdisciplinary field, but … Image and (Multimedia) Information Video Database Retrieval Processing Systems Visual Machine Computer Learning Information Vision Retrieval Visual data Human Visual Data Mining modeling and Perception representation
  • 4.   There are many things that I believe…   … but cannot prove
  • 6.   It’s been 10 years since the “end of the early years” [Smeulders et al., 2000] ◦  Are the challenges from 2000 still relevant? ◦  Are the directions and guidelines from 2000 still appropriate?
  • 7.   Revisiting the ‘Concluding Remarks’ from [Smeulders et al., 2000]: ◦  Driving forces   “[…] content-based image retrieval (CBIR) will continue to grow in every direction: new audiences, new purposes, new styles of use, new modes of interaction, larger data sets, and new methods to solve the problems.”
  • 8.   Yes, we have seen many new audiences, new purposes, new styles of use, and new modes of interaction emerge.   Each of these usually requires new methods to solve the problems that they bring.   However, not too many researchers see them as a driving force (as they should).
  • 9.   Revisiting the ‘Concluding Remarks’ from [Smeulders et al., 2000]: ◦  Heritage of computer vision   “An important obstacle to overcome […] is to realize that image retrieval does not entail solving the general image understanding problem.”
  • 10.   I’m afraid I have bad news… ◦  Computer vision hasn’t made so much progress during the past 10 years. ◦  Some classical problems 
 (including image 
 understanding)
 remain unresolved. ◦  Similarly, CBIR from a 
 pure computer vision
 perspective didn’t work 
 too well either.
  • 11.   Revisiting the ‘Concluding Remarks’ from [Smeulders et al., 2000]: ◦  Influence on computer vision   “[…] CBIR offers a different look at traditional computer vision problems: large data sets, no reliance on strong segmentation, and revitalized interest in color image processing and invariance.”
  • 12.   The adoption of large data sets became standard practice in computer vision (see Torralba’s work).   No reliance on strong segmentation (still unresolved)  new areas of research, e.g., automatic ROI extraction and RBIR.   Color image processing and color descriptors became incredibly popular, useful, and (to some degree) effective.   Invariance still a huge problem ◦  But it’s cheaper than ever to have multiple views.
  • 13.
  • 14.   Revisiting the ‘Concluding Remarks’ from [Smeulders et al., 2000]: ◦  Similarity and learning   “We make a pledge for the importance of human- based similarity rather than general similarity. Also, the connection between image semantics, image data, and query context will have to be made clearer in the future.”   “[…] in order to bring semantics to the user, learning is inevitable.”
  • 15.   Similarity is a tough problem to crack and model.   See it for yourself…
  • 16.   Are these two images similar?
  • 17.   Are these two images similar?
  • 18.   Is the second or the third image more similar to the first?
  • 19.   Which image fits better to the first two: the third or the fourth?
  • 20.   Is learning really inevitable?   Maybe, maybe not, but it sure comes handy in some specific cases… ◦  SVM anyone?
  • 21.   Revisiting the ‘Concluding Remarks’ from [Smeulders et al., 2000]: ◦  Interaction   Better visualization options, more control to the user, ability to provide feedback […]
  • 22.   Significant progress on visualization interfaces and devices.   Relevance Feedback: still a very tricky tradeoff (effort vs. perceived benefit), but more popular than ever (rating, thumbs up/down, etc.)
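One way to picture how thumbs up/down feedback can reshape a query is the classical Rocchio update from text retrieval, sketched below on plain feature vectors. The slides do not name a specific relevance feedback method, so this is an illustrative stand-in; the function name and the weights `alpha`, `beta`, and `gamma` are assumptions.

```python
def rocchio_update(query, relevant, non_relevant, alpha=1.0, beta=0.75, gamma=0.15):
    """Move a query vector toward liked images and away from disliked ones.

    query: list of floats (feature vector of the current query)
    relevant / non_relevant: lists of feature vectors the user rated up / down
    """
    dims = len(query)

    def centroid(vectors):
        # An empty feedback set contributes nothing to the update.
        if not vectors:
            return [0.0] * dims
        return [sum(v[i] for v in vectors) / len(vectors) for i in range(dims)]

    liked = centroid(relevant)
    disliked = centroid(non_relevant)
    return [alpha * q + beta * p - gamma * n
            for q, p, n in zip(query, liked, disliked)]


new_query = rocchio_update(
    query=[1.0, 0.0],
    relevant=[[0.0, 1.0], [0.0, 1.0]],
    non_relevant=[[1.0, 0.0]],
)
print(new_query)
```

The effort vs. benefit tradeoff shows up directly here: with only one or two rated images the centroids are noisy, so the refined query may not improve results enough to justify asking the user for ratings.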
  • 23.   Revisiting the ‘Concluding Remarks’ from [Smeulders et al., 2000]: ◦  Need for databases   “The connection between CBIR and database research is likely to increase in the future. […] problems like the definition of suitable query languages, efficient search in high dimensional feature space, search in the presence of changing similarity measures are largely unsolved […]”
  • 24.   Very little progress ◦  Image search and retrieval has benefited much more from document information retrieval than from database research.
  • 25.   Revisiting the ‘Concluding Remarks’ from [Smeulders et al., 2000]: ◦  The problem of evaluation   CBIR could use a reference standard against which new algorithms could be evaluated (similar to TREC in the field of text retrieval).   “A comprehensive and publicly available collection of images, sorted by class and retrieval purposes, together with a protocol to standardize experimental practices, will be instrumental in the next phase of CBIR.”
  • 26.   Significant progress on benchmarks, standardized datasets, etc. ◦  ImageCLEF ◦  Pascal VOC Challenge ◦  MSRA dataset ◦  SIMPLIcity dataset ◦  UCID dataset and ground truth (GT) ◦  Accio / SIVAL dataset and GT ◦  Caltech 101, Caltech 256 ◦  LabelMe
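Given a dataset with ground truth, a ranked result list can be scored with average precision, a metric commonly used in retrieval benchmarks. The sketch below is a generic, uninterpolated version and does not claim to match the exact protocol of any benchmark listed above; the function names and the toy image IDs are assumptions.

```python
def average_precision(ranked_ids, relevant_ids):
    """Average of the precision values at each rank where a relevant item appears.

    Assumes ranked_ids contains no duplicates.
    """
    relevant = set(relevant_ids)
    if not relevant:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, image_id in enumerate(ranked_ids, start=1):
        if image_id in relevant:
            hits += 1
            precision_sum += hits / rank
    # Relevant items that were never retrieved count as zero precision.
    return precision_sum / len(relevant)


def mean_average_precision(runs):
    """runs: list of (ranked_ids, relevant_ids) pairs, one per query."""
    if not runs:
        return 0.0
    return sum(average_precision(r, g) for r, g in runs) / len(runs)


ap = average_precision(["a", "b", "c", "d"], {"a", "c"})
print(ap)
```

Averaging this score over many queries gives mean average precision, a single number that lets two systems be compared on the same ground truth.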
  • 27.   Revisiting the ‘Concluding Remarks’ from [Smeulders et al., 2000]: ◦  Semantic gap and other sources   “A critical point in the advancement of CBIR is the semantic gap, where the meaning of an image is rarely self-evident. […] One way to resolve the semantic gap comes from sources outside the image by integrating other sources of information about the image in the query.”
  • 28.   The semantic gap problem has not been solved (and maybe will never be…)   What are the alternatives? 1.  Treat visual similarity and semantic relatedness differently   Examples: Alipr, Google similarity search, etc. 2.  Improve both (text-based and visual) search methods independently 3.  Trust the user   CFIR, collaborative filtering, crowdsourcing, games.
  • 29.   I postulate that image search and retrieval is not a problem (but, instead, a collection of related problems that look like one)   There are many potential opportunities for good solutions to specific problems   One promising avenue: think about image retrieval as added value (e.g., like.com, SPE, etc.)
  • 30.   Google Similarity Search (VisualRank) [Jing & Baluja, 2008]   Google Goggles (mobile visual search)
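The idea behind VisualRank, as described by Jing & Baluja, is to apply a PageRank-style random walk to a graph whose edge weights are visual similarities between images. The following is a simplified sketch of that idea under stated assumptions: the similarity matrix is supplied by hand rather than computed from local features, and the damping value and iteration count are illustrative, not values from the slides.

```python
def visual_rank(similarity, damping=0.85, iterations=50):
    """Power iteration over a column-normalized image similarity matrix.

    similarity: square list of lists, similarity[i][j] >= 0.
    Returns one score per image; higher means more "central".
    """
    n = len(similarity)
    col_sums = [sum(similarity[i][j] for i in range(n)) for j in range(n)]
    rank = [1.0 / n] * n
    for _ in range(iterations):
        new_rank = []
        for i in range(n):
            # Score flowing into image i from every image j that resembles it.
            flow = sum(similarity[i][j] / col_sums[j] * rank[j]
                       for j in range(n) if col_sums[j] > 0)
            new_rank.append(damping * flow + (1.0 - damping) / n)
        rank = new_rank
    return rank


# Toy graph: image 0 resembles both others; images 1 and 2 do not resemble each other.
toy_similarity = [
    [0.0, 0.9, 0.9],
    [0.9, 0.0, 0.0],
    [0.9, 0.0, 0.0],
]
scores = visual_rank(toy_similarity)
print(scores)
```

In the toy graph, image 0 ends up with the highest score because both other images "vote" for it, which mirrors the intuition that the most representative image is the one many others resemble.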
  • 31.   Google Goggles understands narrow-domain search and retrieval   Several other apps for iPhone, iPad, and Android (e.g., kooaba and Fetch!)
  • 32.   The Web 2.0 has brought about: ◦  New data sources ◦  New usage patterns ◦  New understanding about the users, their needs, habits, preferences ◦  New opportunities ◦  Lots of metadata! ◦  A chance to experience a true paradigm shift   Before: image annotation is tedious, labor-intensive, expensive   After: image annotation is fun!
  • 33.   Games! ◦  Google Image Labeler ◦  Games with a purpose (GWAP):   The ESP Game   Squigl   Matchin
  • 34.   New devices and services… ◦  Flickr (b. 2004) ◦  YouTube (b. 2005) ◦  Flip video cameras (b. 2006) ◦  iPhone (b. 2007) ◦  iPad (b. 2010)
  • 35.   New opportunities for narrowing the semantic gap ◦  From bottom up: (semi-)automatic image annotation ◦  From top down: using (content / context) ontologies ◦  Combining top-down and bottom-up   New fields of research, including: ◦  Tag recommendation systems ◦  User intentions in image search
  • 36.   Many opportunities await…
  • 37. –  I believe (but cannot prove…) that successful Image Search & Retrieval solutions will: •  combine content-based image retrieval (CBIR) with metadata (high-level semantic-based image retrieval) •  only be truly successful in narrow domains •  include the user in the loop –  Relevance Feedback (RF) –  Collaborative efforts (tagging, rating, annotating) •  provide friendly, intuitive interfaces •  incorporate results and insights from cognitive science, particularly human visual attention, perception, and memory
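The first belief above, combining CBIR with metadata, can be sketched as a simple weighted late fusion of two per-image scores. This is only one of many possible fusion schemes and is not prescribed by the slides; the function name, the score dictionaries, and the `visual_weight` parameter are assumptions.

```python
def fuse_scores(visual_scores, text_scores, visual_weight=0.5):
    """Rank image IDs by a weighted sum of visual and metadata scores.

    Both inputs map image ID -> score in [0, 1]; a missing ID counts as 0.
    """
    ids = set(visual_scores) | set(text_scores)
    fused = {
        image_id: visual_weight * visual_scores.get(image_id, 0.0)
        + (1.0 - visual_weight) * text_scores.get(image_id, 0.0)
        for image_id in ids
    }
    # Sort by descending fused score, breaking ties by ID for determinism.
    return sorted(fused, key=lambda image_id: (-fused[image_id], image_id))


ranking = fuse_scores(
    visual_scores={"a": 0.9, "b": 0.2},
    text_scores={"b": 0.9, "c": 0.4},
)
print(ranking)
```

Keeping the user in the loop fits naturally here: ratings or tags collected over time could feed the metadata scores or be used to adjust `visual_weight` for a given narrow domain.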
  • 38. Questions? omarques@fau.edu