Se ha denunciado esta presentación.
Utilizamos tu perfil de LinkedIn y tus datos de actividad para personalizar los anuncios y mostrarte publicidad más relevante. Puedes cambiar tus preferencias de publicidad en cualquier momento.

Extracting event and place semantics from Flickr tags

Presentation at SIGIR 2007.

  • Inicia sesión para ver los comentarios

Extracting event and place semantics from Flickr tags

  1. 1. Tye Rattenbury, Nathan Good, Mor Naaman* Yahoo! Research Berkeley (Yahoo! Advanced Development Division) Towards Automatic Extraction of Event and Place Semantics from Flickr Tags
  2. 2. This Talk is Not About Photos <ul><li>Geo-referenced data </li></ul><ul><ul><li>Photos </li></ul></ul><ul><ul><li>Blog posts (GeoRSS) </li></ul></ul><ul><ul><li>Geo-annotated Web pages </li></ul></ul><ul><ul><li>KML-based collections </li></ul></ul><ul><li>Similar issues, similar potential </li></ul>
  3. 3. Find the Differences “ Yahoo! Headquarters” “ Dog” “ SIGIR 2007”
  4. 4. Data Description
  5. 5. Simple Model (photo_id, user_id, time, latitude, longitude) (photo_id, tag)
  6. 6. Information? <ul><ul><li>Flickr “geotagged” in San Francisco </li></ul></ul>
  7. 7. Tag Maps - San Francisco
  8. 8. Tag-based Semantics <ul><li>Derive meaningful data about individual tags </li></ul><ul><li>Based on the tag’s metadata patterns </li></ul><ul><li>E.g., ”Yahoo! Headquarters” , “SIGIR2007” , “Dog” . </li></ul>
  9. 9. Tag Semantics: Why? <ul><li>Improved image (and web!) search through query semantics </li></ul><ul><li>Automatic place- and event-gazetteers </li></ul><ul><li>Association of missing time/place data based on tags </li></ul><ul><li>Collection visualization </li></ul><ul><li>… </li></ul>
  10. 10. Next Few Minutes <ul><li>Extended model </li></ul><ul><li>Sample data </li></ul><ul><li>Problem definition </li></ul>
  11. 11. Extended Model (photo_id, user_id, time, latitude, longitude) (photo_id, tag) (tag, location) (tag, time)
  12. 12. Tag Patterns
  13. 13. Tag Patterns
  14. 14. Tag Patterns
  15. 15. Definitions <ul><li>Event tag: expected to exhibit significant time-based patterns </li></ul><ul><li>Location tag: expected to exhibit significant place-based patterns </li></ul>
  16. 16. Problem Definition <ul><li>Can we extract event/place semantics from unstructured tags given their usage distribution (time and location patterns)? </li></ul>
  17. 17. Fundamental Issues <ul><li>“ Regions” of semantic relevance </li></ul><ul><ul><li>“ church” in a neighborhood is a place </li></ul></ul><ul><ul><li>“ carnival” in Rio de Janeiro is an event </li></ul></ul><ul><ul><li>“ California” in San Francsico is not a place </li></ul></ul><ul><li>Semantic ambiguity </li></ul><ul><ul><li>“ apple” </li></ul></ul><ul><ul><li>“ turkey” </li></ul></ul><ul><ul><li>“ pride” </li></ul></ul>
  18. 18. Proposed Solutions <ul><li>Naïve Scan </li></ul><ul><ul><li>Burst detection </li></ul></ul><ul><li>Spatial Scan Statistics </li></ul><ul><ul><li>Borrowed from disease outbreak detection </li></ul></ul><ul><li>Scale-structure method </li></ul><ul><ul><li>Examine the “structure” of the tag’s patterns in multiple scales </li></ul></ul>
  19. 19. Next Few Minutes <ul><li>Intuition behind the three methods </li></ul><ul><li>Evaluation </li></ul><ul><li>Summary </li></ul>
  20. 20. Naïve Scan
  21. 21. Spatial Scan
  22. 22. Issues <ul><li>First two methods just look for a local burst in usage (in either time or space) </li></ul>
  23. 23. Scale-structure Identification <ul><li>Select a scale </li></ul><ul><ul><li>Find the structure of the data for the scale </li></ul></ul><ul><ul><li>Calculate the entropy of the structure </li></ul></ul><ul><li>Increase scale, repeat </li></ul><ul><li>Is the information “well organized”? </li></ul>
  24. 24. Scale-structure Identification
  25. 25. Scale-structure Identification Entropy = 1.421621 Entropy = 0.766263 Entropy = 0.555915 Entropy = 0.062760 “Barry Bonds” tag
  26. 26. San Francisco Experiments <ul><li>~43k photos </li></ul><ul><li>~800 tags </li></ul>San Francisco Dataset: 42,000 Photos 800+ popular tags
  27. 27. Experiments byobw Ground Truth: BYOBW?
  28. 28. Experiments <ul><li>Given a ground truth: </li></ul><ul><ul><li>Run the three methods, vary thresholds </li></ul></ul><ul><ul><li>Precision vs. recall </li></ul></ul>
  29. 29. The Obligatory Slide Where I Show You that Our Curve is Above the Other Curves
  30. 30. Experiments
  31. 31. Future Work <ul><li>Multiple regions </li></ul><ul><li>Different spatially and temporally encoded data (e.g., blogs) </li></ul><ul><li>Other metadata domains besides time and space </li></ul><ul><ul><li>Visual features: shape primitives, color </li></ul></ul><ul><ul><li>Semantic features </li></ul></ul>
  32. 32. Questions? <ul><li>Tye Rattenbury, Nathan Good, Mor Naaman </li></ul><ul><li>Y!RB / Y!A.D.D. </li></ul><ul><li>Read more, follow: http://www.whyrb.com </li></ul><ul><li>Slides: http://slideshare.net/mor </li></ul><ul><li>Contact: mor@yahoo-inc.com </li></ul>
  33. 33. Tag Maps - Paris - Les Blogs?
  34. 34. Issues <ul><li>Noisy data </li></ul><ul><li>Photographer biases </li></ul><ul><ul><li>In locations </li></ul></ul><ul><ul><li>In Tags </li></ul></ul><ul><li>Wrong data </li></ul>Whatever, color, city, spectrum, santa barbara, california, usa, Lookatme, Herbert Bayer Chromatic Gate

×