Semantics, Automatic Metadata and Audiovisual Contents. A case study: the Barcelona International Manga Fair (BAZÁN GIL, Virginia; MATAS PASCUAL, Roberto; GÓMEZ ZOTANO, Manuel; PASTOR SÁNCHEZ, Juan-Antonio) (RTVE, Spain)


Presented at the FIAT/IFTA World Conference 2017, Mexico City


  1. Juan-Antonio Pastor-Sánchez (pastor@um.es), Universidad de Murcia
     Virginia Bazán-Gil (virginia.bazan@rtve.es), Radio Televisión Española
     Roberto Matas-Pascual (roberto.matas@rtve.es), Radio Televisión Española
     Manuel Gómez-Zotano (manuel.gomez@rtve.es), Radio Televisión Española
     Semantics, automatic metadata and audiovisual contents. A case study: Barcelona International Manga Fair
  2. RTVE is Spain's main public service broadcaster. It provides services including:
     • 7 television channels
     • 6 radio stations
     • An extensive website (rtve.es)
     • Orchestra & Choir
     • Television & Radio Institute
  3. The Semantic Journey
     • Toyed with video-to-text solutions
     • Testing solutions in real workflows
     • Identifying metadata needs
     • Determining the state of the art of semantic technologies
     (Sample automatic caption shown on the slide: "a man in a suit and tie standing in front of a store".)
  4. Our Vision: content semantic enrichment
     Business applications and audiovisual heritage preservation:
     • SEO
     • Personalized discovery
     • Semantic interoperability based on APIs (+ standardization, + adaptation to change, + relevant and accurate results)
     • Interoperability between production and archive
     • Simplified workflows
     • Archive re-use
     • Preservation
     • Accuracy
  5. The Team
     • University of Murcia: the research
     • RTVE.es: the innovation
     • RTVE Archive: the experience
  6. XXII Manga Fair Barcelona 2016
     Goal
     • Explore the possibilities of automated cataloguing in live events: filming + cloud editing + broadcasting + web publishing
     Partners
     • Etiqmedia: supply of clips + locators; training of the automated cataloguing system
     • University of Murcia: generation of SKOS vocabularies (taxonomy and authorities); a minimal SKOS sketch follows this slide
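     The SKOS vocabularies themselves are not reproduced in the deck. The following is a minimal sketch, assuming Python with rdflib, of how a concept scheme and a taxonomy concept for the Manga Fair might be expressed; the namespace, URIs and labels are illustrative, not the actual RTVE vocabulary.

         from rdflib import Graph, Literal, Namespace
         from rdflib.namespace import RDF, SKOS

         # Hypothetical namespace for the Manga Fair vocabulary (illustrative only).
         EX = Namespace("http://example.org/rtve/vocab/")

         g = Graph()
         g.bind("skos", SKOS)

         # Concept scheme for the taxonomy.
         scheme = EX["manga-fair"]
         g.add((scheme, RDF.type, SKOS.ConceptScheme))
         g.add((scheme, SKOS.prefLabel, Literal("Salón del Manga de Barcelona", lang="es")))

         # Two concepts with a broader/narrower relation.
         comics, manga = EX["comics"], EX["manga"]
         for concept, label in [(comics, "Cómic"), (manga, "Manga")]:
             g.add((concept, RDF.type, SKOS.Concept))
             g.add((concept, SKOS.prefLabel, Literal(label, lang="es")))
             g.add((concept, SKOS.inScheme, scheme))
         g.add((manga, SKOS.broader, comics))
         g.add((comics, SKOS.narrower, manga))

         print(g.serialize(format="turtle"))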
  7. XXII Manga Fair Barcelona 2016
     Findings
     • S2T and OCR were available for only 50% of the clips
     • Only 25% of the contents were automatically classified (IPTC)
     • Low relevance of the assigned categories
     • No entity/keyword extraction
     • No results for facial recognition
     Outcomes
     • S2T and captions are essential for classification and entity recognition
     • The Manga Fair is a very specific event: different strategies are needed for different contexts
  8. Speech-to-text benchmarking
     Goals
     • Analyse speech-to-text solutions
     • Compare them with the solution implemented in the archive (Autonomy, 2009)
     Solutions
     • Cloud solutions: Google Speech API, HP Speech Recognition, IBM Watson Speech to Text, Microsoft Video API
  9. Speech-to-text benchmarking: accuracy
     Reported accuracy for text-string and entity transcription across the tested engines: 92.1%, 90.2%, 70%, 80%, 62% and 60%.
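     The deck does not state how these accuracy figures were computed. A common choice for this kind of benchmarking is word accuracy (1 − word error rate) against a reference transcript; the Python sketch below is one plausible scoring routine under that assumption, with a made-up reference/hypothesis pair rather than real benchmark data.

         def word_accuracy(reference: str, hypothesis: str) -> float:
             """Word accuracy = 1 - WER, using word-level Levenshtein distance."""
             ref = reference.lower().split()
             hyp = hypothesis.lower().split()
             # Dynamic-programming edit distance (substitutions, insertions, deletions).
             d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
             for i in range(len(ref) + 1):
                 d[i][0] = i
             for j in range(len(hyp) + 1):
                 d[0][j] = j
             for i in range(1, len(ref) + 1):
                 for j in range(1, len(hyp) + 1):
                     cost = 0 if ref[i - 1] == hyp[j - 1] else 1
                     d[i][j] = min(d[i - 1][j] + 1,        # deletion
                                   d[i][j - 1] + 1,        # insertion
                                   d[i - 1][j - 1] + cost) # substitution
             wer = d[len(ref)][len(hyp)] / max(len(ref), 1)
             return max(0.0, 1.0 - wer)

         # Illustrative pair based on an entity from the next slide (not real data).
         print(word_accuracy("torneo Barcelona Open Bank Sabadell",
                             "torneo Barcelona Hope Bank Sabadell"))  # 0.8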
  10. Speech-to-text benchmarking: named-entity spelling
     Correct spelling: Radio Televisión Española, Barcelona Open Bank Sabadell, IMG España, Eladio Jareño, Mireia Alzuria
     Erroneous transcriptions included: Barcelona Hope; Ing España (three times); Mexes; Eladio Carreño; Avión Carreño; Eladio Cariño; Mirella Al Zuria; Ramiro y Alcurnia; Mira y al sur
  11. Conde de Godó 2017
     Goals
     • Define a metadata protocol for automatic metadata extraction applicable to raw footage: camera, journalist, location, date, event (a sketch of such a record follows this slide)
     • Test the adequacy of the protocol to the archiving needs
     Outcomes
     • The protocol was not fully applied: the recording date is usually missing
     • From the archive's point of view, the metadata are enough to identify raw images in the production systems
     • Automatic metadata from the cameras are needed: date and geolocation
     • Cooperation of the journalists is necessary
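     As an illustration of the kind of record such a protocol could capture, here is a minimal sketch in Python. Only camera, journalist, location, date and event are named in the deck; the geolocation fields and all example values are assumptions.

         from dataclasses import dataclass
         from datetime import datetime
         from typing import Optional

         @dataclass
         class RawFootageMetadata:
             """Minimal automatic-metadata record for raw footage (illustrative)."""
             camera_id: str                             # camera that shot the material
             journalist: str                            # journalist responsible for the coverage
             location: str                              # human-readable shooting location
             event: str                                 # event being covered
             recording_date: Optional[datetime] = None  # often missing in practice (see outcomes)
             latitude: Optional[float] = None           # assumed field: camera geolocation
             longitude: Optional[float] = None          # assumed field: camera geolocation

         clip = RawFootageMetadata(
             camera_id="CAM-07",                        # placeholder values
             journalist="Jane Doe",
             location="Barcelona",
             event="Conde de Godó 2017",
         )
         print(clip)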
  12. Automated Classification / Entity Recognition
     Goals
     • Get familiar with automated classification and entity recognition solutions
     • Determine their usefulness in the audiovisual context
     Partners
     • Everis: framework based on Moriarty, used in European research projects (H2020)
     • S|ngular: general-purpose cloud solutions for text analytics
  13. Automated Classification / Entity Recognition: test pipelines
     • News — sources: iNews, rtve.es API, captions. Written text, well structured, famous entities → entity recognition + classification against IPTC / IAB and IPTC / UNESCO / RTVE
     • Fiction — sources: iNews, rtve.es API, captions. Transcription of spoken language, less well structured, no famous entities → entity recognition + classification against IPTC / UNESCO / RTVE
     (A toy stand-in for this classification/entity step is sketched below.)
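     Neither the Everis (Moriarty-based) framework nor the S|ngular services are shown in the deck, so the following is only a toy stand-in: a Python routine that assigns coarse IPTC-style categories from a keyword map and extracts capitalized spans as candidate entities. The category names, keywords and example caption are illustrative assumptions.

         import re

         # Toy keyword map standing in for an IPTC-style classifier (illustrative only).
         IPTC_KEYWORDS = {
             "sport": {"torneo", "tenis", "partido", "final"},
             "arts, culture and entertainment": {"manga", "cómic", "salón", "festival"},
             "politics": {"gobierno", "elecciones", "parlamento"},
         }

         def classify(text: str) -> list[str]:
             """Return the coarse categories whose keywords appear in the text."""
             words = set(re.findall(r"\w+", text.lower()))
             return [cat for cat, kws in IPTC_KEYWORDS.items() if words & kws]

         def candidate_entities(text: str) -> list[str]:
             """Very rough named-entity candidates: runs of capitalized words."""
             return [m.strip() for m in
                     re.findall(r"(?:[A-ZÁÉÍÓÚÑ][\wáéíóúñ]*(?:\s+|$))+", text)]

         caption = "El Salón del Manga de Barcelona abre sus puertas"
         print(classify(caption))            # ['arts, culture and entertainment']
         print(candidate_entities(caption))  # ['El Salón', 'Manga', 'Barcelona']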
  14. Automated Classification / Entity Recognition: findings
     News
     • Less than 15% of the categories are relevant to classify the RTVE contents
     • IPTC and UNESCO are good for classifying in generic domains (politics, culture, etc.)
     • IPTC: only the first and second levels work well; the third level is not efficient
     • The RTVE classification is good for classifying named entities and specific events
     Fiction
     • Less than 5% of the categories are relevant to classify the RTVE contents
     • IPTC is too specific to the news domain
     • UNESCO is good for classifying in generic domains (politics, culture, etc.)
     • The RTVE classification is good for classifying named entities and specific events
  15. Automated Classification / Entity Recognition: outcomes
     • Classification results are promising
     • Named-entity recognition results are good
     • Improve the taxonomies to increase the precision of the automatic classification
     • Adapt the classification systems to the type of content
     • News: scene segmentation is necessary
     • Fiction: classification is more difficult with narrative texts; specific styles require a deeper linguistic analysis
  16. Video-to-text benchmarking
     Goals
     • State of the art of video-to-text solutions
     • Neural Talk and Walk: https://vimeo.com/146492001
     Partner
     • Beeva
  17. Video-to-text: first iteration
     • Corpus: 10 videos + locators (Salón del Manga de Barcelona)
     • Pre-trained model: NeuralTalk
     • Accuracy nearly 21%
     • Errors derived from the model
     • The dataset and the scope must be expanded
     Sample captions: "a group of people standing in a line", "a woman sitting at a table with a laptop"
  18. Video-to-text: second iteration
     • Corpus: 20 videos + locators (general scope)
     • Accuracy rate between 0% and 32%
     • Good results for scenes with people and objects
     • Worse results for landscapes and scenes with no people or objects
     • A new model must be trained
     Sample captions: "a man is standing in a field with a mountain in the background", "a close up of an orange and a banana"
  19. Video-to-text: third iteration
     • Corpus: 360 videos + locators (general scope)
     • Accuracy 0%
     • Lack of synchrony between images and locators
     • High subjectivity in the image descriptors
     • Unknown words in the output
     Sample output: "statements of man UNK director of the UNK UNK of UNK"
  20. Video-to-text: outcomes
     • Promising results; likely better in 2 or 3 years
     • Specialized models should obtain better results
     • Video-to-text is not useful for interviews
     • Challenges: ambiguity of natural language; locators are not useful for training the model
     (The deck does not state how accuracy was scored; a simple overlap check is sketched below.)
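     One deliberately simple way to score a generated caption against an archive locator, sketched below, is the share of locator words that also appear in the caption; it also illustrates why subjective, unsynchronized locators drive the score towards 0%. The example pair is made up and this is not necessarily the metric used in the benchmark.

         def caption_overlap(generated: str, locator: str) -> float:
             """Share of locator words that also appear in the generated caption (0..1)."""
             gen = set(generated.lower().split())
             loc = set(locator.lower().split())
             return len(gen & loc) / len(loc) if loc else 0.0

         # An automatic caption vs. a subjective archive locator (illustrative pair).
         print(caption_overlap("a group of people standing in a line",
                               "cosplay contest at the manga fair"))  # 0.0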
  21. What we learned
     It is possible to define a data model for RTVE by:
     • Identifying the metadata needs of every area
     • Mapping them to standards such as EBUCore, Schema.org or DCMI
     Internal workflows must be improved (especially the data life cycle).
     General vocabularies are not always suitable for specific needs.
     The technology is affordable and adaptable to RTVE's needs.
  22. Ongoing: XXIII Manga Fair Barcelona 2017
     Workflow: video cataloguing and web analysis → SKOS vocabulary development → schemas and ontologies analysis → cataloguing model mapping → dataset publishing → semantic tagging → dataset enrichment
     Standards and ontologies: EBUCore, Schema.org, ma-ont
     Tools and sources: Protégé, ARCA, Fuseki, Skosmos, CMS, RTVEPedia, DBpedia, Geonames
     (A minimal SPARQL query against such a Fuseki endpoint is sketched below.)
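     The deck names Fuseki and Skosmos but shows no queries. As a sketch, assuming a Fuseki dataset exposed at a hypothetical local endpoint, the published SKOS concepts could be listed with a standard SPARQL SELECT over HTTP:

         import requests

         # Hypothetical Fuseki SPARQL endpoint; the dataset name is an assumption.
         ENDPOINT = "http://localhost:3030/mangafair/sparql"

         QUERY = """
         PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
         SELECT ?concept ?label WHERE {
           ?concept a skos:Concept ;
                    skos:prefLabel ?label .
           FILTER (lang(?label) = "es")
         } LIMIT 10
         """

         resp = requests.post(
             ENDPOINT,
             data={"query": QUERY},
             headers={"Accept": "application/sparql-results+json"},
             timeout=10,
         )
         resp.raise_for_status()
         for row in resp.json()["results"]["bindings"]:
             print(row["concept"]["value"], "-", row["label"]["value"])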
  23. Ongoing: XXIII Manga Fair Barcelona 2017 — Semantic Web for the Salón del Manga
     http://salonmanga.rtve.es/rtvepedia
     Pipeline: web contents → mapping of equivalences (PMETA-SCHEMA-CMS-EBUCORE, CSV) → linking via the Skosmos REST API → JSON-LD (schema.org / ebucore) linked with the vocabularies
     (A minimal JSON-LD sketch follows this slide.)
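     The published JSON-LD is not reproduced in the deck. The snippet below is a minimal sketch of what a clip description combining schema.org and EBUCore terms might look like, serialized with plain Python; the identifiers and property selection are illustrative, not the live rtvepedia output.

         import json

         # Illustrative JSON-LD for one clip, mixing schema.org and EBUCore terms.
         clip = {
             "@context": {
                 "schema": "http://schema.org/",
                 "ebucore": "http://www.ebu.ch/metadata/ontologies/ebucore/ebucore#",
             },
             "@id": "http://salonmanga.rtve.es/rtvepedia/clip/0001",  # hypothetical IRI
             "@type": ["schema:VideoObject", "ebucore:MediaResource"],
             "schema:name": "Salón del Manga de Barcelona 2017 - apertura",
             "schema:inLanguage": "es",
             "schema:about": {"@id": "http://example.org/rtve/vocab/manga"},  # SKOS concept
             "ebucore:duration": "PT2M30S",
         }
         print(json.dumps(clip, ensure_ascii=False, indent=2))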
  24. Ongoing
     Technological partnership to:
     • Test the data model defined for the Salón del Manga de Barcelona (VSN)
     • Automatic metadata extraction
     Make the Manga Fair website a reference in semantics and linked data.
     Reuse of data through Creative Commons licenses.
     Increase the number of IRI references in order to facilitate the linking of resources from external sources (DBpedia, Geonames, Wikidata, etc.); a linking sketch follows this slide.
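     As a sketch of that linking step, assuming Python with rdflib and a purely illustrative local namespace, a vocabulary concept can point to its external counterparts with skos:exactMatch (or owl:sameAs for instance data); the external IRIs below are believed-correct DBpedia, Wikidata and Geonames identifiers, but verify them before reuse.

         from rdflib import Graph, Namespace, URIRef
         from rdflib.namespace import OWL, SKOS

         EX = Namespace("http://example.org/rtve/vocab/")  # hypothetical local namespace

         g = Graph()
         g.bind("skos", SKOS)
         g.bind("owl", OWL)

         # Link the local "manga" concept to external references.
         manga = EX["manga"]
         g.add((manga, SKOS.exactMatch, URIRef("http://dbpedia.org/resource/Manga")))
         g.add((manga, SKOS.exactMatch, URIRef("http://www.wikidata.org/entity/Q8274")))

         # For instance data (e.g. the city of Barcelona), owl:sameAs can be used.
         barcelona = EX["barcelona"]
         g.add((barcelona, OWL.sameAs, URIRef("http://sws.geonames.org/3128760/")))

         print(g.serialize(format="turtle"))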
  25. 25. Thanks for your attention... … and now... Questions? Suggestions? Criticisms? Congratulations?
