MediaMosa Transcripting Technology Scouting Project and Proof of Concept
Presentation at TF-Media meeting in Porto, Portugal, 28 October 2011
Presenter: Frans Ward, SURFnet
MediaMosa Transcripting Technology Project
1. Frans Ward
Technical Product Manager
SURFnet Advanced Services
MediaMosa Transcripting Technology Scouting Project and Proof of Concept
MediaMosa @ 5th TF-Media Workshop
Porto, October 26, 2011 - SURFnet. We make innovation work
Friday, October 28, 11
10. MEDIAMOSA TRANSCRIPTING TECHNOLOGY
Disclosure of audiovisual archives
• The number of AV archives on the Internet is increasing rapidly.
• Archiving is not enough: disclosure and reuse are required!
• Adding metadata is the key component here.
• Speech technology is needed to reduce human effort.
Image: UK National Film and Television Archive, Berkhamsted (http://www.flickr.com/people/footage/)
11. Adding metadata, the traditional approach:
Manual annotation
A huge amount of work, with no time-coded relation to the video
12. Adding metadata, the new approach:
Using speech-to-text technology for metadata generation
• Audio Extraction
• Speech Recognition (Speech-to-Text)
• Time-coded Transcript
• Indexing and Search: search on fragment level
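The last two steps above (time-coded transcript, then search on fragment level) can be illustrated with a small sketch. It assumes the recognizer delivers each word with start and end times in seconds; the `Word` type, the `build_index` and `search` helpers, and the sample words are hypothetical and are not taken from MediaMosa or any recognizer.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Word:
    """One recognized word with its position in the recording (seconds)."""
    text: str
    start: float
    end: float


def build_index(words):
    """Map each lowercased word to the start times at which it is spoken."""
    index = defaultdict(list)
    for word in words:
        index[word.text.lower()].append(word.start)
    return index


def search(index, term):
    """Return the start times (seconds) of every fragment containing the term."""
    return sorted(index.get(term.lower(), []))


# Hypothetical recognizer output for two short lecture fragments.
transcript = [
    Word("Speech", 12.0, 12.4),
    Word("recognition", 12.4, 13.1),
    Word("produces", 13.1, 13.6),
    Word("metadata", 13.6, 14.2),
    Word("Metadata", 95.0, 95.7),
    Word("enables", 95.7, 96.1),
    Word("search", 96.1, 96.6),
]

index = build_index(transcript)
print(search(index, "metadata"))  # [13.6, 95.0]
```

A player could use the returned offsets to jump straight to the matching fragment, which manual annotation without time codes cannot offer.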
13. MEDIAMOSA TRANSCRIPTING TECHNOLOGY
• Transcripting: conversion of speech into an electronic text document.
• Automatic Speech Recognition (ASR) seems to be the ideal technology for this.
• It can be combined with Optical Character Recognition (OCR) of slides.
• Goal: to provide additional metadata for searching in video and lecture recordings.
14. MEDIAMOSA TRANSCRIPTING TECHNOLOGY
The Technology Scout Project. The process is complex...
15. MEDIAMOSA TRANSCRIPTING TECHNOLOGY SCOUTING PROJECT
Lecture Recording (End User Application)
• Recording of Teacher
• Recording of Slides
• Reference material
Transcription by Spraak / CMU Sphinx
• Recognize the speech
• Produce time-coded transcript
MediaMosa
• Transcode into audio
• Store all into an asset
Multi-Source Player
• Enhanced search
• Optional subtitles
• Mashup info
Partners: (logos shown on slide)
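The "transcode into audio" step in the workflow above could look roughly like the sketch below. It assumes the ffmpeg command-line tool is installed and that the recognizer accepts 16 kHz mono 16-bit PCM WAV, a common input format for speech recognizers. The slides do not say which tool or audio settings MediaMosa uses, so the function names, file names, and sample rate are assumptions for illustration only.

```python
import subprocess


def ffmpeg_audio_command(video_path, wav_path, sample_rate=16000):
    """Build an ffmpeg command that extracts mono 16-bit PCM audio."""
    return [
        "ffmpeg",
        "-i", video_path,         # input recording
        "-vn",                    # drop the video stream
        "-ac", "1",               # downmix to a single channel
        "-ar", str(sample_rate),  # resample (16 kHz is an assumption)
        "-acodec", "pcm_s16le",   # 16-bit little-endian PCM
        wav_path,
    ]


def extract_audio(video_path, wav_path):
    """Run the extraction; requires ffmpeg to be available on PATH."""
    subprocess.run(ffmpeg_audio_command(video_path, wav_path), check=True)


# Hypothetical file names; this only prints the command, it does not run it.
print(" ".join(ffmpeg_audio_command("lecture.mp4", "lecture.wav")))
```

Keeping the command construction separate from its execution makes the settings easy to inspect and adjust before a batch of recordings is processed.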
16. MEDIAMOSA TRANSCRIPTING PROJECT
17. MEDIAMOSA TRANSCRIPTING PROJECT
18. MEDIAMOSA TRANSCRIPTING PROJECT
Subtitles:
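Subtitles can be derived directly from a time-coded transcript. The sketch below renders transcript segments in SubRip (SRT), one widely used subtitle format. The slides do not state which subtitle format the player uses, so the format choice, the helper names, and the sample segments are assumptions.

```python
def srt_timestamp(seconds):
    """Format a time in seconds as an SRT timestamp, HH:MM:SS,mmm."""
    total_ms = int(round(seconds * 1000))
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, millis = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def to_srt(segments):
    """Render (start, end, text) segments as SubRip text."""
    blocks = []
    for number, (start, end, text) in enumerate(segments, start=1):
        blocks.append(
            f"{number}\n{srt_timestamp(start)} --> {srt_timestamp(end)}\n{text}\n"
        )
    # Each block ends with a newline, so joining adds the blank separator line.
    return "\n".join(blocks)


# Hypothetical segments taken from a time-coded transcript.
segments = [
    (0.0, 2.5, "Welcome to this lecture."),
    (2.5, 6.04, "Today we look at speech recognition."),
]
print(to_srt(segments))
```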
19. MediaMosa 3.5
Focus on transcription technology (speech-to-text) and flexible workflows
• Development has started
• Beta release available: December 2011
20. MediaMosa Directions / Q&A
WWW: http://mediamosa.org
Online Demo: http://demo.mediamosa.org
Forum: http://mediamosa.org/forum
Issue Tracker: http://mediamosa.org/trac
Source Code: https://github.com/mediamosa
Slideshare: http://www.slideshare.net/MediaMosa
Twitter: http://twitter.com/mediamosa
Thanks for your attention!