GTTS System for the Spoken Web Search Task at MediaEval 2012
1. GTTS System for the Spoken Web Search Task at MediaEval 2012
Amparo Varona, Mikel Penagarikano, Luis Javier Rodríguez Fuentes,
Germán Bordel, Mireia Diez
University of the Basque Country UPV/EHU
luisjavier.rodriguez@ehu.es
http://gtts.ehu.es
MediaEval 2012 - SWS Task - GTTS System - Pisa, October 4, 2012
2. HEARCH: Search on Broadcast News
(ASR + Lemmatization + Index)
3. HEARCH-P: Search on Parliamentary Sessions
(Audio-Text Alignment + Index)
4. Search on spoken resources on the Internet
• Searching for text queries (computer)
• Searching for spoken queries (mobile)
• Need for a common representation:
• Acoustic (DTW-like approaches)
• Phonetic (Search on Phone-Lattices)
• Word-level (ASR-based approaches)
5. SWS at MediaEval 2012
(for GTTS)
• Opportunity to enter the field of unrestricted search on spoken resources
• Opportunity to access development and evaluation data
• Opportunity to learn state-of-the-art from experts in the field
• Our approach: search of n-best phonetic representations of the queries on phone lattices of spoken resources
6. GTTS System: how it works
• BUT decoders for Czech, Hungarian and Russian
• Reduced sets of phonetic classes (IPA clusters)
• Approximate string matching (up to n edits allowed):
Dong Wang’s Lattice2Multigram tool
• Scores: length-normalized + kind of log-likelihood ratio
with regard to all the detections in the same audio file
• Overlapping detections: only the most likely one is retained
• For each query: K most likely detections, z-normalization
and threshold applied
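
The last steps above (overlap removal, K-best selection, z-normalization and thresholding) can be sketched as follows. This is only an illustration under stated assumptions, not the GTTS implementation: the `Detection` record, the function names, the default values of `k` and `threshold`, and the choice to resolve overlaps per query and per audio file are all hypothetical. The input scores are assumed to be already length-normalized, and the log-likelihood-ratio step is not reproduced.

```python
from collections import defaultdict
from dataclasses import dataclass
from statistics import mean, pstdev


@dataclass(frozen=True)
class Detection:
    """Hypothetical detection record (not the system's actual format)."""
    query: str
    audio: str
    start: float  # seconds
    end: float    # seconds
    score: float  # higher means more likely


def drop_overlaps(dets):
    """Among time-overlapping detections of the same query in the same
    audio file, keep only the highest-scoring one (assumed policy)."""
    kept = []
    for d in sorted(dets, key=lambda d: d.score, reverse=True):
        clash = any(
            k.query == d.query and k.audio == d.audio
            and d.start < k.end and k.start < d.end
            for k in kept
        )
        if not clash:
            kept.append(d)
    return kept


def postprocess(dets, k=50, threshold=0.0):
    """Per query: keep the K best detections, z-normalize their scores,
    and return (detection, z_score) pairs that reach the threshold."""
    by_query = defaultdict(list)
    for d in drop_overlaps(dets):
        by_query[d.query].append(d)

    out = []
    for group in by_query.values():
        top = sorted(group, key=lambda d: d.score, reverse=True)[:k]
        scores = [d.score for d in top]
        mu = mean(scores)
        sigma = pstdev(scores) or 1.0  # guard against zero variance
        for d in top:
            z = (d.score - mu) / sigma
            if z >= threshold:
                out.append((d, z))
    return out
```

Z-normalizing per query makes scores comparable across queries, so that a single global threshold can be applied to all of them.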
7. GTTS System: how it performs
• Best configurations determined in preliminary experiments
on the development dataset
• Primary: 3-best query phone decodings, up to 2 edits allowed
in matching
• Contrastive: 1-best query phone decoding, up to 2 edits
allowed in matching
• Poor performance!
• Change in the approach: searching for the best detection of
each query in each audio document
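
The primary and contrastive configurations above differ only in how many query decodings are searched, each with a bounded number of edit operations. A minimal sketch of that idea is given below, under a strong simplification: the audio side is a flat phone sequence rather than a phone lattice, and the matching is a standard approximate substring search (Sellers' algorithm), not the Lattice2Multigram tool. The function names and the phone symbols are hypothetical.

```python
def best_match(query, text, max_edits=2):
    """Approximate substring search (Sellers' algorithm).

    `query` and `text` are sequences of phone symbols. Returns
    (edits, end) for the cheapest match of `query` inside `text`,
    where `end` is the number of text phones consumed up to the end
    of the match, or None if more than `max_edits` edits are needed.
    """
    # Row 0 is all zeros: a match may start anywhere in the text.
    prev = [0] * (len(text) + 1)
    for i, q in enumerate(query, start=1):
        curr = [i] + [0] * len(text)
        for j, t in enumerate(text, start=1):
            curr[j] = min(
                prev[j - 1] + (q != t),  # match or substitution
                prev[j] + 1,             # query phone missing in text
                curr[j - 1] + 1,         # extra phone in text
            )
        prev = curr
    edits, end = min((cost, j) for j, cost in enumerate(prev))
    return (edits, end) if edits <= max_edits else None


def search_nbest(decodings, text, max_edits=2):
    """Search every n-best query decoding and keep the cheapest match."""
    hits = [m for m in (best_match(d, text, max_edits) for d in decodings) if m]
    return min(hits) if hits else None
```

For example, with a hypothetical audio phone string `s a l a m a n k a`, the query decoding `m a n g a` is found with one substitution, whereas a decoding such as `p o t a t o` is rejected because it would need more than two edits.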