The L2F Spoken Web Search system for Mediaeval 2012

Alberto Abad and Ramón F. Astudillo
L2F - Spoken Language Systems Lab
INESC-ID Lisboa, Portugal
alberto@l2f.inesc-id.pt

Mediaeval 2012 Workshop
Pisa, October 4, 2012
  
   The L2F SWS system for Mediaeval 2012                                 Pisa, October 4, 2012   1	
  
Introduction

  In this first L2F/INESC-ID participation in SWS, our main objectives were:
      •  To learn
      •  To have fun
      •  To build a reasonable system (in an unreasonably short amount of time)

  The submitted L2F SWS system exploits hybrid ANN/HMM connectionist methods for both query tokenization and acoustic keyword search
      •  Composed of the fusion of 4 phonetic-based SWS sub-systems
           o  Based on our in-house ASR system named AUDIMUS
      •  Each sub-system uses different language-dependent acoustic models:
           o  European Portuguese (pt), Brazilian Portuguese (br), European Spanish (es) and American English (en)
      •  Different detection score normalization and fusion methods were investigated
           o  The submitted system applies per-query score normalization (Q-norm) and majority voting (MV) fusion
  


The baseline speech recognizer

  AUDIMUS is our in-house hybrid HMM/MLP speech recognizer

     •  Feature extraction: multi-stream 26 PLP, 26 logRASTA-PLP, 28 MSG and 39 ETSI (only 8 kHz)
     •  MLP: several context input frames (13-15), 2 hidden layers (500 units) and 1 output layer
          •  Output layer size: 39 for pt, 40 for br, 30 for es, 41 for en
          •  Data: pt 115 hours (57 BN + 58 telephone); br 13 hours of BN data; es 57 hours (36 BN + 21 telephone); en 142 hours of BN data (HUB-4 96 & 97)
     •  HMM topology: single-state phonemes (3 frames minimum duration)
     •  Decoder: uses a Weighted Finite-State Transducer (WFST) approach

Spoken Query Tokenization

  For each sub-system, obtain a phonetic tokenization of the queries
      •  Similar to LID parallel phonotactic approaches

  Use a phone-loop grammar with a phoneme minimum duration of 3 frames to obtain the 1-best phoneme chain
      •  Alternative n-best hypotheses for characterizing each query were explored with unsatisfactory results
           o  Other possibilities not explored (yet): lattice, CN, etc.
      •  Influence of the word insertion penalty (wip) parameter on the tokenization result
  




Spoken Query Search

  Spoken query search based on AKWS with our hybrid speech recognizer.
  Search window of 5 seconds (2.5 seconds time shift)
      •  Converts the problem into a verification task (originally for with forced alignment)
      •  Convenient for fusion purposes
  Equally-likely 1-gram LM with target Query and Background words:
      •  Query word
           o  Described by the phonetic units obtained in the previous tokenization stage
      •  Background word
           o  Described by the special phonetic class background/filler
           o  Minimum duration set to 250 ms
  The detection score of each detected candidate is computed as the average of the phonetic log-likelihood ratios of the query term
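
The windowing and scoring above can be sketched as follows. Both functions are illustrative assumptions: the slide gives only the window length, the shift and the phrase "average phonetic log-likelihood ratios", so the handling of the last window and the per-phone inputs (`query_llk`, `background_llk`) are hypothetical choices.

```python
def search_windows(duration, win=5.0, shift=2.5):
    """Return (start, end) times in seconds of overlapping search windows."""
    windows = []
    start = 0.0
    while True:
        end = min(start + win, duration)
        windows.append((start, end))
        if end >= duration:
            break
        start += shift
    return windows

def detection_score(query_llk, background_llk):
    """Average log-likelihood ratio over the phones of a detected query.

    query_llk[i] and background_llk[i] are the log-likelihoods of the
    i-th query phone segment under the query phone model and under the
    background model, respectively (assumed inputs).
    """
    ratios = [q - b for q, b in zip(query_llk, background_llk)]
    return sum(ratios) / len(ratios)
```

Because every file is cut into the same fixed windows for all sub-systems, a candidate can be identified by its query, file and window index, which is what makes later fusion straightforward.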
  	
  



Spoken Query Search
BG modelling with HMM/MLP

  Possible approaches
      •  Re-train MLP ✖
      •  Compute the posterior probability of the background class depending on the other classes ✔
           o  Mean probability of the top-N most likely outputs
  For the SWS system,
      •  We compute the average in the likelihood domain (top-6)
           o  The decoder operates in the likelihood domain, so there is no need for an ad hoc estimation of the BG class prior
      •  We use a background scale term β (exponential in the likelihood domain) to control the weight of the BG model vs. Query
      •  This β scale, together with the wip term, strongly affects search results
           o  Adjusted following a non-exhaustive greedy search



Score normalization, fusion and calibration

  Score normalization schemes explored:
     •  Q-norm: assume that the scores depend on the queries and apply a per-query normalization
     •  F-norm: assume that the scores depend on the data file (of the collection) and apply a per-file normalization
     •  Combinations (QF-norm, FQ-norm)
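
The slides do not say which statistic the normalization uses. A minimal sketch, assuming a zero-mean, unit-variance normalization within each group and assuming that QF-norm means Q-norm followed by F-norm (the detection dictionaries and function names are hypothetical):

```python
from collections import defaultdict
from statistics import mean, pstdev

def group_norm(detections, key):
    """Z-normalize scores within each group of detections.

    detections: list of dicts with 'query', 'file' and 'score' entries.
    key: 'query' for Q-norm, 'file' for F-norm.
    """
    groups = defaultdict(list)
    for d in detections:
        groups[d[key]].append(d["score"])
    stats = {g: (mean(s), pstdev(s)) for g, s in groups.items()}
    out = []
    for d in detections:
        mu, sigma = stats[d[key]]
        # A group with a single score (or identical scores) maps to 0.
        z = (d["score"] - mu) / sigma if sigma > 0 else 0.0
        out.append({**d, "score": z})
    return out

def q_norm(detections):
    return group_norm(detections, "query")

def f_norm(detections):
    return group_norm(detections, "file")

def qf_norm(detections):
    return f_norm(q_norm(detections))
```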
  

  Fusion schemes explored:
     •  Candidate detections from the 4 parallel sub-systems are kept (or rejected) according to simple combination rules:
          o  AND (all), OR (at least 1 sys) and MV (at least 2 sys)
     •  Final detection score is the mean score
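
The combination rules above can be sketched as a vote count per candidate. This assumes each sub-system reports its candidates under a shared key such as (query, file, window index), and that the mean is taken over the sub-systems that detected the candidate; both points are assumptions, as the slides do not spell them out.

```python
from collections import defaultdict

def fuse(system_detections, rule="MV"):
    """Fuse candidate detections from several sub-systems.

    system_detections: one dict per sub-system, mapping a candidate key
    to that sub-system's (normalized) score.
    rule: 'AND' (all systems), 'OR' (at least 1) or 'MV' (at least 2).
    Returns a dict mapping each kept candidate to its mean score.
    """
    needed = {"AND": len(system_detections), "OR": 1, "MV": 2}[rule]
    votes = defaultdict(list)
    for dets in system_detections:
        for key, score in dets.items():
            votes[key].append(score)
    return {k: sum(s) / len(s) for k, s in votes.items() if len(s) >= needed}
```

For example, with four sub-systems a candidate found by only one of them survives under OR but is rejected under MV and AND.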
  	
  

  Decision threshold set according to maxATWV in dev-dev
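
Choosing the threshold that maximizes the term-weighted value can be sketched as a sweep over candidate scores. The code uses the usual definition TWV = 1 - mean over terms of (P_miss + β · P_FA); the default β = 999.9 is the value associated with the NIST STD 2006 evaluation and is an assumption here, since the SWS 2012 task parameters may differ. The data layout and function names are hypothetical.

```python
def twv(detections, n_true, n_nontarget, threshold, beta=999.9):
    """Term-weighted value at a given decision threshold.

    detections: dict term -> list of (score, is_hit) candidates.
    n_true: dict term -> number of reference occurrences.
    n_nontarget: number of non-target trials per term.
    Terms without reference occurrences are skipped.
    """
    terms = [t for t in n_true if n_true[t] > 0]
    total = 0.0
    for t in terms:
        kept = [hit for score, hit in detections.get(t, []) if score >= threshold]
        n_hit = sum(kept)
        n_fa = len(kept) - n_hit
        p_miss = 1.0 - n_hit / n_true[t]
        p_fa = n_fa / n_nontarget
        total += p_miss + beta * p_fa
    return 1.0 - total / len(terms)

def best_threshold(detections, n_true, n_nontarget, beta=999.9):
    """Return the candidate score that maximizes TWV on this data."""
    candidates = sorted({s for dets in detections.values() for s, _ in dets})
    return max(candidates, key=lambda th: twv(detections, n_true, n_nontarget, th, beta))
```

The threshold found this way on development data is then applied unchanged to the evaluation data, which is what distinguishes the actual ATWV from the maximum one.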
  

Development experiments
Sub-systems performance
  




Development experiments
Score normalization strategies
  




Development experiments
Fusion strategies (after Q-norm)
  




Official L2F SWS2012 results
  




[Figure: three combined DET plots (miss probability in % vs. false alarm probability in %) for the run p-phonetic4_fusion_mv, each shown against random performance. dev-eval: maximum term-weighted value 0.486 at score -0.135; eval-dev: maximum term-weighted value 0.633 at score -0.435; eval-eval: maximum term-weighted value 0.523 at score -0.362.]
  

Conclusions

  The L2F Spoken Web Search system fully exploits hybrid ANN/HMM speech recognition:
      •  Query tokenization based on 1-best phonetic decoding
      •  Query search based on AKWS (with no need for AM re-training)
  The submitted system is formed by the fusion of four language-dependent sub-systems:
      •  Q-norm score normalization is applied to each individual sub-system
      •  Fusion is done following a majority voting strategy
  The system achieves an actual ATWV score of 0.5195 on eval-eval
      •  Promising given the simplicity of the proposed system
      •  Robust to query and collection sets
            o  Best performance is achieved in a mismatched condition!
      •  Reasonably well-calibrated
  Future work: focused on improved query tokenization and fusion with other types of approaches (DTW)
  	
  


L2F - Spoken Language Systems Laboratory
Instituto de Engenharia de Sistemas e Computadores Investigação e Desenvolvimento em Lisboa
Más contenido relacionado

La actualidad más candente

Declarative Experimentation in Information Retrieval using PyTerrier
Declarative Experimentation in Information Retrieval using PyTerrierDeclarative Experimentation in Information Retrieval using PyTerrier
Declarative Experimentation in Information Retrieval using PyTerrierCrai Macdonald
 
Bandwidth measurement
Bandwidth measurementBandwidth measurement
Bandwidth measurementjeromy fu
 
Non autoregressive neural text-to-speech review
Non autoregressive neural text-to-speech reviewNon autoregressive neural text-to-speech review
Non autoregressive neural text-to-speech reviewJune-Woo Kim
 
Reliability Improvement For An Rfid Based Psychiatric Patient Localization
Reliability Improvement For An Rfid Based Psychiatric Patient LocalizationReliability Improvement For An Rfid Based Psychiatric Patient Localization
Reliability Improvement For An Rfid Based Psychiatric Patient Localizationwacerone
 
CUHK System for the Spoken Web Search task at Mediaeval 2012
CUHK System for the Spoken Web Search task at Mediaeval 2012CUHK System for the Spoken Web Search task at Mediaeval 2012
CUHK System for the Spoken Web Search task at Mediaeval 2012MediaEval2012
 
1909 BERT: why-and-how (CODE SEMINAR)
1909 BERT: why-and-how (CODE SEMINAR)1909 BERT: why-and-how (CODE SEMINAR)
1909 BERT: why-and-how (CODE SEMINAR)WarNik Chow
 
Video lectures for mba
Video lectures for mbaVideo lectures for mba
Video lectures for mbaEdhole.com
 
口試投影片(詹智傑) Final
口試投影片(詹智傑) Final口試投影片(詹智傑) Final
口試投影片(詹智傑) Final詹智傑
 
Performance analysis and implementation for nonbinary quasi cyclic ldpc decod...
Performance analysis and implementation for nonbinary quasi cyclic ldpc decod...Performance analysis and implementation for nonbinary quasi cyclic ldpc decod...
Performance analysis and implementation for nonbinary quasi cyclic ldpc decod...ijwmn
 
Workshop NGS data analysis - 3
Workshop NGS data analysis - 3Workshop NGS data analysis - 3
Workshop NGS data analysis - 3Maté Ongenaert
 
Gaweł mikołajczyk. i pv6 insecurities at first hop
Gaweł mikołajczyk. i pv6 insecurities at first hopGaweł mikołajczyk. i pv6 insecurities at first hop
Gaweł mikołajczyk. i pv6 insecurities at first hopYury Chemerkin
 
Stefano Giordano
Stefano GiordanoStefano Giordano
Stefano GiordanoGoWireless
 
FPGA Based Power Efficient Chanalizer For Software Defined Radio
FPGA Based Power Efficient Chanalizer For Software Defined RadioFPGA Based Power Efficient Chanalizer For Software Defined Radio
FPGA Based Power Efficient Chanalizer For Software Defined RadioIJMER
 
Sequence Learning with CTC technique
Sequence Learning with CTC techniqueSequence Learning with CTC technique
Sequence Learning with CTC techniqueChun Hao Wang
 

La actualidad más candente (20)

Declarative Experimentation in Information Retrieval using PyTerrier
Declarative Experimentation in Information Retrieval using PyTerrierDeclarative Experimentation in Information Retrieval using PyTerrier
Declarative Experimentation in Information Retrieval using PyTerrier
 
Dq31784792
Dq31784792Dq31784792
Dq31784792
 
Bandwidth measurement
Bandwidth measurementBandwidth measurement
Bandwidth measurement
 
Priority queuing
Priority queuing Priority queuing
Priority queuing
 
Non autoregressive neural text-to-speech review
Non autoregressive neural text-to-speech reviewNon autoregressive neural text-to-speech review
Non autoregressive neural text-to-speech review
 
Reliability Improvement For An Rfid Based Psychiatric Patient Localization
Reliability Improvement For An Rfid Based Psychiatric Patient LocalizationReliability Improvement For An Rfid Based Psychiatric Patient Localization
Reliability Improvement For An Rfid Based Psychiatric Patient Localization
 
CUHK System for the Spoken Web Search task at Mediaeval 2012
CUHK System for the Spoken Web Search task at Mediaeval 2012CUHK System for the Spoken Web Search task at Mediaeval 2012
CUHK System for the Spoken Web Search task at Mediaeval 2012
 
1909 BERT: why-and-how (CODE SEMINAR)
1909 BERT: why-and-how (CODE SEMINAR)1909 BERT: why-and-how (CODE SEMINAR)
1909 BERT: why-and-how (CODE SEMINAR)
 
Speaker Segmentation (2006)
Speaker Segmentation (2006)Speaker Segmentation (2006)
Speaker Segmentation (2006)
 
Video lectures for mba
Video lectures for mbaVideo lectures for mba
Video lectures for mba
 
Parallel computing(1)
Parallel computing(1)Parallel computing(1)
Parallel computing(1)
 
口試投影片(詹智傑) Final
口試投影片(詹智傑) Final口試投影片(詹智傑) Final
口試投影片(詹智傑) Final
 
Performance analysis and implementation for nonbinary quasi cyclic ldpc decod...
Performance analysis and implementation for nonbinary quasi cyclic ldpc decod...Performance analysis and implementation for nonbinary quasi cyclic ldpc decod...
Performance analysis and implementation for nonbinary quasi cyclic ldpc decod...
 
Workshop NGS data analysis - 3
Workshop NGS data analysis - 3Workshop NGS data analysis - 3
Workshop NGS data analysis - 3
 
Gaweł mikołajczyk. i pv6 insecurities at first hop
Gaweł mikołajczyk. i pv6 insecurities at first hopGaweł mikołajczyk. i pv6 insecurities at first hop
Gaweł mikołajczyk. i pv6 insecurities at first hop
 
Stefano Giordano
Stefano GiordanoStefano Giordano
Stefano Giordano
 
FPGA Based Power Efficient Chanalizer For Software Defined Radio
FPGA Based Power Efficient Chanalizer For Software Defined RadioFPGA Based Power Efficient Chanalizer For Software Defined Radio
FPGA Based Power Efficient Chanalizer For Software Defined Radio
 
encrption.PDF
encrption.PDFencrption.PDF
encrption.PDF
 
Sequence Learning with CTC technique
Sequence Learning with CTC techniqueSequence Learning with CTC technique
Sequence Learning with CTC technique
 
Gq2411921196
Gq2411921196Gq2411921196
Gq2411921196
 

Destacado

Activities for journalistic skills
Activities for journalistic skillsActivities for journalistic skills
Activities for journalistic skillsJNavarro0321
 
TUB @ MediaEval 2012 Tagging Task: Feature Selection Methods for Bag-of-(visu...
TUB @ MediaEval 2012 Tagging Task: Feature Selection Methods for Bag-of-(visu...TUB @ MediaEval 2012 Tagging Task: Feature Selection Methods for Bag-of-(visu...
TUB @ MediaEval 2012 Tagging Task: Feature Selection Methods for Bag-of-(visu...MediaEval2012
 
The TUM Cumulative DTW Approach for the Mediaeval 2012 Spoken Web Search Task
The TUM Cumulative DTW Approach for the Mediaeval 2012 Spoken Web Search TaskThe TUM Cumulative DTW Approach for the Mediaeval 2012 Spoken Web Search Task
The TUM Cumulative DTW Approach for the Mediaeval 2012 Spoken Web Search TaskMediaEval2012
 
TUKE MediaEval 2012: Spoken Web Search using DTW and Unsupervised SVM
TUKE MediaEval 2012: Spoken Web Search using DTW and Unsupervised SVMTUKE MediaEval 2012: Spoken Web Search using DTW and Unsupervised SVM
TUKE MediaEval 2012: Spoken Web Search using DTW and Unsupervised SVMMediaEval2012
 
TUD at MediaEval 2012 genre tagging task: Multi-modality video categorization...
TUD at MediaEval 2012 genre tagging task: Multi-modality video categorization...TUD at MediaEval 2012 genre tagging task: Multi-modality video categorization...
TUD at MediaEval 2012 genre tagging task: Multi-modality video categorization...MediaEval2012
 
6dicas– veda 4
6dicas– veda 46dicas– veda 4
6dicas– veda 4souzadea1
 
Como hacer una pagina web en wix sharon
Como hacer una pagina web en wix sharonComo hacer una pagina web en wix sharon
Como hacer una pagina web en wix sharonSharon Jimenez
 
Designinteração– veda 3
Designinteração– veda 3Designinteração– veda 3
Designinteração– veda 3souzadea1
 
The Shanghai-Hongkong Team at MediaEval2012: Violent Scene Detection Using Tr...
The Shanghai-Hongkong Team at MediaEval2012: Violent Scene Detection Using Tr...The Shanghai-Hongkong Team at MediaEval2012: Violent Scene Detection Using Tr...
The Shanghai-Hongkong Team at MediaEval2012: Violent Scene Detection Using Tr...MediaEval2012
 
14 10 21_презентация сту
14 10 21_презентация сту14 10 21_презентация сту
14 10 21_презентация стуStanislav Litvinenko
 
ARF @ MediaEval 2012: An Uninformed Approach to Violence Detection in Hollywo...
ARF @ MediaEval 2012: An Uninformed Approach to Violence Detection in Hollywo...ARF @ MediaEval 2012: An Uninformed Approach to Violence Detection in Hollywo...
ARF @ MediaEval 2012: An Uninformed Approach to Violence Detection in Hollywo...MediaEval2012
 
How Spatial Segmentation improves the Multimodal Geo-Tagging
How Spatial Segmentation improves the Multimodal Geo-TaggingHow Spatial Segmentation improves the Multimodal Geo-Tagging
How Spatial Segmentation improves the Multimodal Geo-TaggingMediaEval2012
 
MediaEval 2012 Visual Privacy Task: Applying Transform-domain Scrambling to A...
MediaEval 2012 Visual Privacy Task: Applying Transform-domain Scrambling to A...MediaEval 2012 Visual Privacy Task: Applying Transform-domain Scrambling to A...
MediaEval 2012 Visual Privacy Task: Applying Transform-domain Scrambling to A...MediaEval2012
 
KIT at MediaEval 2012 – Content–based Genre Classification with Visual Cues
KIT at MediaEval 2012 – Content–based Genre Classification with Visual CuesKIT at MediaEval 2012 – Content–based Genre Classification with Visual Cues
KIT at MediaEval 2012 – Content–based Genre Classification with Visual CuesMediaEval2012
 
NII, Japan at MediaEval 2012 Violent Scenes Detection Affect Task
NII, Japan at MediaEval 2012 Violent Scenes Detection Affect TaskNII, Japan at MediaEval 2012 Violent Scenes Detection Affect Task
NII, Japan at MediaEval 2012 Violent Scenes Detection Affect TaskMediaEval2012
 
Intro totransportphenomenanew
Intro totransportphenomenanewIntro totransportphenomenanew
Intro totransportphenomenanewilovepurin
 
Ghent and Cardiff University at the 2012 Placing Task
Ghent and Cardiff University at the 2012 Placing TaskGhent and Cardiff University at the 2012 Placing Task
Ghent and Cardiff University at the 2012 Placing TaskMediaEval2012
 
2010 Marketing Plan
2010 Marketing Plan2010 Marketing Plan
2010 Marketing PlanJPemberton15
 

Destacado (20)

Activities for journalistic skills
Activities for journalistic skillsActivities for journalistic skills
Activities for journalistic skills
 
TUB @ MediaEval 2012 Tagging Task: Feature Selection Methods for Bag-of-(visu...
TUB @ MediaEval 2012 Tagging Task: Feature Selection Methods for Bag-of-(visu...TUB @ MediaEval 2012 Tagging Task: Feature Selection Methods for Bag-of-(visu...
TUB @ MediaEval 2012 Tagging Task: Feature Selection Methods for Bag-of-(visu...
 
10 ρ. δρακουλησ
10 ρ. δρακουλησ10 ρ. δρακουλησ
10 ρ. δρακουλησ
 
The TUM Cumulative DTW Approach for the Mediaeval 2012 Spoken Web Search Task
The TUM Cumulative DTW Approach for the Mediaeval 2012 Spoken Web Search TaskThe TUM Cumulative DTW Approach for the Mediaeval 2012 Spoken Web Search Task
The TUM Cumulative DTW Approach for the Mediaeval 2012 Spoken Web Search Task
 
TUKE MediaEval 2012: Spoken Web Search using DTW and Unsupervised SVM
TUKE MediaEval 2012: Spoken Web Search using DTW and Unsupervised SVMTUKE MediaEval 2012: Spoken Web Search using DTW and Unsupervised SVM
TUKE MediaEval 2012: Spoken Web Search using DTW and Unsupervised SVM
 
TUD at MediaEval 2012 genre tagging task: Multi-modality video categorization...
TUD at MediaEval 2012 genre tagging task: Multi-modality video categorization...TUD at MediaEval 2012 genre tagging task: Multi-modality video categorization...
TUD at MediaEval 2012 genre tagging task: Multi-modality video categorization...
 
κειμενο
κειμενοκειμενο
κειμενο
 
6dicas– veda 4
6dicas– veda 46dicas– veda 4
6dicas– veda 4
 
Como hacer una pagina web en wix sharon
Como hacer una pagina web en wix sharonComo hacer una pagina web en wix sharon
Como hacer una pagina web en wix sharon
 
Designinteração– veda 3
Designinteração– veda 3Designinteração– veda 3
Designinteração– veda 3
 
The Shanghai-Hongkong Team at MediaEval2012: Violent Scene Detection Using Tr...
The Shanghai-Hongkong Team at MediaEval2012: Violent Scene Detection Using Tr...The Shanghai-Hongkong Team at MediaEval2012: Violent Scene Detection Using Tr...
The Shanghai-Hongkong Team at MediaEval2012: Violent Scene Detection Using Tr...
 
14 10 21_презентация сту
14 10 21_презентация сту14 10 21_презентация сту
14 10 21_презентация сту
 
ARF @ MediaEval 2012: An Uninformed Approach to Violence Detection in Hollywo...
ARF @ MediaEval 2012: An Uninformed Approach to Violence Detection in Hollywo...ARF @ MediaEval 2012: An Uninformed Approach to Violence Detection in Hollywo...
ARF @ MediaEval 2012: An Uninformed Approach to Violence Detection in Hollywo...
 
How Spatial Segmentation improves the Multimodal Geo-Tagging
How Spatial Segmentation improves the Multimodal Geo-TaggingHow Spatial Segmentation improves the Multimodal Geo-Tagging
How Spatial Segmentation improves the Multimodal Geo-Tagging
 
MediaEval 2012 Visual Privacy Task: Applying Transform-domain Scrambling to A...
MediaEval 2012 Visual Privacy Task: Applying Transform-domain Scrambling to A...MediaEval 2012 Visual Privacy Task: Applying Transform-domain Scrambling to A...
MediaEval 2012 Visual Privacy Task: Applying Transform-domain Scrambling to A...
 
KIT at MediaEval 2012 – Content–based Genre Classification with Visual Cues
KIT at MediaEval 2012 – Content–based Genre Classification with Visual CuesKIT at MediaEval 2012 – Content–based Genre Classification with Visual Cues
KIT at MediaEval 2012 – Content–based Genre Classification with Visual Cues
 
NII, Japan at MediaEval 2012 Violent Scenes Detection Affect Task
NII, Japan at MediaEval 2012 Violent Scenes Detection Affect TaskNII, Japan at MediaEval 2012 Violent Scenes Detection Affect Task
NII, Japan at MediaEval 2012 Violent Scenes Detection Affect Task
 
Intro totransportphenomenanew
Intro totransportphenomenanewIntro totransportphenomenanew
Intro totransportphenomenanew
 
Ghent and Cardiff University at the 2012 Placing Task
Ghent and Cardiff University at the 2012 Placing TaskGhent and Cardiff University at the 2012 Placing Task
Ghent and Cardiff University at the 2012 Placing Task
 
2010 Marketing Plan
2010 Marketing Plan2010 Marketing Plan
2010 Marketing Plan
 

Similar a The L2F Spoken Web Search system for Mediaeval 2012

VOICE PASSWORD BASED SPEAKER VERIFICATION SYSTEM USING VOWEL AND NON VOWEL RE...
VOICE PASSWORD BASED SPEAKER VERIFICATION SYSTEM USING VOWEL AND NON VOWEL RE...VOICE PASSWORD BASED SPEAKER VERIFICATION SYSTEM USING VOWEL AND NON VOWEL RE...
VOICE PASSWORD BASED SPEAKER VERIFICATION SYSTEM USING VOWEL AND NON VOWEL RE...niranjan kumar
 
Speech recognition final
Speech recognition finalSpeech recognition final
Speech recognition finalArchit Vora
 
Training Distributed Deep Recurrent Neural Networks with Mixed Precision on G...
Training Distributed Deep Recurrent Neural Networks with Mixed Precision on G...Training Distributed Deep Recurrent Neural Networks with Mixed Precision on G...
Training Distributed Deep Recurrent Neural Networks with Mixed Precision on G...Databricks
 
Design & implementation of machine learning algorithm in (2)
The L2F Spoken Web Search system for Mediaeval 2012

  • 1. The L2F Spoken Web Search system for Mediaeval 2012
    Alberto Abad and Ramón F. Astudillo
    L2F - Spoken Language Systems Lab, INESC-ID Lisboa, Portugal
    alberto@l2f.inesc-id.pt
    Mediaeval 2012 Workshop, Pisa, October 4, 2012
    The L2F SWS system for Mediaeval 2012, Pisa, October 4, 2012
  • 2. Introduction
    In this first L2F/INESC-ID participation in SWS, our main objectives were:
      •  To learn
      •  To have fun
      •  To build a reasonable system (in an unreasonably short amount of time)
    The submitted L2F SWS system exploits hybrid ANN/HMM connectionist methods for both query tokenization and acoustic keyword search
      •  Composed of the fusion of 4 phonetic-based SWS sub-systems
           o  Based on our in-house ASR system named AUDIMUS
      •  Each sub-system uses different language-dependent acoustic models:
           o  European Portuguese (pt), Brazilian Portuguese (br), European Spanish (es) and American English (en)
      •  Different detection score normalization and fusion methods were investigated
           o  The submitted system applies per-query score normalization (Q-norm) and majority voting (MV) fusion
  • 3. The baseline speech recognizer
    AUDIMUS is our in-house hybrid HMM/MLP speech recognizer
      •  Feature extraction: multi-stream, 26 PLP, 26 log-RASTA-PLP, 28 MSG and 39 ETSI (only 8 kHz)
      •  MLP: several context input frames (13-15), 2 hidden layers (500 units) and 1 output layer
      •  Output layer size: 39 for pt, 40 for br, 30 for es, 41 for en
      •  Data: pt 115 hours (57 BN + 58 telephone); br 13 hours of BN data; es 57 hours (36 BN + 21 telephone); en 142 hours of BN data (HUB-4 96 & 97)
      •  HMM topology: single-state phonemes (3 frames minimum duration)
      •  Decoder: uses a Weighted Finite-State Transducer (WFST) approach
  • 4. Spoken Query Tokenization
    For each sub-system, obtain a phonetic tokenization of the queries
      •  Similar to LID parallel phonotactic approaches
    Use a phone-loop grammar with a phoneme minimum duration of 3 frames to obtain the 1-best phoneme chain
      •  Alternative n-best hypotheses for characterizing each query were explored, with unsatisfactory results
           o  Other possibilities not explored (yet): lattice, CN, etc.
      •  Influence of the word insertion penalty (wip) parameter on the tokenization result
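The phone-loop decoding described on this slide can be illustrated with a small Viterbi search. This is only a minimal sketch, not the AUDIMUS WFST decoder: the function name `decode_phone_loop`, the input layout (a frames-by-phones array of log-likelihoods) and the way `wip` is applied (added once per phone entry, in the log domain) are assumptions made for illustration.

```python
import numpy as np

def decode_phone_loop(loglik, min_dur=3, wip=0.0):
    """1-best phone chain through a phone loop with a minimum duration.

    loglik: (T, P) array of per-frame phone log-likelihoods (assumed input).
    Each phone is expanded into a chain of `min_dur` states; only the last
    state has a self-loop, so a phone lasts at least `min_dur` frames.
    wip: log-domain penalty added every time a new phone is entered.
    """
    loglik = np.asarray(loglik, dtype=float)
    T, P = loglik.shape
    if T < min_dur:
        raise ValueError("utterance shorter than the minimum phone duration")
    last = min_dur - 1
    phones = np.arange(P)
    score = np.full((P, min_dur), -np.inf)
    score[:, 0] = loglik[0] + wip
    back = np.zeros((T, P, min_dur), dtype=np.int64)   # predecessor state id
    starts = np.zeros((T, P, min_dur), dtype=bool)     # True if a phone begins here
    starts[0, :, 0] = True
    for t in range(1, T):
        new = np.full((P, min_dur), -np.inf)
        # Enter any phone from the best phone that has completed its chain.
        best_prev = int(np.argmax(score[:, last]))
        new[:, 0] = score[best_prev, last] + wip
        back[t, :, 0] = best_prev * min_dur + last
        starts[t, :, 0] = True
        # Advance one step along each minimum-duration chain.
        for k in range(1, min_dur):
            new[:, k] = score[:, k - 1]
            back[t, :, k] = phones * min_dur + (k - 1)
        # Self-loop on the last state (staying wins ties).
        stay = score[:, last] >= new[:, last]
        new[stay, last] = score[stay, last]
        back[t, stay, last] = phones[stay] * min_dur + last
        starts[t, stay, last] = False
        score = new + loglik[t][:, None]
    # Backtrace from the best completed phone at the final frame.
    p, k = int(np.argmax(score[:, last])), last
    chain = []
    for t in range(T - 1, -1, -1):
        if starts[t, p, k]:
            chain.append(p)
        if t > 0:
            p, k = divmod(int(back[t, p, k]), min_dur)
    return chain[::-1]
```

With a 3-frame minimum duration, a single-frame spurious phone cannot form its own token, and a more negative `wip` further discourages short insertions, which is one way to read the slide's remark that `wip` influences the tokenization result.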
  • 5. Spoken Query Search
    Spoken query search based on AKWS with our hybrid speech recognizer.
    Search window of 5 seconds (2.5 seconds time shift)
      •  Converts the problem into a verification task (originally for use with forced alignment)
      •  Convenient for fusion purposes
    Equally-likely 1-gram LM with target Query and Background word:
      •  Query word
           o  Described by the phonetic units obtained in the previous tokenization stage
      •  Background word
           o  Described by the special phonetic class background/filler
           o  Minimum duration set to 250 msec.
    Detection score of detected candidates computed as the average phonetic log-likelihood ratio of the query term
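The windowing and scoring on this slide can be sketched as follows. The helper names `search_windows` and `detection_score` are hypothetical, and the exact form of the score is an assumption: here each query phone contributes the mean per-frame difference between its own log-likelihood and the background log-likelihood, and the detection score is the mean over phones.

```python
import numpy as np

def search_windows(duration, win=5.0, shift=2.5):
    """Return (start, end) times in seconds of overlapping search windows."""
    windows = []
    start = 0.0
    while True:
        end = min(start + win, duration)
        windows.append((start, end))
        if end >= duration:
            break
        start += shift
    return windows

def detection_score(phone_segments):
    """Average phonetic log-likelihood ratio of a detected query (assumed form).

    phone_segments: one (query_ll, background_ll) pair per query phone, each a
    sequence of per-frame log-likelihoods over the frames aligned to that phone.
    """
    ratios = []
    for query_ll, background_ll in phone_segments:
        diff = np.asarray(query_ll, dtype=float) - np.asarray(background_ll, dtype=float)
        ratios.append(diff.mean())   # per-phone, duration-normalised LLR
    return float(np.mean(ratios))    # average over the phones of the query
```

Because every sub-system scans the same fixed windows, a candidate can be identified by its query, file and window index, which makes it straightforward to line up detections across sub-systems.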
  • 6. Spoken Query Search: BG modelling with HMM/MLP
    Possible approaches
      •  Re-train MLP ✖
      •  Compute posterior probability of the background class depending on the other classes ✔
           o  Mean probability of the top-N most likely outputs
    For the SWS system,
      •  We compute the average in the likelihood domain (top-6)
           o  The decoder operates in the likelihood domain, so there is no need for an ad hoc estimation of the BG class prior
      •  We use a background scale term β (exponential in the likelihood domain) to control the weight of the BG model vs. Query
      •  This β scale, together with the wip term, strongly affects search results
           o  Adjusted following a non-exhaustive greedy search
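A minimal sketch of the background model described above, assuming the input is a frames-by-phones array of log-likelihoods: the background likelihood of a frame is the mean of its top-N phone likelihoods, raised to the power β, which becomes a multiplication in the log domain. The function name `background_loglik` and the default values are illustrative, not taken from the system.

```python
import numpy as np

def background_loglik(loglik, top_n=6, beta=1.0):
    """Per-frame background log-likelihood from the phone log-likelihoods.

    loglik: (T, P) array of per-frame phone log-likelihoods (assumed input).
    The mean is taken over the top_n largest likelihoods of each frame;
    beta scales the result (an exponent in the likelihood domain).
    """
    loglik = np.asarray(loglik, dtype=float)
    top = np.sort(loglik, axis=1)[:, -top_n:]       # top-N values per frame
    peak = top.max(axis=1, keepdims=True)           # subtract for numerical stability
    log_mean = peak[:, 0] + np.log(np.mean(np.exp(top - peak), axis=1))
    return beta * log_mean
```

A larger β makes the background word more competitive against the query word, so fewer candidates are detected, which is consistent with the slide's note that β and wip have to be tuned together.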
  • 7. Score normalization, fusion and calibration
    Score normalization schemes explored:
      •  Q-norm: assume that the scores depend on the query and apply a per-query normalization
      •  F-norm: assume that the scores depend on the data file (of the collection) and apply a per-file normalization
      •  Combinations (QF-norm, FQ-norm)
    Fusion schemes explored:
      •  Candidate detections from the 4 parallel sub-systems are kept (or rejected) according to simple combination rules:
           o  AND (all), OR (at least 1 sys) and MV (at least 2 sys)
      •  Final detection score is the mean score
    Decision threshold set according to maxATWV in dev-dev
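The normalization and fusion rules above can be sketched in a few lines. Several details are assumptions, since the slide does not give them: Q-norm is implemented here as a per-query zero-mean, unit-variance normalization, detections are matched across sub-systems by a (query, file, window) key, each sub-system reports at most one detection per key, and the fused score is the mean over the sub-systems that produced the detection. The names `q_norm` and `fuse` are hypothetical.

```python
from collections import defaultdict
from statistics import mean, pstdev

def q_norm(detections):
    """Per-query score normalisation (assumed here to be a z-norm).

    detections: list of (query, file, window, score) tuples from one sub-system.
    """
    by_query = defaultdict(list)
    for query, _, _, score in detections:
        by_query[query].append(score)
    # Fall back to a unit scale when a query has a single score or no spread.
    stats = {q: (mean(v), pstdev(v) or 1.0) for q, v in by_query.items()}
    return [(q, f, w, (s - stats[q][0]) / stats[q][1]) for q, f, w, s in detections]

def fuse(systems, min_votes=2):
    """Keep candidates detected by at least `min_votes` sub-systems.

    min_votes=1 is the OR rule, min_votes=len(systems) the AND rule, and
    min_votes=2 the MV rule as defined on the slide. Returns a dict mapping
    (query, file, window) to the mean score.
    """
    votes = defaultdict(list)
    for detections in systems:
        for query, file, window, score in detections:
            votes[(query, file, window)].append(score)
    return {key: mean(scores) for key, scores in votes.items()
            if len(scores) >= min_votes}
```

A final accept or reject decision would then compare each fused score with a threshold tuned on development data, as the slide describes for maxATWV on dev-dev.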
  • 8. Development experiments: Sub-systems performance
  • 9. Development experiments: Score normalization strategies
  • 10. Development experiments: Fusion strategies (after Q-norm)
  • 11. Official L2F SWS2012 results
    [Figure: three combined DET plots for the run p-phonetic4_fusion_mv (Term Wtd., ALL Data and CTS Subset), miss probability (in %) vs. false alarm probability (in %), each with a random-performance reference line.]
    Panel labels, in order: dev-eval, eval-dev, eval-eval
    Legend values, in order: Max Val=0.486 Scr=-0.135; Max Val=0.633 Scr=-0.435; Max Val=0.523 Scr=-0.362
  • 12. Conclusions
    The L2F Spoken Web Search system fully exploits hybrid ANN/HMM speech recognition:
      •  Query tokenization based on 1-best phonetic decoding
      •  Query search based on AKWS (with no need for AM re-training)
    The submitted system is formed by the fusion of four language-dependent sub-systems:
      •  Q-norm score normalization is applied to each individual sub-system
      •  Fusion is done following a majority voting strategy
    The system achieves an actual ATWV score of 0.5195 on eval-eval
      •  Promising given the simplicity of the proposed system
      •  Robust to query and collection sets
           o  Best performance is achieved in a mismatched condition!
      •  Reasonably well-calibrated
    Future work: focused on improved query tokenization and fusion with other types of approaches (DTW)
  • 13. L2F - Spoken Language Systems Laboratory
    Instituto de Engenharia de Sistemas e Computadores Investigação e Desenvolvimento em Lisboa
    technology from seed