Term-Dependent Confidence Normalisation for Out-of-Vocabulary Spoken Term Detection
Author affiliations: Centre for Speech Technology Research, University of Edinburgh, 10 Crichton Street; Human Computer Technology Laboratory (HCTLab), School of Computer Engineering and Telecommunication, University Autonomous of Madrid, Avenue Francisco Tomás y Valiente 11, 28049 Madrid, Spain
Publication: Journal of Computer Science & Technology (计算机科学技术学报, English edition)
Year, volume, issue: 2012, Vol. 27, No. 2
Pages: 358-375
Funding: Engineering and Physical Sciences Research Council (EPSRC), grant EP/I031022/1
Subjects: confidence estimation; discriminative model; spoken term detection; speech recognition
Abstract: An important component of a spoken term detection (STD) system involves estimating confidence measures of hypothesised detections. A potential problem of the widely used lattice-based confidence estimation, however, is that the confidence scores are treated uniformly for all search terms, regardless of how much they may differ in terms of phonetic or linguistic *** problem is particularly evident for out-of-vocabulary (OOV) terms, which tend to exhibit high intra-term *** address the impact of term diversity on confidence measures, we propose in this work a term-dependent normalisation technique which compensates for term diversity in confidence *** first derive an evaluation-metric-oriented normalisation that optimises the evaluation metric by compensating for the diverse occurrence rates among terms, and then propose a linear bias compensation and a discriminative compensation to deal with the bias problem that is inherent in lattice-based confidence measurement and from which the Term Specific Threshold (TST) approach *** tested the proposed technique on speech data from the multi-party meeting domain with two state-of-the-art STD systems based on phonemes and words *** experimental results demonstrate that the confidence normalisation approach leads to a significant performance improvement in STD, particularly for OOV terms with phoneme-based systems.