An Introduction to the Chinese Speech Recognition Front-End of the NICT/ATR Multi-Lingual Speech Translation System
Author affiliations: Knowledge Creating Communication Research Center, National Institute of Information and Communications Technology, 2-2-2 Keihanna Science City, Kyoto 619-0288, Japan; ATR Spoken Language Translation Research Laboratories, 2-2-2 Keihanna Science City, Kyoto 619-0288, Japan; ATR Knowledge Science Laboratories, 2-2-2 Keihanna Science City, Kyoto 619-0288, Japan
Publication: Tsinghua Science and Technology (Journal of Tsinghua University, Natural Science Edition, English Edition)
Year, volume, issue: 2008, Vol. 13, No. 4
Pages: 545-552
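Subject classification: 1305 [Art: Design (Art or Engineering degrees may be conferred)]; 13 [Art]; 08 [Engineering]; 081104 [Engineering: Pattern Recognition and Intelligent Systems]; 0804 [Engineering: Instrument Science and Technology]; 081101 [Engineering: Control Theory and Control Engineering]; 0811 [Engineering: Control Science and Engineering]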
Keywords: Chinese speech recognition; mutual information; phoneme set design; hidden Markov network; minimum description length; successive state splitting; multi-class composite N-grams
Abstract: This paper introduces several important features of the Chinese large-vocabulary continuous speech recognition system in the NICT/ATR multi-lingual speech-to-speech translation system. The features include: (1) a flexible way to derive an information-rich phoneme set based on the mutual information between a text corpus and its phoneme set; (2) a hidden Markov network acoustic model and a successive state splitting algorithm that generates its model topology based on a minimum description length criterion; and (3) advanced language modeling using multi-class composite N-grams. Together, these features yield a recognition performance of 90% character accuracy on tourism-related dialogue with real-time response speed.