Corpus-based Speech-to-speech Translation in Travel Domain
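Conference: 9th National Conference on Man-Machine Speech Communication (第九届全国人机语音通讯学术会议)
Conference date: 2007
Subject classification: 08 [Engineering] 081203 [Engineering - Computer Application Technology] 0812 [Engineering - Computer Science and Technology (degrees in Engineering or Science may be conferred)]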
Abstract: We describe a multilingual speech-to-speech translation (S2ST) system that focuses mainly on translation between English and two Asian languages, Japanese and Chinese. The system has three main modules: large-vocabulary continuous speech recognition, text-to-text (T2T) machine translation, and text-to-speech synthesis. All three are multilingual and are built with state-of-the-art technologies. A corpus-based statistical machine learning framework forms the basis of the system. We use a parallel multilingual database of over 1,000,000 sentences covering a broad range of travel-related conversations. A recent evaluation of the overall system showed that its speech-to-speech translation quality is high, comparable to that of a person with a TOEIC score of 650 out of a possible 990. In addition, results of field experiments conducted at various real-world locations are presented.
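
The three-module structure described in the abstract can be illustrated as a simple cascade in which each stage feeds the next. The following is a minimal sketch only, not the authors' implementation: the class `S2STPipeline`, the stub functions, and the one-entry phrase table are all hypothetical stand-ins for the real recognition, statistical translation, and synthesis components.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class S2STPipeline:
    """Hypothetical cascade: speech recognition -> T2T translation -> synthesis."""
    recognize: Callable[[bytes, str], str]       # (audio, source language) -> text
    translate: Callable[[str, str, str], str]    # (text, source, target) -> text
    synthesize: Callable[[str, str], bytes]      # (text, target language) -> audio

    def run(self, audio: bytes, src: str, tgt: str) -> bytes:
        text = self.recognize(audio, src)
        translated = self.translate(text, src, tgt)
        return self.synthesize(translated, tgt)


# Toy stand-ins (assumptions for illustration only, not real models).
# The "audio" here is just UTF-8 text so the example stays self-contained.
TOY_PHRASES = {("en", "ja"): {"where is the station": "駅はどこですか"}}


def toy_recognize(audio: bytes, lang: str) -> str:
    return audio.decode("utf-8")


def toy_translate(text: str, src: str, tgt: str) -> str:
    # Look up the sentence in the toy table; fall back to the input unchanged.
    return TOY_PHRASES.get((src, tgt), {}).get(text, text)


def toy_synthesize(text: str, lang: str) -> bytes:
    return text.encode("utf-8")


pipeline = S2STPipeline(toy_recognize, toy_translate, toy_synthesize)
output_audio = pipeline.run(b"where is the station", "en", "ja")
```

Passing the stages in as plain callables keeps each module replaceable per language pair, which mirrors the abstract's point that all three modules are multilingual.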