An End-to-end Speech Recognition Algorithm based on Attention Mechanism
Author affiliation: College of Information Science and Engineering, Northeastern University
Conference: The 39th Chinese Control Conference
Conference date: 2020
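Subject classification: 0711 [Science - Systems Science] 12 [Management] 1201 [Management - Management Science and Engineering (degrees may be conferred in Management or Engineering)] 07 [Science] 081104 [Engineering - Pattern Recognition and Intelligent Systems] 08 [Engineering] 0835 [Engineering - Software Engineering] 0811 [Engineering - Control Science and Engineering] 0812 [Engineering - Computer Science and Technology (degrees may be conferred in Engineering or Science)]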
Keywords: Speech recognition; End-to-end technology; Endpoint detection; Attentional mechanism; Spectrogram
Abstract: End-to-end speech recognition is a major research field in speech recognition. The most typical model is the CTC-based end-to-end system, which uses an RNN to mine time-sequence information and discards a series of HMM assumptions to obtain a good recognition rate. However, the CTC-based model is more dependent on the speech model and has a longer training cycle. Therefore, within the framework of the traditional acoustic model, this paper proposes using prior knowledge to train an attention-based feature extraction network for spectrograms. First, this network is spliced onto the front end of the CTC-based model; then the number of recurrent neural network layers in the CTC-based model is reduced; finally, the combined model is retrained. Experimental results show that the training time of the combined model is effectively reduced and that speech recognition accuracy is further improved.
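The abstract does not say which form of attention the feature extraction network uses. As a rough illustration only, the sketch below applies plain scaled dot-product self-attention across the time frames of a toy spectrogram. Everything in it is an assumption made for illustration: the function names, the choice of Q = K = V with no learned projections, and the made-up spectrogram values. It is not the paper's network.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(frames):
    """Scaled dot-product self-attention with Q = K = V = frames.

    frames: list of T frames, each a list of F spectrogram bin values.
    Returns (context, weights): context is T x F, weights is T x T.
    Assumption: no learned projections, unlike a trained network.
    """
    dim = len(frames[0])
    scale = math.sqrt(dim)
    context, weights = [], []
    for query in frames:
        # Similarity of this frame to every frame, scaled by sqrt(F).
        scores = [sum(q * k for q, k in zip(query, key)) / scale
                  for key in frames]
        row = softmax(scores)
        weights.append(row)
        # Attention-weighted average of all frames, bin by bin.
        context.append([sum(w * frame[j] for w, frame in zip(row, frames))
                        for j in range(dim)])
    return context, weights


# Toy "spectrogram": 4 time frames x 3 frequency bins (made-up values).
toy_spectrogram = [
    [0.1, 0.0, 0.2],
    [1.0, 0.8, 0.9],
    [0.9, 1.0, 0.7],
    [0.0, 0.1, 0.1],
]
context, weights = self_attention(toy_spectrogram)
```

Each row of `weights` sums to 1, and `context` has the same shape as the input, so in principle such a block could sit in front of a recurrent CTC model without changing the frame count.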