CNN-SELF-ATTENTION-DNN ARCHITECTURE FOR MANDARIN RECOGNITION
Author affiliation: Harbin Engineering University
Conference: The 32nd Chinese Control and Decision Conference (CCDC 2020)
Conference date: 2020
Subject classification: 0711 [Science - Systems Science]; 12 [Management]; 1201 [Management - Management Science and Engineering (degrees in management or engineering)]; 07 [Science]; 081104 [Engineering - Pattern Recognition and Intelligent Systems]; 08 [Engineering]; 0835 [Engineering - Software Engineering]; 0811 [Engineering - Control Science and Engineering]; 0812 [Engineering - Computer Science and Technology (degrees in engineering or science)]
Keywords: Self-attention; end-to-end Mandarin recognition; CTC
Abstract: Connectionist temporal classification (CTC) is a frequently used approach for end-to-end speech recognition. The CTC loss can be computed with artificial neural networks such as recurrent neural networks (RNN) and convolutional neural networks (CNN). Recently, the self-attention architecture has been proposed as a replacement for RNN because of its parallelism. In this paper, we propose a CNN-SELF-ATTENTION-DNN CTC architecture, which uses self-attention in place of RNN and combines it with a CNN and a deep neural network (DNN). We evaluate it on a Mandarin speech dataset of about 170 hours and achieve a 7% absolute character error rate (CER) reduction compared to DEEP-CNN, with less training time than ***. Moreover, we analyze the test results to find the factors that affect CER.
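The record itself contains no code. As a rough illustration of the CTC objective the abstract builds on (not the authors' implementation; the function name, shapes, and blank convention below are our own assumptions), here is a minimal NumPy sketch of the CTC forward algorithm, which sums over all frame-level alignments that collapse to the target label sequence:

```python
import numpy as np

def ctc_loss(log_probs, target, blank=0):
    """Negative log-likelihood of `target` under CTC via the forward algorithm.

    log_probs: (T, C) array of per-frame log-probabilities over C symbols.
    target:    list of label ids (none equal to `blank`).
    """
    T, _ = log_probs.shape
    # Extended label sequence with blanks between and around labels: [-, a, -, b, -]
    ext = [blank]
    for label in target:
        ext += [label, blank]
    S = len(ext)

    # alpha[t, s]: log-probability of all alignment prefixes ending at ext[s] at frame t
    alpha = np.full((T, S), -np.inf)
    alpha[0, 0] = log_probs[0, ext[0]]
    if S > 1:
        alpha[0, 1] = log_probs[0, ext[1]]

    for t in range(1, T):
        for s in range(S):
            cands = [alpha[t - 1, s]]          # stay on the same extended symbol
            if s > 0:
                cands.append(alpha[t - 1, s - 1])  # advance by one
            # Skip the intermediate blank only between two different labels
            if s > 1 and ext[s] != blank and ext[s] != ext[s - 2]:
                cands.append(alpha[t - 1, s - 2])
            alpha[t, s] = np.logaddexp.reduce(cands) + log_probs[t, ext[s]]

    # Valid alignments end on the last label or the final blank
    return -np.logaddexp(alpha[T - 1, S - 1], alpha[T - 1, S - 2])
```

For example, with two frames, two symbols (blank and one label), and uniform per-frame probabilities of 0.5, the alignments collapsing to the single-label target are "-1", "1-", and "11", so the loss is -log(0.75).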