Nonlinear Prediction with Deep Recurrent Neural Networks for Non-Blind Audio Bandwidth Extension
Author affiliations: National Engineering Research Center for Multimedia Software, School of Computer Science, Wuhan University, Wuhan 430072, China; Institute of Big Data and Internet Innovation, Hunan University of Commerce, Changsha 410205, China; Software College, East China University of Technology, Nanchang 330013, China; Collaborative Innovation Center for Economics Crime Investigation and Prevention Technology, Jiangxi Province, Nanchang 330103, China; Hubei Key Laboratory of Multimedia and Network Communication Engineering, Wuhan University, Wuhan 430072, China; Collaborative Innovation Center of Geospatial Technology, Wuhan 430079, China
Publication: China Communications (中国通信, English edition)
Year, volume, issue: 2018, Vol. 15, No. 4
Pages: 72-85
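Subject classification: 0810 [Engineering - Information and Communication Engineering]; 0711 [Science - Systems Science]; 0808 [Engineering - Electrical Engineering]; 0809 [Engineering - Electronic Science and Technology (engineering or science degree may be conferred)]; 07 [Science]; 0839 [Engineering - Cyberspace Security]; 0812 [Engineering - Computer Science and Technology (engineering or science degree may be conferred)]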
Funding: Supported by the National Natural Science Foundation of China under Grant Nos. 61762005, 61231015, 61671335, 61702472, 61701194, 61761044, and 61471271; the National High Technology Research and Development Program of China (863 Program) under Grant No. 2015AA016306; the Hubei Province Technological Innovation Major Project under Grant No. 2016AAA015; the Science Project of the Education Department of Jiangxi Province under Grant No. GJJ150585; and the Opening Project of the Collaborative Innovation Center for Economics Crime Investigation and Prevention Technology, Jiangxi Province, under Grant No. JXJZXTCX-025.
Keywords: audio coding; non-blind audio bandwidth extension; context correlation; deep recurrent neural network
Abstract: Non-blind audio bandwidth extension is a standard technique in contemporary audio codecs for coding audio signals efficiently at low bitrates. In most existing methods, the high-frequency signal is generated by duplicating the corresponding low frequencies and applying a few transmitted high-frequency parameters. However, the perceptual quality of the coded audio degrades significantly when the correlation between high and low frequencies is weak. In this paper, we quantitatively analyse this correlation by computing mutual information. The analysis shows that the high frequencies are correlated not only with the low-frequency signal of the current frame but also with that of the context-dependent frames. To improve the perceptual quality of coding, we propose a novel method of generating the coarse high-frequency spectrum that improves on the conventional replication method. In the proposed method, the coarse high-frequency spectra are generated by a nonlinear mapping model based on a deep recurrent neural network. Experiments confirm that the proposed method performs better than the reference methods.
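The correlation analysis mentioned in the abstract can be illustrated with a small sketch. This is not the authors' procedure: the record does not say which features or which estimator the paper uses, so the histogram estimator, the bin count, the helper name `mutual_information`, and the synthetic "low-band" and "high-band" features below are all assumptions made for illustration only.

```python
import numpy as np

def mutual_information(x, y, bins=16):
    """Histogram estimate of I(X;Y) in bits for two 1-D sample arrays.

    Hypothetical helper; the estimator and bin count are illustrative choices.
    """
    joint, _, _ = np.histogram2d(x, y, bins=bins)
    p_xy = joint / joint.sum()               # joint probability table
    p_x = p_xy.sum(axis=1, keepdims=True)    # marginal of x, shape (bins, 1)
    p_y = p_xy.sum(axis=0, keepdims=True)    # marginal of y, shape (1, bins)
    nz = p_xy > 0                            # skip empty cells to avoid log(0)
    return float(np.sum(p_xy[nz] * np.log2(p_xy[nz] / (p_x @ p_y)[nz])))

rng = np.random.default_rng(0)
n_frames = 5000

# Synthetic stand-ins for per-frame features (assumed, not real audio data).
lf = rng.standard_normal(n_frames)           # low-band feature, current frame
lf_prev = np.roll(lf, 1)                     # previous frame (wraps at index 0)
# Toy high-band feature that depends on both the current and the previous frame.
hf = 0.6 * lf + 0.6 * lf_prev + 0.3 * rng.standard_normal(n_frames)
unrelated = rng.standard_normal(n_frames)    # baseline with no dependence on hf

mi_current = mutual_information(lf, hf)
mi_previous = mutual_information(lf_prev, hf)
mi_unrelated = mutual_information(unrelated, hf)

print(f"I(LF_t;   HF_t) = {mi_current:.3f} bits")
print(f"I(LF_t-1; HF_t) = {mi_previous:.3f} bits")
print(f"I(noise;  HF_t) = {mi_unrelated:.3f} bits")
```

In this toy setup the previous frame carries about as much information about the high band as the current frame does, which is the kind of context dependence the abstract reports. The estimate for the unrelated signal stays slightly above zero because histogram estimators are biased upward on finite samples.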