Using Hybrid Penalty and Gated Linear Units to Improve Wasserstein Generative Adversarial Networks for Single-Channel Speech Enhancement
Author affiliations: School of Computer Science, Qinghai Normal University, Xining 810008, China; The State Key Laboratory of Tibetan Intelligent Information Processing and Application, Xining 810008, China; School of Electronic and Information Engineering, Lanzhou City University, Lanzhou 730000, China
Publication: Computer Modeling in Engineering & Sciences
Year/Volume/Issue: 2023, Vol. 135, No. 6
Pages: 2155-2172
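Subject classification: 08 [Engineering]; 080203 [Engineering: Mechanical Design and Theory]; 0835 [Engineering: Software Engineering]; 0802 [Engineering: Mechanical Engineering]; 0701 [Science: Mathematics]; 0811 [Engineering: Control Science and Engineering]; 0812 [Engineering: Computer Science and Technology (engineering or science degree may be conferred)]
Funding: Supported by the National Science Foundation under Grant No. 62066039
Keywords: speech enhancement; generative adversarial networks; hybrid penalty; gated linear units; multi-scale convolution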
Abstract: Recently, speech enhancement methods based on Generative Adversarial Networks have achieved good performance in time-domain noisy ***, the training of Generative Adversarial Networks has such problems as convergence difficulty, model collapse, *** this work, an end-to-end speech enhancement model based on Wasserstein Generative Adversarial Networks is proposed, and some improvements have been made in order to get faster convergence speed and better generated speech ***, in the generator coding part, each convolution layer adopts different convolution kernel sizes to obtain speech coding information at multiple scales; a gated linear unit is introduced to alleviate the vanishing gradient problem that arises as network depth increases; the gradient penalty of the discriminator is replaced with spectral normalization to accelerate the convergence of the model; a hybrid penalty term composed of L1 regularization and a scale-invariant signal-to-distortion ratio is introduced into the loss function of the generator to improve the quality of generated *** experimental results on both the TIMIT corpus and a Tibetan corpus show that the proposed model improves the speech quality significantly and accelerates the convergence speed of the model.
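
The hybrid penalty mentioned in the abstract combines an L1 term with a scale-invariant signal-to-distortion ratio (SI-SDR). The following is a minimal sketch of how such a term could be computed on two waveforms, not the authors' implementation. The function names, the mean-absolute-error form of the L1 term, the sign convention (subtracting SI-SDR so that a higher ratio lowers the penalty), and the weights `l1_weight` and `si_sdr_weight` are all assumptions made for illustration; the record does not state them.

```python
import math


def si_sdr(estimate, reference, eps=1e-8):
    """Scale-invariant signal-to-distortion ratio in dB (higher is better)."""
    # Project the estimate onto the reference to find the optimally scaled target.
    dot = sum(e * r for e, r in zip(estimate, reference))
    ref_energy = sum(r * r for r in reference) + eps
    scale = dot / ref_energy
    target = [scale * r for r in reference]
    # Whatever is left after removing the scaled target counts as distortion.
    distortion = [e - t for e, t in zip(estimate, target)]
    target_energy = sum(t * t for t in target)
    distortion_energy = sum(d * d for d in distortion)
    return 10.0 * math.log10((target_energy + eps) / (distortion_energy + eps))


def l1_distance(estimate, reference):
    """Mean absolute difference between the two waveforms."""
    return sum(abs(e - r) for e, r in zip(estimate, reference)) / len(reference)


def hybrid_penalty(estimate, reference, l1_weight=100.0, si_sdr_weight=1.0):
    """Hypothetical hybrid penalty: weighted L1 term minus weighted SI-SDR.

    Both weights are illustrative assumptions, not values from the paper.
    """
    return (l1_weight * l1_distance(estimate, reference)
            - si_sdr_weight * si_sdr(estimate, reference))


if __name__ == "__main__":
    clean = [1.0, -2.0, 3.0, -1.0]
    enhanced = [1.1, -1.9, 2.8, -1.2]
    print("SI-SDR (dB):", si_sdr(enhanced, clean))
    print("hybrid penalty:", hybrid_penalty(enhanced, clean))
```

Because SI-SDR is unchanged when the estimate is rescaled, the L1 term is the only part of this sketch that is sensitive to amplitude mismatch, which is one plausible reason for pairing the two.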