Using Hybrid Penalty and Gated Linear Units to Improve Wasserstein Generative Adversarial Networks for Single-Channel Speech Enhancement

Authors: Xiaojun Zhu, Heming Huang

Affiliations: School of Computer Science, Qinghai Normal University, Xining 810008, China; The State Key Laboratory of Tibetan Intelligent Information Processing and Application, Xining 810008, China; School of Electronic and Information Engineering, Lanzhou City University, Lanzhou 730000, China

Publication: Computer Modeling in Engineering & Sciences (English-language journal)

Year, volume, issue: 2023, Vol. 135, No. 6

Pages: 2155-2172

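Subject classification: 08 [Engineering]; 080203 [Engineering: Mechanical Design and Theory]; 0835 [Engineering: Software Engineering]; 0802 [Engineering: Mechanical Engineering]; 0701 [Science: Mathematics]; 0811 [Engineering: Control Science and Engineering]; 0812 [Engineering: Computer Science and Technology (engineering or science degree may be awarded)]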

Funding: Supported by the National Science Foundation under Grant No. 62066039

Keywords: speech enhancement; generative adversarial networks; hybrid penalty; gated linear units; multi-scale convolution

Abstract: Recently, speech enhancement methods based on Generative Adversarial Networks have achieved good performance in time-domain noisy ***, the training of Generative Adversarial Networks has such problems as convergence difficulty, model collapse, *** this work, an end-to-end speech enhancement model based on Wasserstein Generative Adversarial Networks is proposed, and several improvements are made to obtain faster convergence and better generated speech ***, in the generator coding part, each convolution layer adopts different convolution kernel sizes so that speech coding information is obtained at multiple scales; a gated linear unit is introduced to alleviate the vanishing gradient problem as network depth increases; the gradient penalty of the discriminator is replaced with spectral normalization to accelerate the convergence of the model; a hybrid penalty term composed of L1 regularization and a scale-invariant signal-to-distortion ratio is introduced into the loss function of the generator to improve the quality of generated *** experimental results on both the TIMIT corpus and a Tibetan corpus show that the proposed model improves speech quality significantly and accelerates convergence.
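The abstract names two building blocks that are easy to illustrate in isolation: the gated linear unit and a penalty combining an L1 term with the scale-invariant signal-to-distortion ratio (SI-SDR). The sketch below is not the authors' implementation. It uses the standard definitions of GLU (value times sigmoid of gate) and SI-SDR, and it assumes that the hybrid penalty is a weighted L1 distance minus a weighted SI-SDR. The function names and the weights `l1_weight` and `si_sdr_weight` are hypothetical, since the record does not state how the two terms are combined.

```python
import math


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def glu(values, gates):
    """Gated linear unit: each value is scaled by sigmoid(gate)."""
    return [v * sigmoid(g) for v, g in zip(values, gates)]


def si_sdr(estimate, target, eps=1e-8):
    """Scale-invariant signal-to-distortion ratio in dB (standard definition)."""
    dot = sum(e * t for e, t in zip(estimate, target))
    target_energy = sum(t * t for t in target) + eps
    # Project the estimate onto the target to remove any overall gain.
    scale = dot / target_energy
    projection = [scale * t for t in target]
    residual = [e - p for e, p in zip(estimate, projection)]
    signal_energy = sum(p * p for p in projection)
    residual_energy = sum(r * r for r in residual)
    return 10.0 * math.log10((signal_energy + eps) / (residual_energy + eps))


def l1_distance(estimate, target):
    """Mean absolute difference between two equal-length waveforms."""
    return sum(abs(e - t) for e, t in zip(estimate, target)) / len(target)


def hybrid_penalty(estimate, target, l1_weight=1.0, si_sdr_weight=1.0):
    """Assumed combination: weighted L1 minus weighted SI-SDR.

    Lower is better, because a higher SI-SDR means a cleaner estimate.
    The weights are illustrative and not taken from the paper.
    """
    return (l1_weight * l1_distance(estimate, target)
            - si_sdr_weight * si_sdr(estimate, target))


if __name__ == "__main__":
    clean = [1.0, -2.0, 3.0, 0.5]
    close = [1.1, -1.8, 2.7, 0.6]
    far = [0.0, 1.0, 0.0, -1.0]
    print("SI-SDR (close):", si_sdr(close, clean))
    print("penalty (close):", hybrid_penalty(close, clean))
    print("penalty (far):", hybrid_penalty(far, clean))
    print("GLU:", glu([2.0, -1.0], [0.0, 3.0]))
```

Under these assumptions the L1 term is sensitive to the absolute level of the waveform, while SI-SDR ignores any overall gain, so the two terms penalize different kinds of error. In a training setup this quantity would be added to the generator's adversarial loss; that wiring is omitted here.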
