Deep Learning-based Environmental Sound Classification Using Feature Fusion and Data Enhancement
Author affiliations: Department of Computer Science, COMSATS University Islamabad, Vehari Campus, Pakistan; Department of Computer Science, University of Engineering and Technology, Lahore, Pakistan; Department of Computer Science, College of Computers and Information Technology, Taif University, Taif 21974, Saudi Arabia
Publication: Computers, Materials & Continua
Year/Volume/Issue: 2023, Vol. 74, No. 1
Pages: 1069-1091
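Subject classification: 12 [Management]; 080801 [Engineering - Electrical Machines and Apparatus]; 1201 [Management - Management Science and Engineering (degrees in Management or Engineering may be conferred)]; 0808 [Engineering - Electrical Engineering]; 081104 [Engineering - Pattern Recognition and Intelligent Systems]; 08 [Engineering]; 0835 [Engineering - Software Engineering]; 0811 [Engineering - Control Science and Engineering]; 0812 [Engineering - Computer Science and Technology (degrees in Engineering or Science may be conferred)]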
Funding: Taif University Researchers Supporting Project number (TURSP-2020/36), Taif University, Taif, Saudi Arabia
Keywords: environmental sound classification; convolutional neural network; deep learning; transformer; data augmentation
Abstract: Environmental sound classification (ESC) involves the process of distinguishing an audio stream associated with numerous environmental *** common aspects such as the framework difference, overlapping of different sound events, and the presence of various sound sources during recording make the ESC task much more complicated and *** research is to propose a deep learning model to improve the recognition rate of environmental sounds and reduce the model training time under limited computation *** this research, the performance of transformer and convolutional neural networks (CNN) are *** audio features, chromagram, Mel-spectrogram, tonnetz, Mel-Frequency Cepstral Coefficients (MFCCs), delta MFCCs, delta-delta MFCCs and spectral contrast, are extracted from the UrbanSound8K, ESC-50, and ESC-10, ***, this research also employed three data enhancement methods, namely, white noise, pitch tuning, and time stretch, to reduce the risk of overfitting due to the limited audio *** evaluation of various experiments demonstrates that the best performance was achieved by the proposed transformer model using seven audio features on enhanced *** UrbanSound8K, ESC-50, and ESC-10, the highest attained accuracies are 0.98, 0.94, and 0.97 *** experimental results reveal that the proposed technique can achieve the best performance for ESC problems.
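As an illustration of the first of the three data enhancement methods named in the abstract (white noise), the following is a minimal sketch and not the authors' implementation. The function name `add_white_noise`, the `snr_db` parameter, the 20 dB setting, and the synthetic 440 Hz test tone are all assumptions made for this example; the record does not state the noise level or the tooling used. Pitch tuning and time stretch are left out because they need a resampler or phase vocoder rather than a few lines of NumPy.

```python
import numpy as np


def add_white_noise(signal, snr_db, rng=None):
    """Return `signal` plus Gaussian white noise at roughly `snr_db` dB SNR.

    Hypothetical helper for illustration; the noise level used in the
    paper is not given in this record.
    """
    rng = np.random.default_rng() if rng is None else rng
    signal = np.asarray(signal, dtype=np.float64)
    # Average power of the clean signal.
    signal_power = np.mean(signal ** 2)
    # Noise power that yields the requested signal-to-noise ratio.
    noise_power = signal_power / (10.0 ** (snr_db / 10.0))
    noise = rng.normal(0.0, np.sqrt(noise_power), size=signal.shape)
    return signal + noise


# Synthetic one-second 440 Hz tone standing in for an audio clip (assumption).
sr = 22050
t = np.arange(sr) / sr
clean = 0.5 * np.sin(2.0 * np.pi * 440.0 * t)

# Seeded generator so the augmented clip is reproducible.
noisy = add_white_noise(clean, snr_db=20.0, rng=np.random.default_rng(0))

# Measure the SNR actually obtained, as a sanity check.
measured_snr_db = 10.0 * np.log10(
    np.mean(clean ** 2) / np.mean((noisy - clean) ** 2)
)
```

In an augmentation pipeline of this kind, each noisy copy would be added to the training set alongside the original clip and keep the original clip's class label.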