Exploring Sequential Feature Selection in Deep Bi-LSTM Models for Speech Emotion Recognition

Authors: Fatma Harby, Mansor Alohali, Adel Thaljaoui, Amira Samy Talaat

Affiliations: Computer Science Department, Future Academy-Higher Future Institute for Specialized Technological Studies, Cairo 12622, Egypt; Department of Computer Science and Information, College of Science at Zulfi, Majmaah University, P.O. Box 66, Al-Majmaah 11952, Saudi Arabia; Preparatory Institute for Engineering Studies of Gafsa, Zarroug, Gafsa 2112, Tunisia; Computers and Systems Department, Electronics Research Institute, Cairo 12622, Egypt

Publication: Computers, Materials & Continua

Year, volume, issue: 2024, Vol. 78, No. 2

Pages: 2689-2719

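Subject classification: 081203 [Engineering - Computer Application Technology]; 08 [Engineering]; 0835 [Engineering - Software Engineering]; 0812 [Engineering - Computer Science and Technology (engineering or science degrees may be conferred)]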

Funding: Majmaah University (MU), R-2023-757

Keywords: artificial intelligence application; multi features sequential selection; speech emotion recognition; deep Bi-LSTM

Abstract: Machine Learning (ML) algorithms play a pivotal role in Speech Emotion Recognition (SER), although they encounter a formidable obstacle in accurately discerning a speaker’s emotional *** examination of the emotional states of speakers holds significant importance in a range of real-time applications, including but not limited to virtual reality, human-robot interaction, emergency centers, and human behavior *** identifying emotions in the SER process relies on extracting relevant information from audio *** studies on SER have predominantly utilized short-time characteristics such as Mel Frequency Cepstral Coefficients (MFCCs) due to their ability to capture the periodic nature of audio signals *** these traits may improve their ability to perceive and interpret emotional depictions appropriately, MFCCs have some *** this study aims to tackle the aforementioned issue by systematically picking multiple audio cues, enhancing the classifier model’s efficacy in accurately discerning human *** utilized dataset is taken from the EMO-DB database; input speech is preprocessed with a 2D Convolutional Neural Network (CNN), which applies convolutional operations to spectrograms, as they afford a visual representation of the way the audio signal’s frequency content changes over *** next step is the spectrogram data normalization, which is crucial for Neural Network (NN) training as it aids in faster *** the five auditory features MFCCs, Chroma, Mel-Spectrogram, Contrast, and Tonnetz are extracted from the spectrogram *** aim of feature selection is to retain only dominant features by excluding the irrelevant *** this paper, the Sequential Forward Selection (SFS) and Sequential Backward Selection (SBS) techniques were employed for multiple audio cues features ***, the feature sets composed from the hybrid feature extraction methods are fed into the deep Bidirectional Long Short
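The abstract names Sequential Forward Selection (SFS) and Sequential Backward Selection (SBS) over five feature groups but does not give the procedure. The following is a minimal sketch of the two greedy searches in general, not the authors' implementation. The function names, the group labels, and `toy_score` with its integer utilities are all hypothetical placeholders invented for illustration; they are not values from the paper. In practice the scoring function would presumably be something like the validation accuracy of a classifier trained on the chosen subset.

```python
from typing import Callable, FrozenSet, List, Sequence

# A scoring function maps a subset of feature-group names to a number
# (higher is better). Assumption: in a real pipeline this would train and
# validate a classifier on the subset; here it is a stand-in.
ScoreFn = Callable[[FrozenSet[str]], int]


def sequential_forward_selection(
    candidates: Sequence[str], score_fn: ScoreFn, k: int
) -> List[str]:
    """Start empty; greedily add the group that gives the best score, k times."""
    selected: List[str] = []
    remaining = list(candidates)
    while remaining and len(selected) < k:
        best = max(remaining, key=lambda f: score_fn(frozenset(selected + [f])))
        selected.append(best)
        remaining.remove(best)
    return selected


def sequential_backward_selection(
    candidates: Sequence[str], score_fn: ScoreFn, k: int
) -> List[str]:
    """Start with all groups; repeatedly drop the one whose removal scores best."""
    selected = list(candidates)
    while len(selected) > k:
        drop = max(selected, key=lambda f: score_fn(frozenset(selected) - {f}))
        selected.remove(drop)
    return selected


# Labels for the five feature groups listed in the abstract.
FEATURE_GROUPS = ["mfcc", "chroma", "mel", "contrast", "tonnetz"]

# HYPOTHETICAL placeholder utilities, chosen only so the demo is deterministic.
PLACEHOLDER_UTILITY = {"mfcc": 50, "chroma": 20, "mel": 40, "contrast": 10, "tonnetz": 5}
REDUNDANCY_PENALTY = 32  # made-up penalty when two overlapping groups co-occur


def toy_score(subset: FrozenSet[str]) -> int:
    """Made-up additive score with one redundancy penalty; not a real metric."""
    total = sum(PLACEHOLDER_UTILITY[f] for f in subset)
    if {"mfcc", "mel"} <= subset:
        total -= REDUNDANCY_PENALTY
    return total


if __name__ == "__main__":
    print(sequential_forward_selection(FEATURE_GROUPS, toy_score, 3))
    print(sequential_backward_selection(FEATURE_GROUPS, toy_score, 3))
```

With these placeholder utilities both searches happen to settle on the same three groups (`mfcc`, `chroma`, `contrast`). In general SFS and SBS can disagree, because each makes locally optimal choices from opposite starting points.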
