An Effective Feature Generation and Selection Approach for Lymph Disease Recognition
Author affiliations: School of Computer and Software, Nanjing University of Information Science and Technology, Nanjing 210044, China; Institute of Hydrobiology, Chinese Academy of Sciences, Wuhan 430072, China
Publication: Computer Modeling in Engineering & Sciences
Year, volume, issue: 2021, Vol. 129, No. 11
Pages: 567-594
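Subject classification: 0831 [Engineering - Biomedical Engineering (degrees awardable in Engineering, Science, or Medicine)]; 1002 [Medicine - Clinical Medicine]; 1001 [Medicine - Basic Medicine (degrees awardable in Medicine or Science)]; 08 [Engineering]; 0805 [Engineering - Materials Science and Engineering (degrees awardable in Engineering or Science)]; 0812 [Engineering - Computer Science and Technology (degrees awardable in Engineering or Science)]
Funding: This work is supported by the Startup Foundation for Introducing Talent of NUIST, Project No. 2243141701103
Keywords: Disease data mining; feature selection; classification; lymph diagnosis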
Abstract: Health care data mining is noteworthy in disease diagnosis and recognition. There is still room to improve the performance of machine learning-based classification methods on healthcare data. Selecting a substantial subset of features is one feasible way to improve the recognition results of classification methods in disease diagnosis. In the present study, a novel combined approach of feature generation using latent semantic analysis (LSA) and selection using ranker search (RAS) is proposed to improve the performance of classification methods in lymph disease diagnosis. The performance of the proposed combined approach (LSA-RAS) for feature generation and selection is validated using three function-based and two tree-based classification methods. The performance of the LSA-RAS selected features is compared with the original attributes and with other subsets of attributes and features chosen by nine different attribute and feature selection approaches, using a widely used, open-access benchmark lymph disease dataset. The LSA-RAS selected features significantly improve the recognition accuracy of the classification methods in the diagnosis prediction of lymph disease. The tree-based classification methods have better recognition accuracy than the function-based classification methods. The best performance (recognition accuracy of 93.91%) is achieved by the logistic model tree (LMT) classification method using the feature subset generated by the proposed combined approach (LSA-RAS).
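The general idea of generating latent features and then keeping the top-ranked ones can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes the generated features are ranked by singular value (the abstract does not say how the ranker scores them), uses NumPy, runs on randomly generated toy data rather than the lymph dataset, and the function name `lsa_ranked_features` is hypothetical.

```python
import numpy as np

def lsa_ranked_features(X, k):
    """Generate latent features by SVD and keep the k highest-ranked ones.

    Assumption of this sketch: components are ranked by singular value.
    """
    X = np.asarray(X, dtype=float)
    # Thin SVD: X = U @ diag(s) @ Vt
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    # Rank the latent components by singular value, largest first
    order = np.argsort(s)[::-1][:k]
    # Project every record onto the selected components
    return X @ Vt[order].T, s[order]

# Toy data only: 20 records with 6 nominal attributes coded as integers
rng = np.random.default_rng(0)
X = rng.integers(1, 5, size=(20, 6))
features, strengths = lsa_ranked_features(X, k=3)
```

The resulting `features` matrix would then be passed to a classifier in place of the original attributes; the choice of `k` here is arbitrary.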