Variable importance-weighted Random Forests

Authors: Yiyi Liu, Hongyu Zhao

Affiliations: Department of Biostatistics, School of Public Health, Yale University, New Haven, CT 06511, USA; Program of Computational Biology and Bioinformatics, Yale University, New Haven, CT 06511, USA

Publication: Frontiers of Electrical and Electronic Engineering in China (中国电气与电子工程前沿, English edition)

Year, volume, issue: 2017, Vol. 5, No. 4

Pages: 338-351

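Subject classification: 12 [Management]; 1201 [Management: Management Science and Engineering (degrees in Management or Engineering may be conferred)]; 08 [Engineering]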

Funding: Supported in part by the National Institutes of Health

Keywords: Random Forests; variable importance score; classification; regression

Abstract: Background: Random Forests is a popular classification and regression method that has proven powerful for various prediction problems in biological studies. However, its performance often deteriorates as the number of features increases. To address this limitation, feature elimination Random Forests was proposed, which uses only the features with the largest variable importance scores. Yet the performance of this method is unsatisfactory, possibly because of its rigid feature selection and the increased correlation between the trees of the forest. Methods: We propose variable importance-weighted Random Forests, which, instead of sampling features with equal probability at each node when building trees, samples features according to their variable importance scores and then selects the best split from the randomly selected features. Results: We evaluate the performance of our method through comprehensive simulation and real data analyses, for both regression and classification. Compared with the standard Random Forests and the feature elimination Random Forests methods, our proposed method has improved performance in most cases. Conclusions: By incorporating the variable importance scores into the random feature selection step, our method can better utilize more informative features without completely ignoring less informative ones, and hence has improved prediction accuracy in the presence of weak signals and large noise. We have implemented an R package "viRandomForests" based on the original R package "randomForest"; it can be freely downloaded from http:// ***/software.
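
The node-level step described in the Methods part of the abstract can be illustrated with a minimal Python sketch. This is not the viRandomForests implementation. The function names `weighted_feature_sample` and `best_split`, the clipping of negative importance scores to zero, the small weight floor, and the squared-error split criterion are all assumptions made for illustration only.

```python
import random


def weighted_feature_sample(importances, mtry, rng):
    """Draw up to `mtry` distinct feature indices, with probability
    proportional to the (non-negative) importance scores."""
    # Assumption: negative scores are clipped to zero, and a tiny floor keeps
    # every feature selectable, so weak features are never fully excluded.
    weights = [max(w, 0.0) + 1e-12 for w in importances]
    candidates = list(range(len(weights)))
    chosen = []
    for _ in range(min(mtry, len(candidates))):
        # Weighted draw of one position among the remaining candidates.
        pos = rng.choices(range(len(candidates)), weights=weights, k=1)[0]
        chosen.append(candidates.pop(pos))
        weights.pop(pos)  # sampling without replacement
    return chosen


def best_split(X, y, features):
    """Return (feature, threshold, sse) with the smallest total squared error
    among the sampled features, or None if no split is possible."""

    def sse(values):
        if not values:
            return 0.0
        mean = sum(values) / len(values)
        return sum((v - mean) ** 2 for v in values)

    best = None
    for j in features:
        # Dropping the largest value keeps both child nodes non-empty.
        for t in sorted({row[j] for row in X})[:-1]:
            left = [yi for row, yi in zip(X, y) if row[j] <= t]
            right = [yi for row, yi in zip(X, y) if row[j] > t]
            score = sse(left) + sse(right)
            if best is None or score < best[2]:
                best = (j, t, score)
    return best


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical importance scores: feature 0 is far more informative.
    scores = [5.0, 0.1, 0.1]
    X = [[0, 5, 3], [0, 1, 8], [1, 4, 2], [1, 2, 9]]
    y = [0.0, 0.0, 10.0, 10.0]
    sampled = weighted_feature_sample(scores, mtry=2, rng=rng)
    print(sampled, best_split(X, y, sampled))
```

In a full forest, the importance scores would come from an initial standard Random Forests fit, and this weighted draw would replace the uniform draw of candidate features at every node. Setting all weights equal recovers the standard uniform sampling.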
