Over-sampling algorithm for imbalanced data classification

Authors: XU Xiaolong, CHEN Wen, SUN Yanfei

Affiliations: Jiangsu Key Laboratory of Big Data Security & Intelligent Processing, Nanjing University of Posts and Telecommunications, Nanjing 210023, China; Institute of Big Data Research at Yancheng, Nanjing University of Posts and Telecommunications, Yancheng 224000, China; Office of Scientific R&D, Nanjing University of Posts and Telecommunications, Nanjing 210023, China

Publication: Journal of Systems Engineering and Electronics (系统工程与电子技术, English edition)

Year, Volume, Issue: 2019, Vol. 30, No. 6

Pages: 1182-1191

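Subject classification: 0711 [Science - Systems Science]; 0808 [Engineering - Electrical Engineering]; 07 [Science]; 0809 [Engineering - Electronic Science and Technology (degrees may be conferred in Engineering or Science)]; 0802 [Engineering - Mechanical Engineering]; 0811 [Engineering - Control Science and Engineering]; 0812 [Engineering - Computer Science and Technology (degrees may be conferred in Engineering or Science)]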

Funding: Supported by the National Key Research and Development Program of China (2018YFB1003700), the Scientific and Technological Support Project (Society) of Jiangsu Province (BE2016776), the "333" Project of Jiangsu Province (BRA2017228, BRA2017401), and the Talent Project in Six Fields of Jiangsu Province (2015-JNHB-012).

Keywords: imbalanced data; density-based spatial clustering of applications with noise (DBSCAN); synthetic minority over-sampling technique (SMOTE); over-sampling

Abstract: For imbalanced datasets, the focus of classification is to identify samples of the minority class. Current data mining algorithms do not perform well enough on imbalanced datasets. The synthetic minority over-sampling technique (SMOTE) is specifically designed for learning from imbalanced datasets: it generates synthetic minority class examples by interpolating between nearby minority class examples. However, SMOTE suffers from the overgeneralization problem. The density-based spatial clustering of applications with noise (DBSCAN) algorithm is not rigorous when dealing with samples near the borderline. We optimize the DBSCAN algorithm for this problem to make clustering more reasonable. This paper integrates the optimized DBSCAN and SMOTE, and proposes a density-based synthetic minority over-sampling technique (DSMOTE). First, the optimized DBSCAN is used to divide the samples of the minority class into three groups: core samples, borderline samples and noise samples. The noise samples of the minority class are then removed so that more effective samples can be synthesized. To make full use of the information in core samples and borderline samples, different strategies are used to over-sample each of the two groups. Experiments show that DSMOTE achieves better results than SMOTE and Borderline-SMOTE in terms of precision, recall and F-value.
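
The pipeline described in the abstract (group minority samples by density, drop noise, then interpolate) can be illustrated with a minimal Python sketch. This is not the authors' DSMOTE: it uses the textbook DBSCAN definitions of core, border and noise points rather than the paper's optimized variant, and it applies the same plain SMOTE interpolation to every retained sample, because the abstract does not specify the separate strategies for core and borderline samples. The function names `label_minority` and `smote_interpolate`, the parameters `eps`, `min_pts` and `k`, and the toy data are all assumptions made for illustration.

```python
import math
import random


def label_minority(points, eps, min_pts):
    """Label each point 'core', 'border' or 'noise' (standard DBSCAN definitions).

    Assumption: this is plain DBSCAN labelling, not the paper's optimized variant.
    """
    n = len(points)
    # Each eps-neighborhood includes the point itself.
    neighbors = [
        [j for j in range(n) if math.dist(points[i], points[j]) <= eps]
        for i in range(n)
    ]
    is_core = [len(nb) >= min_pts for nb in neighbors]
    labels = []
    for i in range(n):
        if is_core[i]:
            labels.append("core")
        elif any(is_core[j] for j in neighbors[i]):
            labels.append("border")  # not dense itself, but within eps of a core point
        else:
            labels.append("noise")
    return labels


def smote_interpolate(points, k, n_new, rng):
    """Create n_new synthetic points, each on the segment between a random
    base point and one of its k nearest neighbors (basic SMOTE step).

    Requires at least two points.
    """
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(points)
        others = sorted(
            (p for p in points if p is not base),
            key=lambda p: math.dist(base, p),
        )
        neighbor = rng.choice(others[:k])
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append(tuple(b + gap * (q - b) for b, q in zip(base, neighbor)))
    return synthetic


# Hypothetical toy minority class: a dense square, one fringe point, one outlier.
minority = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0), (2.2, 1.0), (8.0, 8.0)]
labels = label_minority(minority, eps=1.5, min_pts=4)

# Drop noise before over-sampling, as the abstract describes.
kept = [p for p, lab in zip(minority, labels) if lab != "noise"]
synthetic = smote_interpolate(kept, k=3, n_new=10, rng=random.Random(0))
```

With these toy values the four points of the square are labelled core, (2.2, 1.0) is labelled border, and (8.0, 8.0) is labelled noise and excluded, so no synthetic sample is drawn toward the outlier. A fuller treatment would replace the shared interpolation step with separate rules for core and borderline samples.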
