Improving Association Rules Accuracy in Noisy Domains Using Instance Reduction Techniques
Author affiliations: College of Computing and Informatics, Saudi Electronic University, Riyadh 11673, Saudi Arabia; King Abdullah II School for Information Technology, The University of Jordan, Amman 11942, Jordan; Faculty of Engineering, Port Said University, Port Said 42523, Egypt
Publication: Computers, Materials & Continua
Year, volume, issue: 2022, Vol. 72, No. 8
Pages: 3719-3749
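Subject classification: 0808 [Engineering: Electrical Engineering]; 0809 [Engineering: Electronic Science and Technology (engineering or science degrees may be conferred)]; 07 [Science]; 08 [Engineering]; 071102 [Science: Systems Analysis and Integration]; 0831 [Engineering: Biomedical Engineering (engineering, science, or medicine degrees may be conferred)]; 0711 [Science: Systems Science]; 0805 [Engineering: Materials Science and Engineering (engineering or science degrees may be conferred)]; 081101 [Engineering: Control Theory and Control Engineering]; 0701 [Science: Mathematics]; 0811 [Engineering: Control Science and Engineering]; 0812 [Engineering: Computer Science and Technology (engineering or science degrees may be conferred)]; 0801 [Engineering: Mechanics (engineering or science degrees may be conferred)]; 081103 [Engineering: Systems Engineering]
Funding: The APC was funded by the Deanship of Scientific Research, Saudi Electronic University.
Keywords: association rules classification; instance reduction techniques; classification overfitting; noise; data cleansing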
Abstract: Association rule learning is a machine learning method used to find underlying associations in large [...]. Whether intentionally or unintentionally present, noise in training instances causes overfitting while building the classifier and negatively impacts classification [...]. This paper applies instance reduction techniques to the datasets before mining the association rules and building the [...]. Instance reduction techniques were originally developed to reduce memory requirements in instance-based [...]. This paper uses them to remove noise from the dataset before training the association rules [...]. [...] experiments were conducted to assess the accuracy of association rules with different instance reduction techniques, namely Decremental Reduction Optimization Procedure 3 (DROP3), DROP5, All k-Nearest Neighbors (ALLKNN), Edited Nearest Neighbor (ENN), and Repeated Edited Nearest Neighbor (RENN), in different noise [...]. [...] show that instance reduction techniques substantially improved the average classification accuracy at three noise levels: 0%, 5%, and 10%. The RENN algorithm achieved the highest accuracy, with a significant improvement on seven of the eight datasets used from the University of California Irvine (UCI) machine learning [...]. The improvements were more apparent in the 5% and 10% noise [...]. When RENN was applied, the average classification accuracy for the eight datasets in the zero-noise test improved from 70.47% to 76.65% compared to the original [...]. The average accuracy improved from 66.08% to 77.47% for the 5%-noise case and from 59.89% to 77.59% in the 10%-noise [...]. [...] confidence was also reported in building the association rules when RENN was [...]. The above results indicate that RENN is a good solution for removing noise and avoiding overfitting during the construction of the association rules classifier, especially in noisy domains.
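The noise-filtering step named in the abstract can be illustrated with a minimal sketch of ENN and RENN as they are commonly described: an instance is dropped when its label disagrees with the majority label of its k nearest neighbours, and RENN repeats that pass until nothing more is removed. The abstract does not give the authors' implementation, so the function names (`knn_labels`, `enn`, `renn`), the choice of k = 3, the Euclidean distance, and the toy data are all assumptions made for illustration. The association rules classifier that would be trained on the cleaned data is not shown.

```python
import math
from collections import Counter


def knn_labels(points, labels, index, k):
    """Labels of the k nearest neighbours of points[index], excluding itself."""
    dists = sorted(
        (math.dist(points[index], p), j)
        for j, p in enumerate(points)
        if j != index
    )
    return [labels[j] for _, j in dists[:k]]


def enn(points, labels, k=3):
    """One ENN pass: keep an instance only if it agrees with its neighbours.

    Every keep/drop decision is made against the full input set, so the
    order in which instances are visited does not change the result.
    """
    keep = []
    for i in range(len(points)):
        neighbours = knn_labels(points, labels, i, k)
        if not neighbours:  # a lone instance has nothing to disagree with
            keep.append(i)
            continue
        majority = Counter(neighbours).most_common(1)[0][0]
        if majority == labels[i]:
            keep.append(i)
    return [points[i] for i in keep], [labels[i] for i in keep]


def renn(points, labels, k=3):
    """Repeat ENN until a pass removes no instance."""
    while True:
        new_points, new_labels = enn(points, labels, k)
        if len(new_points) == len(points):
            return new_points, new_labels
        points, labels = new_points, new_labels


# Hypothetical toy data: two clusters, each with one mislabelled centre point.
points = [(0, 0), (0, 1), (1, 0), (1, 1), (0.5, 0.5),
          (5, 5), (5, 6), (6, 5), (6, 6), (5.5, 5.5)]
labels = ["a", "a", "a", "a", "b",
          "b", "b", "b", "b", "a"]

cleaned_points, cleaned_labels = renn(points, labels, k=3)
print(len(points), "->", len(cleaned_points), "instances after RENN")
```

In this toy case the two mislabelled centre points are the only instances whose neighbours outvote them, so they are the ones removed. With an even k or more than two classes, ties in the majority vote would need an explicit rule; this sketch simply takes the first label returned by `Counter.most_common`.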