Remove Redundancy Samples for SVM in A Chinese Word Segmentation Task
Author affiliation: School of Information Science and Engineering, Northeastern University, Shenyang 110004, China
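Publication: Journal of Communication and Computer (《通讯和计算机(中英文版)》)
Year, volume, issue: 2006, Vol. 3, No. 5
Pages: 103-107
Subject classification: 08 [Engineering] 0812 [Engineering - Computer Science and Technology (degrees in Engineering or Science may be conferred)]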
Abstract: This paper proposes an algorithm that removes a large number of redundant samples when an SVM is used for Chinese word segmentation, without much loss in final experimental performance. This eases the training of Chinese word segmentation to a certain extent. The algorithm is fast and incurs no extra cost. Both theoretical analysis and experiments show that the algorithm works well: it removes almost 45% of the redundant samples, while the precision ratio of our Chinese word segmentation drops by less than 3%.