Supervised Contrastive Learning with Term Weighting for Improving Chinese Text Classification
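Author affiliation: School of Cyber Science and Engineering, Wuhan University, Wuhan 430000, China
Publication: Tsinghua Science and Technology (Journal of Tsinghua University, Natural Science Edition, English Edition)
Year/Volume/Issue: 2023, Vol. 28, No. 1
Pages: 59-68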
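Subject classification: 1205 [Management - Library, Information and Archives Management]; 0502 [Literature - Foreign Languages and Literature]; 050201 [Literature - English Language and Literature]; 05 [Literature]; 0812 [Engineering - Computer Science and Technology (engineering or science degrees may be conferred)]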
Funding: supported by the National Natural Science Foundation of China (No. U1936122) and the Primary Research & Development Plan of Hubei Province (Nos. 2020BAB101 and 2020BAA003)
Keywords: Chinese text classification; Supervised Contrastive Learning (SCL); Term Weighting (TW); Temporal Convolution Network (TCN)
Abstract: With the rapid growth of information retrieval technology, Chinese text classification, which is the basis of information content security, has become a widely discussed *** view of the huge difference compared with English, Chinese text task is more complex in semantic information ***, most existing Chinese text classification approaches typically regard feature representation and feature selection as the key points, but fail to take into account the learning strategy that adapts to the ***, these approaches compress the Chinese word into a representation vector, without considering the distribution of the term among the categories of *** order to improve the effect of Chinese text classification, a unified method, called Supervised Contrastive Learning with Term Weighting (SCL-TW), is proposed in this *** contrastive learning makes full use of a large amount of unlabeled data to improve model *** SCL-TW, we calculate the score of term weighting to optimize the process of data augmentation of Chinese ***, the transformed features are fed into a temporal convolution network to conduct feature *** verifications are conducted on two Chinese benchmark *** results demonstrate that SCL-TW outperforms other advanced Chinese text classification approaches by an amazing margin.
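The abstract names supervised contrastive learning but does not give the loss used in SCL-TW. As a rough illustration only, the sketch below implements the commonly used supervised contrastive loss, in which samples sharing a label are pulled together and all other samples in the batch are pushed apart. The function name `supcon_loss`, the `temperature` value, and the toy vectors are assumptions for illustration; the term-weighting augmentation and the temporal convolution network are not covered here.

```python
import math


def supcon_loss(embeddings, labels, temperature=0.1):
    """Generic supervised contrastive loss over one batch (illustrative only).

    embeddings: list of equal-length float vectors (hypothetical encoder output).
    labels: list of class ids, one per embedding.
    """
    n = len(embeddings)

    # L2-normalise each vector so that dot products are cosine similarities.
    normed = []
    for v in embeddings:
        norm = math.sqrt(sum(x * x for x in v)) or 1.0
        normed.append([x / norm for x in v])

    total, anchors = 0.0, 0
    for i in range(n):
        positives = [j for j in range(n) if j != i and labels[j] == labels[i]]
        if not positives:
            continue  # an anchor with no same-class partner contributes nothing

        # Temperature-scaled similarity of the anchor to every other sample.
        sims = {
            j: sum(a * b for a, b in zip(normed[i], normed[j])) / temperature
            for j in range(n)
            if j != i
        }

        # Numerically stable log-sum-exp for the softmax denominator.
        m = max(sims.values())
        log_denom = m + math.log(sum(math.exp(s - m) for s in sims.values()))

        # Average negative log-probability over the anchor's positives.
        total += -sum(sims[p] - log_denom for p in positives) / len(positives)
        anchors += 1

    return total / anchors if anchors else 0.0


# Toy batch: two tight clusters in 2-D.
vectors = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
aligned = supcon_loss(vectors, [0, 0, 1, 1])     # labels match the clusters
misaligned = supcon_loss(vectors, [0, 1, 0, 1])  # labels cut across the clusters
```

In this toy batch the loss is small when the labels agree with the geometric clusters and large when they cut across them, which is the behaviour a contrastive objective of this kind is meant to encourage.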