Hierarchical clustering based on single-pass for breaking topic detection and tracking
Author affiliations: School of Computers, Guangdong University of Technology; School of Software Engineering, South China University of Technology
Publication: High Technology Letters (高技术通讯(英文版))
Year, volume, issue: 2018, Vol. 24, No. 4
Pages: 369-377
Funding: Supported by the National Natural Science Foundation of China (No. 61502312), the Fundamental Research Funds for the Central Universities (No. 2017BQ024), the Natural Science Foundation of Guangdong Province (No. 2017A030310428), and the Science and Technology Program of Guangzhou (No. 201806020075, 20180210025)
Keywords: topic detection and tracking (TDT); single-pass; hierarchical clustering; text clustering; topic modeling
Abstract: Single-pass clustering is commonly used in topic detection and tracking (TDT) because of its simplicity, high efficiency, and low cost. On large-scale data, however, its time cost increases sharply and its clustering performance degrades considerably. To address this problem, a hierarchical clustering algorithm based on single-pass is proposed. Inspired by hierarchical and concurrent ideas, it divides the clustering process into three stages. News reports are classified into different categories *** single-pass clustering is run twice within each category, followed by one agglomerative clustering across categories. In addition, to capture semantic similarity between news reports, the topic model is improved using named entities. Experimental results show that the proposed method both accelerates the clustering process and improves its performance.
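For orientation, the baseline single-pass procedure that the abstract builds on can be sketched as follows. This is a minimal illustration and not the authors' implementation: the bag-of-words vectors, the cosine similarity measure, the summed-count centroid, the threshold value of 0.3, and the names `cosine` and `single_pass` are all assumptions made for the example. The record does not specify the paper's features, its named-entity topic model, or its three-stage scheme in enough detail to reproduce them.

```python
from collections import Counter
from math import sqrt


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(count * b[term] for term, count in a.items() if term in b)
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def single_pass(docs, threshold=0.3):
    """Cluster documents in one pass over the input.

    Each document joins the most similar existing cluster if that similarity
    reaches `threshold` (an assumed value); otherwise it opens a new cluster.
    """
    clusters = []
    for idx, text in enumerate(docs):
        vec = Counter(text.lower().split())  # naive whitespace tokenization
        best, best_sim = None, 0.0
        for cluster in clusters:
            sim = cosine(vec, cluster["centroid"])
            if sim > best_sim:
                best, best_sim = cluster, sim
        if best is not None and best_sim >= threshold:
            best["centroid"].update(vec)  # centroid is the summed term counts
            best["members"].append(idx)
        else:
            clusters.append({"centroid": Counter(vec), "members": [idx]})
    return clusters


docs = [
    "earthquake hits city rescue teams",
    "rescue teams reach earthquake city",
    "stock market falls sharply",
    "market falls as stock prices drop",
]
print([c["members"] for c in single_pass(docs)])  # [[0, 1], [2, 3]]
```

Every incoming document is compared against every existing cluster, so the cost grows with the number of clusters. This is the scaling problem the abstract describes, and it is why restricting the comparison to clusters within one category at a time can reduce the work.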