Keyword Extraction Based on tf/idf for Chinese News Document

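Authors: LI Juanzi, FAN Qi'na, ZHANG Kuo

Affiliation: Department of Computer Science and Technology, Tsinghua University, Beijing 100084, China

Publication: Wuhan University Journal of Natural Sciences (武汉大学学报(自然科学英文版))

Year, volume, issue: 2007, Vol. 12, No. 5

Pages: 917-921

Subject classification: 081203 [Engineering: Computer Application Technology]; 08 [Engineering]; 0835 [Engineering: Software Engineering]; 0812 [Engineering: Computer Science and Technology (engineering or science degree may be conferred)]

Funding: Supported by the National Natural Science Foundation of China (90604025)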

Subject terms: keyword extraction; keyphrase extraction; news keyword

Abstract: Keyword extraction is an important research topic in information retrieval. This paper specifies what counts as a keyword in Chinese news documents, based on an analysis of the linguistic characteristics of news documents, and then proposes a new keyword extraction method based on tf/idf with multiple strategies. The approach selects uni-, bi-, and tri-grams as candidate keywords and defines features according to their morphological characteristics and context information. The paper also proposes several strategies to repair incomplete words produced by word segmentation and to find unknown potential keywords in news documents. Experimental results show that the proposed method significantly outperforms the baseline method. The method was also applied to retrospective event detection, where experimental results show that it significantly improves the accuracy and efficiency of news retrospective event detection.
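
The candidate selection and tf/idf weighting mentioned in the abstract can be illustrated with a minimal sketch. This is plain tf-idf over uni-, bi-, and tri-gram candidates, not the paper's multi-strategy method: the morphological and context features and the segmentation repair strategies are not reproduced here. It assumes each document has already been segmented into a list of tokens, and the function names `ngrams` and `tfidf_keywords` are hypothetical.

```python
import math
from collections import Counter


def ngrams(tokens, max_n=3):
    """Yield every uni-, bi-, and tri-gram of a token list as a tuple."""
    for n in range(1, max_n + 1):
        for i in range(len(tokens) - n + 1):
            yield tuple(tokens[i:i + n])


def tfidf_keywords(docs, doc_index, top_k=5):
    """Rank the n-gram candidates of one document by tf * idf.

    docs is a list of pre-segmented documents (lists of tokens);
    this is an assumption of the sketch, not something the abstract specifies.
    """
    # Document frequency: number of documents that contain each candidate.
    df = Counter()
    for tokens in docs:
        df.update(set(ngrams(tokens)))

    # Term frequency of each candidate within the target document.
    tf = Counter(ngrams(docs[doc_index]))
    total = sum(tf.values())
    n_docs = len(docs)

    scores = {
        gram: (count / total) * math.log(n_docs / df[gram])
        for gram, count in tf.items()
    }
    # Highest score first; ties are broken by the candidate itself.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return ranked[:top_k]
```

A candidate that occurs in every document gets an idf of zero and therefore sinks to the bottom of the ranking, which is the basic behavior any tf/idf-based extractor relies on.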
