Blog Post Extraction Using Title Finding

Authors: Linhai Song (1,2), Xueqi Cheng (1), Yan Guo (1), Bo Wu (1,2), Yu Wang (1,2+)
Affiliations: 1 Institute of Computing Technology, Chinese Academy of Sciences, Beijing; 2 Graduate School of the Chinese Academy of Sciences, Beijing

Conference: 5th National Conference on Information Retrieval (第五届全国信息检索学术会议)

Conference date: 2009

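Subject classification: 081203 [Engineering - Computer Application Technology]; 08 [Engineering]; 0835 [Engineering - Software Engineering]; 0812 [Engineering - Computer Science and Technology (degrees in engineering or science may be conferred)]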

Funding: Supported by the National High Technology Research and Development Program of China (863 Program), project no. 2007AA01Z438

Keywords: Blog Post; Title Finding; VIPS; SVM

Abstract: With the development of Web 2.0, web mining applications pay more attention to blog *** order to prevent noise in blog pages from affecting the precision of web mining algorithms, it is necessary to acquire posts from blog pages *** this paper, we propose a blog post extraction algorithm which uses title *** are two stages in the *** the first stage, text nodes which indicate the title of the post are found and used as the beginning of the *** take a machine learning approach to realize this stage, and employ SVM as classification *** the second stage, we find the end of the *** methods are introduced in this stage: one uses VIPS segmentation results, and the other is based on hand-coded rules. Experiments are conducted to see how titles are found and how posts are *** results show that our algorithm can obtain promising results.
