Unsupervised Graph-Based Tibetan Multi-Document Summarization

Authors: Xiaodong Yan Yiqin Wang Wei Song Xiaobing Zhao A.Run Yang Yanxing

Affiliations: School of Information and Engineering, Minzu University of China, Beijing 100081, China; National Language Resource Monitoring & Research Center, Minority Languages Branch, Beijing 100081, China; University of California, Irvine, California 92617, USA; Department of Physics, New Jersey Institute of Technology, Newark, New Jersey 07102-1982, USA

Publication: Computers, Materials & Continua

Year/Volume/Issue: 2022, Vol. 73, No. 10

Pages: 1769-1781

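Subject classification: 0502 [Literature - Foreign Languages and Literatures]; 050201 [Literature - English Language and Literature]; 05 [Literature]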

Funding: This work was supported in part by the National Science Foundation Project of P.R. China under Grant No. 52071349, and partially supported by the Young and Middle-aged Talents Project of the State Ethnic Affairs Commission.

Keywords: Multi-document summarization; text clustering; topic feature fusion; graphic model

Abstract: Text summarization creates a subset that represents the most important or relevant information in the original content, which effectively reduces information *** neural network method has achieved good results in the task of text summarization both in Chinese and English, but the research of text summarization in low-resource languages is still in the exploratory stage, especially in ***’s more, there is no large-scale annotated corpus for text *** lack of dataset severely limits the development of low-resource text *** this case, unsupervised learning approaches are more appealing in low-resource languages as they do not require labeled *** this paper, we propose an unsupervised graph-based Tibetan multi-document summarization method, which divides a large number of Tibetan news documents into topics and extracts the summarization of each *** obtained by using traditional graph-based methods have high redundancy and the division of document topics is not detailed *** terms of topic division, we adopt a two-level clustering method that converts the original documents into document-level and sentence-level graphs; next, we take both linguistic and deep representations into account and integrate an external corpus into the graph to obtain the sentence semantic *** the shortcomings of the traditional K-Means clustering method and perform more detailed clustering of *** model sentence clusters into graphs, finally re-measure sentence nodes based on the topic semantic information and the impact of topic features on sentences, a summary with higher topic relevance is *** order to promote the development of Tibetan text summarization, and to meet the needs of relevant researchers for high-quality Tibetan text summarization datasets, this paper manually constructs a Tibetan summarization dataset and carries out relevant *** experiment results show that our method can effectively improve the
