Focused crawling strategies based on ontologies and simulated annealing methods for rainstorm disaster domain knowledge

Chinese title: 基于本体和模拟退火算法的暴雨灾害主题爬虫策略

Authors: Jingfa LIU; Fan LI; Ruoyao DING; Zi'ang LIU

Author affiliations: Guangzhou Key Laboratory of Multilingual Intelligent Processing, Guangdong University of Foreign Studies, Guangzhou 510006, China; School of Information Science and Technology, Guangdong University of Foreign Studies, Guangzhou 510006, China; School of Computer and Software, Nanjing University of Information Science & Technology, Nanjing 210044, China; Faculty of Science, University of Alberta, Edmonton T6G2H6, Canada

Publication: Frontiers of Information Technology & Electronic Engineering (信息与电子工程前沿(英文版))

Year, volume, issue: 2022, Vol. 23, No. 8

Pages: 1189-1204

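Subject classification: 07 [Science]; 070601 [Science: Meteorology]; 081203 [Engineering: Computer Application Technology]; 08 [Engineering]; 0706 [Science: Atmospheric Sciences]; 0835 [Engineering: Software Engineering]; 0812 [Engineering: Computer Science and Technology (engineering or science degrees may be conferred)]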

Funding: Supported by the Special Foundation of Guangzhou Key Laboratory of Multilingual Intelligent Processing, China (No. 201905010008), the Program of Science and Technology of Guangzhou, China (No. 202002030238), and the Guangdong Basic and Applied Basic Research Foundation, China (No. 2021A1515011974)

Keywords: Focused crawler; Ontology; Priority evaluation; Simulated annealing; Rainstorm disaster

Abstract: At present, focused crawler is a crucial method for obtaining effective domain knowledge from massive heterogeneous *** most current focused crawling technologies, there are some difficulties in obtaining high-quality crawling *** main difficulties are the establishment of topic benchmark models, the assessment of topic relevance of hyperlinks, and the design of crawling *** this paper, we use domain ontology to build a topic benchmark model for a specific topic, and propose a novel multiple-filtering strategy based on local ontology and global ontology (MFSLG). A comprehensive priority evaluation method (CPEM) based on the web text and link structure is introduced to improve the computation precision of topic relevance for unvisited hyperlinks, and a simulated annealing (SA) method is used to avoid the focused crawler falling into local optima of the *** incorporating SA into the focused crawler with MFSLG and CPEM for the first time, two novel focused crawler strategies based on ontology and SA (FCOSA), including FCOSA with only global ontology (FCOSA_G) and FCOSA with both local ontology and global ontology (FCOSA_LG), are proposed to obtain topic-relevant webpages about rainstorm disasters from the *** results show that the proposed crawlers outperform the other focused crawling strategies on different performance metric indices.
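The abstract states that simulated annealing is used to keep the crawler from falling into local optima, but this record does not give the procedure. As a rough illustration only, the sketch below applies the generic Metropolis acceptance rule to the choice of the next hyperlink from a frontier of scored links. It is not the FCOSA algorithm from the paper: the function names, the frontier layout (a dict from URL to priority), and the temperature values are all assumptions made for this example.

```python
import math
import random


def acceptance_probability(best_priority, candidate_priority, temperature):
    """Generic Metropolis rule (an assumption, not taken from the paper).

    A candidate at least as good as the best link is always accepted.
    A worse candidate is accepted with probability exp(-gap / temperature),
    and never accepted once the temperature has reached zero.
    """
    if candidate_priority >= best_priority:
        return 1.0
    if temperature <= 0:
        return 0.0
    return math.exp((candidate_priority - best_priority) / temperature)


def select_next_link(frontier, temperature, rng):
    """Pick the next URL to crawl from a hypothetical frontier.

    frontier: dict mapping URL -> priority score (higher is better).
    A purely greedy crawler would always take the best-scored URL. Here a
    random candidate may replace it, with a probability that shrinks as
    the temperature is lowered.
    """
    best_url = max(frontier, key=frontier.get)
    candidate_url = rng.choice(sorted(frontier))
    p = acceptance_probability(
        frontier[best_url], frontier[candidate_url], temperature
    )
    return candidate_url if rng.random() < p else best_url


if __name__ == "__main__":
    # Hypothetical URLs and scores, for illustration only.
    frontier = {"page_a": 0.2, "page_b": 0.9, "page_c": 0.5}
    rng = random.Random(0)
    temperature = 1.0
    for _ in range(5):
        print(temperature, select_next_link(frontier, temperature, rng))
        temperature *= 0.5  # simple geometric cooling, also an assumption
```

At a high temperature the selection occasionally follows a lower-scored link, which is what lets a search leave a locally good but globally poor region. As the temperature approaches zero the selection reduces to the greedy choice.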
