A Multi-Threaded Semantic Focused Crawler
Author affiliations: Department of Computer Science, University of Delhi, Delhi-110007, India; Department of Computer Science, Keshav Mahavidyalaya, University of Delhi, Delhi-110007, India; Department of Computer Science, Dyal Singh College, University of Delhi, Delhi-110007, India
Publication: Journal of Computer Science & Technology (计算机科学技术学报(英文版))
Year, volume, issue: 2012, Vol. 27, No. 6
Pages: 1233-1242
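Subject classification: 081203 [Engineering - Computer Application Technology]; 08 [Engineering]; 080402 [Engineering - Measurement Technology and Instruments]; 0804 [Engineering - Instrument Science and Technology]; 0835 [Engineering - Software Engineering]; 0812 [Engineering - Computer Science and Technology (engineering or science degree may be conferred)]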
Keywords: eLearning; semantic focused crawler; semantically expanded term; ontology
Abstract: The Web holds a large volume of rich learning content. The ever-growing volume of learning resources, however, leads to the problem of information overload. The large number of irrelevant results returned by search engines based on keyword matching further aggravates the problem. In such a scenario, a learner needs semantically matched learning resources as search results. Given the volume of content and the significance of semantic knowledge, this paper proposes a multi-threaded semantic focused crawler (SFC) designed and implemented specifically to crawl the WWW for educational learning content. The proposed SFC uses a domain ontology to expand a topic term and a set of seed URLs to initiate the crawl. Results from multiple iterations of the crawl on various topics are presented and compared with those obtained by running an open-source crawler on a similar dataset. The results are evaluated using Semantic Similarity, a vector space model based metric, and the harvest ratio.
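The ideas named in the abstract (ontology-based term expansion, a vector space relevance score, and the harvest ratio) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration, not the authors' implementation: the toy ontology, the in-memory pages standing in for fetched documents, the function names, and the threshold are all hypothetical. Cosine similarity over term-frequency vectors is used as a common vector space measure, and the harvest ratio is taken in its usual sense as the fraction of crawled pages judged relevant; the record does not give the paper's exact formulas.

```python
import math
from collections import Counter
from concurrent.futures import ThreadPoolExecutor

# Hypothetical toy ontology: topic term -> related concepts (illustrative only).
TOY_ONTOLOGY = {"sorting": ["quicksort", "mergesort", "algorithm"]}

# Hypothetical in-memory "pages" standing in for documents a crawler would fetch.
PAGES = {
    "p1": "quicksort is a sorting algorithm",
    "p2": "cheap flights and hotels",
}


def expand_term(topic, ontology):
    """Return the topic term followed by its ontology neighbours."""
    return [topic] + ontology.get(topic, [])


def cosine_similarity(terms_a, terms_b):
    """Cosine similarity between two term-frequency vectors."""
    va, vb = Counter(terms_a), Counter(terms_b)
    dot = sum(va[t] * vb[t] for t in va)
    norm_a = math.sqrt(sum(c * c for c in va.values()))
    norm_b = math.sqrt(sum(c * c for c in vb.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def harvest_ratio(scores, threshold):
    """Fraction of crawled pages whose score meets the relevance threshold."""
    if not scores:
        return 0.0
    return sum(1 for s in scores if s >= threshold) / len(scores)


def score_pages(topic, pages, ontology, workers=2):
    """Score every page against the expanded topic using a thread pool."""
    query = expand_term(topic, ontology)
    texts = list(pages.values())
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() preserves input order, so scores line up with the pages.
        return list(pool.map(lambda t: cosine_similarity(query, t.split()), texts))


scores = score_pages("sorting", PAGES, TOY_ONTOLOGY)
ratio = harvest_ratio(scores, threshold=0.3)
```

Here the thread pool only parallelizes scoring; a real crawler would also fetch pages and manage a URL frontier in worker threads, which this sketch leaves out.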