Unsupervised WSD by Finding the Predominant Sense Using Context as a Dynamic Thesaurus
Author affiliations: San Pablo Catholic University, Arequipa, Peru; Center for Computing Research, National Polytechnic Institute, Mexico City 07738, Mexico; Nara Institute of Science and Technology, Takayama, Ikoma, Nara 630-0192, Japan
Publication: Journal of Computer Science & Technology (English edition)
Year, volume, issue: 2010, Vol. 25, No. 5
Pages: 1030-1039
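Subject classification: 12 [Management]; 1201 [Management: Management Science and Engineering (degrees in Management or Engineering may be conferred)]; 0808 [Engineering: Electrical Engineering]; 081104 [Engineering: Pattern Recognition and Intelligent Systems]; 08 [Engineering]; 0835 [Engineering: Software Engineering]; 0701 [Science: Mathematics]; 0811 [Engineering: Control Science and Engineering]; 0812 [Engineering: Computer Science and Technology (degrees in Engineering or Science may be conferred)]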
Funding: Supported by the Mexican Government (SNI, SIP-IPN, COFAA-IPN, and PIFI-IPN), CONACYT, and the Japanese Government
Keywords: word sense disambiguation; word space model; semantic similarity; text corpus; thesaurus
Abstract: We present and analyze an unsupervised method for Word Sense Disambiguation (WSD). Our work is based on the method presented by McCarthy et al. in 2004 for finding the predominant sense of each word in the entire corpus. Their maximization algorithm allows weighted terms (similar words) from a distributional thesaurus to accumulate a score for each sense of an ambiguous word; that is, the sense with the highest score is chosen based on votes from a weighted list of terms related to the ambiguous word. This list is obtained using the distributional similarity method that Dekang Lin proposed for building a thesaurus. In the method of McCarthy et al., every occurrence of the ambiguous word uses the same thesaurus, regardless of the context in which the ambiguous word appears. Our method accounts for the context of a word when determining its sense by building the list of distributionally similar words from the syntactic context of the ambiguous word. We obtain a top precision of 77.54% versus 67.10% for the original method tested on ***. We also analyze the effect of the number of weighted terms on the tasks of finding the Most Frequent Sense (MFS) and WSD, and experiment with several corpora for building the Word Space Model.
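The voting scheme summarized in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function `predominant_sense`, the sense labels, the term weights, and the `toy_relatedness` table are all hypothetical, and the per-term normalization of votes is an assumption taken from the usual formulation of predominant-sense scoring rather than from this record. The only difference between the "static" and the "dynamic" setting in the sketch is which weighted term list is passed in.

```python
from collections import defaultdict

def predominant_sense(senses, weighted_terms, relatedness):
    """Return the sense with the highest accumulated vote, or None.

    senses:         candidate sense identifiers of the ambiguous word
    weighted_terms: (term, weight) pairs, e.g. distributionally similar words
    relatedness:    function (sense, term) -> non-negative float
    """
    scores = defaultdict(float)
    for term, weight in weighted_terms:
        total = sum(relatedness(s, term) for s in senses)
        if total == 0:
            continue  # this term says nothing about any sense
        for s in senses:
            # Each term splits its weight among the senses (assumed normalization).
            scores[s] += weight * relatedness(s, term) / total
    if not scores:
        return None
    return max(senses, key=lambda s: scores[s])

# Hypothetical toy data, for illustration only.
SENSES = ["bank#finance", "bank#river"]
REL = {
    ("bank#finance", "money"): 0.9, ("bank#river", "money"): 0.1,
    ("bank#finance", "loan"): 0.8,  ("bank#river", "loan"): 0.2,
    ("bank#finance", "water"): 0.1, ("bank#river", "water"): 0.9,
    ("bank#finance", "shore"): 0.0, ("bank#river", "shore"): 1.0,
}

def toy_relatedness(sense, term):
    return REL.get((sense, term), 0.0)

# One fixed list for every occurrence (static thesaurus) ...
STATIC_TERMS = [("money", 0.5), ("loan", 0.4), ("water", 0.1)]
# ... versus a list built from the context of one particular occurrence.
CONTEXT_TERMS = [("water", 0.6), ("shore", 0.3), ("money", 0.1)]

static_choice = predominant_sense(SENSES, STATIC_TERMS, toy_relatedness)
context_choice = predominant_sense(SENSES, CONTEXT_TERMS, toy_relatedness)
```

With these toy numbers the static list selects `bank#finance` for every occurrence, while the context-specific list selects `bank#river`, which is the kind of difference the context-dependent approach is meant to capture.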