Acquiring Selectional Preferences in a Thai Lexical Database
Author affiliation: Thai Computational Linguistics Laboratory, Communications Research Laboratory, 112 Paholyothin Road, Klong 1, Klong Luang, Pathumthani 12120
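Conference: The First Joint Conference on Natural Language Processing
Conference date: 2004
Subject classification: 081203 [Engineering - Computer Application Technology] 08 [Engineering] 0835 [Engineering - Software Engineering] 0812 [Engineering - Computer Science and Technology (degrees awardable in Engineering or Science)]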
Abstract: In this paper, we consider the problem of enriching a Thai lexical database by extending its semantic information with selectional preferences. We propose a novel approach for acquiring the selectional preferences of verbs, motivated by the tree cut model. We apply a model selection technique called the Bayesian Information Criterion (BIC). Given a semantic hierarchy, our goal is to generalize initial noun classes to the most plausible levels of that hierarchy. We present an iterative algorithm for generalization. The algorithm performs agglomerative merging on the semantic hierarchy in a bottom-up manner. The BIC is used to measure the improvement of the model both locally and globally. In our experiments, we treat the Web as a large corpus. We also propose approaches for extracting examples from the Web. Preliminary experimental results are given to show the feasibility and effectiveness of our approach.
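
The abstract gives no algorithmic detail, so the following is only a minimal sketch of BIC-guided bottom-up merging on a semantic hierarchy, not the authors' implementation. It assumes a tree-cut-style likelihood in which each class on the cut shares its probability mass uniformly among its leaf nouns, and it uses the standard definition BIC = -2 ln L + k ln n, with k the number of free parameters (classes on the cut minus one). The toy hierarchy, the counts, and the function names `leaves_under`, `bic`, and `generalize` are all hypothetical.

```python
import math

def leaves_under(node, children):
    """Return the leaf nouns dominated by `node` in the hierarchy."""
    kids = children.get(node, [])
    if not kids:
        return [node]
    found = []
    for kid in kids:
        found.extend(leaves_under(kid, children))
    return found

def bic(cut, children, counts):
    """BIC of a cut: -2 * log-likelihood + (free parameters) * ln(sample size).

    Assumption: a class's probability is its relative frequency, shared
    uniformly among the leaves it dominates.
    """
    n = sum(counts.values())
    log_lik = 0.0
    for cls in cut:
        members = leaves_under(cls, children)
        cls_count = sum(counts.get(m, 0) for m in members)
        if cls_count == 0:
            continue  # unobserved classes contribute nothing to the likelihood
        p_leaf = cls_count / n / len(members)
        log_lik += cls_count * math.log(p_leaf)
    k = len(cut) - 1
    return -2.0 * log_lik + k * math.log(n)

def generalize(root, children, counts):
    """Greedy agglomerative merging: start from the leaves and replace a set
    of sibling classes by their parent whenever that lowers the BIC."""
    cut = set(leaves_under(root, children))
    best = bic(cut, children, counts)
    improved = True
    while improved:
        improved = False
        for parent, kids in children.items():
            if kids and all(kid in cut for kid in kids):
                candidate = (cut - set(kids)) | {parent}
                score = bic(candidate, children, counts)
                if score < best:
                    cut, best, improved = candidate, score, True
                    break
    return cut, best

# Hypothetical toy data: nouns seen as the subject of a verb such as "fly".
children = {
    "ENTITY": ["ANIMAL", "ARTIFACT"],
    "ANIMAL": ["bird", "bee"],
    "ARTIFACT": ["car", "desk"],
}
counts = {"bird": 40, "bee": 40, "car": 1, "desk": 1}

cut, score = generalize("ENTITY", children, counts)
print(sorted(cut))
```

On this toy input the merging stops at the cut {ANIMAL, ARTIFACT}: merging siblings with equal counts leaves the likelihood unchanged while removing a parameter, whereas merging up to ENTITY would flatten the strong preference for animal nouns and raise the BIC.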