A Multiple Feature Approach for Disorder Normalization in Clinical Notes
Author affiliations: School of Computer, Wuhan University; Department of Chinese Language and Literature, Hubei University of Art and Science; Shandong Key Lab of Language Resource Development and Application, Ludong University
Publication: Wuhan University Journal of Natural Sciences (武汉大学学报(自然科学英文版))
Year, volume, issue: 2016, Vol. 21, No. 6
Pages: 482-490
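Subject classification: 081203 [Engineering - Computer Application Technology]; 08 [Engineering]; 0835 [Engineering - Software Engineering]; 0812 [Engineering - Computer Science and Technology (engineering or science degrees may be awarded)]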
Funding: Supported by the National Natural Science Foundation of China (61133012, 61202193, 61373108), the Major Projects of the National Social Science Foundation of China (11&ZD189), the Chinese Postdoctoral Science Foundation (2013M540593, 2014T70722), and the Open Foundation of Shandong Key Laboratory of Language Resource Development and Application
Keywords: natural language processing; disorder normalization; Levenshtein distance; semantic composition; multiple features
Abstract: In this paper, we propose a multiple feature approach to the disorder normalization task, which maps each disorder mention in a text to a unique Unified Medical Language System (UMLS) concept unique identifier (CUI). We develop a two-step method: first, we acquire a list of candidate CUIs and their associated preferred names using the UMLS API; second, we choose the closest CUI by calculating the similarity between the input disorder mention and each candidate. The similarity calculation step is formulated as a classification problem, and multiple features (string features, ranking features, similarity features, and contextual features) are used to normalize the disorder mentions. The results show that the multiple feature approach improves the accuracy of the normalization task from 32.99% (the MetaMap baseline) to 67.08%.
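The second step of the method, choosing the closest candidate for a mention, can be illustrated with a minimal Python sketch. It covers only one ingredient named in the keywords, the Levenshtein distance, turned into a normalized string similarity. It is not the paper's classifier, which combines string, ranking, similarity, and contextual features. The function names, the normalization by the longer string's length, and the placeholder identifiers `CUI_A` and `CUI_B` are assumptions made for illustration; a real candidate list would come from the UMLS API.

```python
from typing import List, Tuple


def levenshtein(a: str, b: str) -> int:
    """Edit distance between a and b (insertions, deletions, substitutions)."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ch_a in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to ""
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # delete ch_a
                current[j - 1] + 1,      # insert ch_b
                previous[j - 1] + cost,  # substitute or match
            ))
        previous = current
    return previous[-1]


def similarity(mention: str, name: str) -> float:
    """Case-insensitive similarity in [0, 1]; 1.0 means identical strings.

    Assumption: normalize the distance by the longer string's length.
    """
    m, n = mention.lower(), name.lower()
    longest = max(len(m), len(n))
    return 1.0 if longest == 0 else 1.0 - levenshtein(m, n) / longest


def closest_candidate(mention: str,
                      candidates: List[Tuple[str, str]]) -> Tuple[str, str]:
    """Return the (identifier, preferred name) pair most similar to the mention."""
    return max(candidates, key=lambda c: similarity(mention, c[1]))


# Hypothetical candidates; the identifiers are placeholders, not real CUIs.
candidates = [("CUI_A", "heart attack"), ("CUI_B", "heart failure")]
best = closest_candidate("Heart attack", candidates)
```

In the approach described in the abstract, a score of this kind would serve as one input feature to the classifier rather than as the final decision rule.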