Research on Text Mining of Syndrome Element Syndrome Differentiation by Natural Language Processing
Author affiliations: Hunan University of Chinese Medicine, Changsha, Hunan 410208, China; TCM Diagnostic Institute, Hunan University of Chinese Medicine, Changsha, Hunan 410208, China; Administration of Traditional Chinese Medicine of Hunan Province, Changsha, Hunan 410008, China; Guangzhou Jiayibang Health Management Co., Ltd., Guangzhou, Guangdong 510000, China
Publication: Digital Chinese Medicine
Year, volume, issue: 2019, Vol. 2, No. 2
Pages: 61-71
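Subject classification: 100505 [Medicine - Diagnostics of Traditional Chinese Medicine]; 1005 [Medicine - Traditional Chinese Medicine]; 081203 [Engineering - Computer Application Technology]; 08 [Engineering]; 0835 [Engineering - Software Engineering]; 0812 [Engineering - Computer Science and Technology (engineering or science degrees may be conferred)]; 10 [Medicine]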
Funding: National Natural Science Foundation of China (No. 81874429); Digital and Applied Research Platform for Diagnosis of Traditional Chinese Medicine (No. 49021003005); 2018 Hunan Provincial Postgraduate Research Innovation Project (No. CX2018B465); Excellent Youth Project of Hunan Education Department in 2018 (No. 18B241)
Topics: Syndrome element syndrome differentiation (SESD); Natural language processing (NLP); Diagnostics of TCM; Artificial intelligence; Text mining
Abstract: Objective To use natural language processing (NLP) to mine and visualize the core content of syndrome element syndrome differentiation (SESD). Methods First, a text mining and analysis environment was built in Python, and a corpus was constructed from the core chapters of SESD. Second, the corpus was digitized; the main steps were word segmentation, information cleaning and merging, construction of a document-entry matrix, dictionary compilation, and information conversion. Third, the internal information of the SESD corpus was mined and displayed by means of word clouds, keyword extraction, and visualization. Results NLP played a positive role in computer recognition and comprehension of SESD. Different chapters had different keywords and weights. Deficiency syndrome elements, such as Qi deficiency, Yang deficiency, and Yin deficiency, were an important component of SESD. Important syndrome elements of substantiality included Blood stasis and Qi stagnation, among others. Core syndrome elements were closely related. Conclusions Syndrome differentiation and treatment is the core of SESD. Using NLP to mine syndrome differentiation could help reveal the internal relationships within syndrome differentiation and provide a basis for artificial intelligence to learn syndrome differentiation.
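The digitization and keyword-extraction steps named in the Methods can be illustrated with a minimal sketch. It is not the authors' code: the toy English tokens, the stop-word list, and all function names are hypothetical. The abstract does not say which keyword-weighting scheme was used, so a smoothed TF-IDF (one common variant) is assumed here. A real Chinese corpus would also need a word segmenter before this stage.

```python
import math
from collections import Counter

# Hypothetical toy "chapters", already segmented into tokens.
# These are illustrative values, not data from the paper.
chapters = {
    "deficiency": ["qi_deficiency", "fatigue", "and",
                   "yang_deficiency", "cold", "qi_deficiency"],
    "excess": ["blood_stasis", "pain", "and",
               "qi_stagnation", "pain", "blood_stasis"],
}
STOP_WORDS = {"and"}  # assumed stop-word list for the cleaning step


def clean(tokens):
    """Information cleaning: drop stop words."""
    return [t for t in tokens if t not in STOP_WORDS]


def build_dictionary(docs):
    """Dictionary compilation: map each distinct term to a column index."""
    vocab = sorted({t for tokens in docs.values() for t in tokens})
    return {term: idx for idx, term in enumerate(vocab)}


def document_term_matrix(docs, dictionary):
    """One row of raw term counts per document."""
    matrix = {}
    for name, tokens in docs.items():
        row = [0] * len(dictionary)
        for term, count in Counter(tokens).items():
            row[dictionary[term]] = count
        matrix[name] = row
    return matrix


def tfidf_keywords(docs, top_n=2):
    """Rank terms per document by smoothed TF-IDF (assumed weighting)."""
    n_docs = len(docs)
    doc_freq = Counter(t for tokens in docs.values() for t in set(tokens))
    result = {}
    for name, tokens in docs.items():
        counts = Counter(tokens)
        scores = {
            t: (c / len(tokens))
            * (math.log((1 + n_docs) / (1 + doc_freq[t])) + 1)
            for t, c in counts.items()
        }
        # Sort by descending score, then alphabetically to break ties.
        ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
        result[name] = ranked[:top_n]
    return result


cleaned = {name: clean(tokens) for name, tokens in chapters.items()}
dictionary = build_dictionary(cleaned)
matrix = document_term_matrix(cleaned, dictionary)
keywords = tfidf_keywords(cleaned)
```

Here `matrix` holds one count vector per chapter and `keywords` lists the highest-weighted terms per chapter, the kind of per-chapter keywords and weights the Results describe. A word cloud would typically be drawn from the same term weights.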