Text Mining and Analysis of Treatise on Febrile Diseases Based on Natural Language Processing

Authors: Kai Zhao; Na Shi; Zhen Sa; Hua-Xing Wang; Chun-Hua Lu; Xiao-Ying Xu

Affiliations: School of Traditional Chinese Medicine, Beijing University of Chinese Medicine; School of Life Science, Beijing University of Chinese Medicine, Beijing 100029, China

Publication: World Journal of Traditional Chinese Medicine (世界中医药杂志(英文))

Year, volume, issue: 2020, Vol. 6, No. 1

Pages: 67-73

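Subject classification: 100208 [Medicine - Clinical Laboratory Diagnostics]; 1002 [Medicine - Clinical Medicine]; 081203 [Engineering - Computer Application Technology]; 08 [Engineering]; 0835 [Engineering - Software Engineering]; 10 [Medicine]; 0812 [Engineering - Computer Science and Technology (engineering or science degrees may be awarded)]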

Keywords: Knowledge discovery; natural language processing; text mining; traditional Chinese medicine literature; Treatise on Febrile Diseases

Abstract: Objective: This paper uses natural language processing (NLP) technology to analyze and process the text of Treatise on Febrile Diseases (TFDs) in order to find important information, as an attempt to apply NLP to text mining of traditional Chinese medicine (TCM) literature. Materials and Methods: Working in Python, the experiment used NLP toolkits such as Jieba, nltk, gensim, and sklearn, combined with Excel and Word. The text of TFDs was cleaned, segmented, and stripped of stop words in sequence; word frequency statistics and analysis, keyword extraction, named entity recognition (NER), and other operations were then performed, and text similarity was calculated last. Results: Jieba can accurately identify the herbal names in TFDs. Word frequency statistics based on the word segmentation show that warm therapy is an important treatment in TFDs, that Guizhi decoction is the main prescription, and that five core decoctions can be identified. Keyword extraction based on the term frequency-inverse document frequency algorithm is *** accuracy of NER in TFDs is about 86%. With similarity calculated by a latent semantic indexing model, Understanding of Synopsis of Golden Chamber (SGC) is much more similar to SGC than to TFDs. The results meet expectations. Conclusions: This work lays a research foundation for applying NLP to text mining of unstructured TCM literature. Combined with deep learning technology, NLP, as an important branch of artificial intelligence, will have broader application prospects in text mining of TCM literature, in the construction of TCM knowledge graphs, and in TCM knowledge services.
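The pipeline named in the abstract (stop-word removal, word frequency statistics, TF-IDF keyword extraction, text similarity) can be illustrated with a minimal sketch. This is not the authors' code. It assumes a hypothetical three-document toy corpus of whitespace-separated English words in place of Jieba-segmented Chinese text, uses only the Python standard library instead of Jieba, gensim, or sklearn, applies a smoothed idf chosen for illustration, and compares documents by cosine similarity over TF-IDF vectors rather than by the latent semantic indexing model the paper reports. All names (`DOCS`, `STOPWORDS`, `tokenize`, `tfidf_vectors`, `cosine`, `top_keywords`) are illustrative.

```python
import math
from collections import Counter

# Hypothetical toy corpus: English stand-ins for already-segmented text.
DOCS = {
    "doc_a": "guizhi decoction treats the exterior with warm herbs",
    "doc_b": "guizhi decoction with warm herbs harmonizes the exterior",
    "doc_c": "the kidney stores essence and governs water",
}
# Hypothetical stop-word list (the paper's own list is not given).
STOPWORDS = {"the", "with", "and"}


def tokenize(text):
    """Lowercase, split on whitespace, and drop stop words."""
    return [w for w in text.lower().split() if w not in STOPWORDS]


def tfidf_vectors(docs):
    """Return {doc name: {term: weight}} with tf = count / doc length
    and a smoothed idf = ln((1 + N) / (1 + df)) + 1 (an assumption)."""
    tokens = {name: tokenize(text) for name, text in docs.items()}
    n_docs = len(tokens)
    doc_freq = Counter(t for words in tokens.values() for t in set(words))
    vectors = {}
    for name, words in tokens.items():
        counts = Counter(words)
        vectors[name] = {
            term: (count / len(words))
            * (math.log((1 + n_docs) / (1 + doc_freq[term])) + 1)
            for term, count in counts.items()
        }
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse {term: weight} vectors."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def top_keywords(vector, k=3):
    """Return the k highest-weighted terms, ties broken alphabetically."""
    ranked = sorted(vector.items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, _ in ranked[:k]]


# Corpus-wide word frequencies after stop-word removal.
WORD_FREQ = Counter(w for text in DOCS.values() for w in tokenize(text))
VECTORS = tfidf_vectors(DOCS)

if __name__ == "__main__":
    print(WORD_FREQ.most_common(3))
    print(top_keywords(VECTORS["doc_a"]))
    print(cosine(VECTORS["doc_a"], VECTORS["doc_b"]))
    print(cosine(VECTORS["doc_a"], VECTORS["doc_c"]))
```

In this toy corpus the two documents that share prescription vocabulary score higher against each other than against the unrelated one, which mirrors the kind of comparison the abstract describes between related and unrelated texts. For real Chinese text, a segmenter would replace the whitespace split.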
