Adversarial Active Learning for Named Entity Recognition in Cybersecurity
作者机构:Zhengzhou Institute of Information Science and TechnologyZhengzhou450001China College of Letters and ScienceUniversity of Wisconsin-MadisonMadison53706USA
出 版 物:《Computers, Materials & Continua》 (计算机、材料和连续体(英文))
年 卷 期:2021年第66卷第1期
页 面:407-420页
核心收录:
学科分类:0831[工学-生物医学工程(可授工学、理学、医学学位)] 0808[工学-电气工程] 0809[工学-电子科学与技术(可授工学、理学学位)] 08[工学] 0805[工学-材料科学与工程(可授工学、理学学位)] 0701[理学-数学] 0812[工学-计算机科学与技术(可授工学、理学学位)] 0801[工学-力学(可授工学、理学学位)]
基 金:the National Natural Science Foundation of China undergrant 61501515.
主 题:Adversarial learning active learning named entity recognition dynamic attention mechanism
摘 要:Owing to the continuous barrage of cyber threats,there is a massive amount of cyber threat intelligence.However,a great deal of cyber threat intelligence come from textual sources.For analysis of cyber threat intelligence,many security analysts rely on cumbersome and time-consuming manual efforts.Cybersecurity knowledge graph plays a significant role in automatics analysis of cyber threat intelligence.As the foundation for constructing cybersecurity knowledge graph,named entity recognition(NER)is required for identifying critical threat-related elements from textual cyber threat intelligence.Recently,deep neural network-based models have attained very good results in NER.However,the performance of these models relies heavily on the amount of labeled data.Since labeled data in cybersecurity is scarce,in this paper,we propose an adversarial active learning framework to effectively select the informative samples for further annotation.In addition,leveraging the long short-term memory(LSTM)network and the bidirectional LSTM(BiLSTM)network,we propose a novel NER model by introducing a dynamic attention mechanism into the BiLSTM-LSTM encoderdecoder.With the selected informative samples annotated,the proposed NER model is retrained.As a result,the performance of the NER model is incrementally enhanced with low labeling cost.Experimental results show the effectiveness of the proposed method.