Marginal Attacks of Generating Adversarial Examples for Spam Filtering
Author affiliation: Cyberspace Institute of Advanced Technology, Guangzhou University
Publication: Chinese Journal of Electronics (电子学报, English edition)
Year, volume, issue: 2021, Vol. 30, No. 4
Pages: 595-602
Subject classification: 08 [Engineering]; 080402 [Engineering: Measurement Technology and Instruments]; 0804 [Engineering: Instrument Science and Technology]
Funding: supported by the National Natural Science Foundation of China (No. 61902082, No. U20B2046), the Guangdong Province Key Research and Development Plan (No. 2019B010136003), the Guangdong Higher Education Innovation Group (No. 2020KCXTD007), the Guangzhou Higher Education Innovation Group (No. 202032854), and the Guangdong Province Universities and Colleges Pearl River Scholar Funded Scheme (2019)
Keywords: Spam filtering; Marginal attack; Adversarial example; Machine learning
Abstract: Digital information has been used in many areas and has spread widely in the Internet era because of its convenience. However, many ill-disposed attackers, such as spammers, take advantage of this convenience to send unsolicited information, such as advertisements, frauds, and pornographic messages, to mislead users, and this might cause severe *** many spam filters have been proposed to detect spam, they are vulnerable and can be misled by carefully crafted adversarial *** this paper, we propose marginal attack methods for generating such adversarial examples to fool a naive Bayesian spam ***, we propose three methods to select sensitive words from a sentence and add them at the end of the sentence. Through extensive experiments, we show that the generated adversarial examples can greatly reduce the filter's detection accuracy; e.g., by adding only one word, the accuracy can be reduced from 93.6% to 55.8%. Furthermore, we evaluate the transferability of the generated adversarial examples against other traditional filters, such as those based on logistic regression, decision trees, and linear support vector machines. The evaluation results show that these filters' accuracy is also reduced dramatically; in particular, the decision tree based filter's accuracy drops from 100% to 1.51% when only one word is inserted.
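The general idea of appending words to a message to shift a naive Bayes filter's decision can be illustrated with a minimal sketch. The abstract does not specify the paper's three word-selection methods, so the selection rule below (pick the vocabulary words whose log-likelihood ratio most favors the non-spam class) is an assumption made purely for illustration, as are the toy corpus and the helper names `train_nb`, `spam_score`, and `append_attack`.

```python
import math
from collections import Counter

def train_nb(docs, labels, alpha=1.0):
    """Train a multinomial naive Bayes model with Laplace smoothing.
    Labels: 1 = spam, 0 = ham."""
    vocab = {w for d in docs for w in d.split()}
    counts = {0: Counter(), 1: Counter()}
    n_docs = Counter(labels)
    for d, y in zip(docs, labels):
        counts[y].update(d.split())
    model = {"vocab": vocab, "prior": {}, "loglik": {}}
    for y in (0, 1):
        total = sum(counts[y].values()) + alpha * len(vocab)
        model["prior"][y] = math.log(n_docs[y] / len(docs))
        model["loglik"][y] = {
            w: math.log((counts[y][w] + alpha) / total) for w in vocab
        }
    return model

def spam_score(model, text):
    """Log-odds of spam versus ham; a positive value means 'spam'."""
    score = model["prior"][1] - model["prior"][0]
    for w in text.split():
        if w in model["vocab"]:  # unknown words are ignored
            score += model["loglik"][1][w] - model["loglik"][0][w]
    return score

def append_attack(model, text, n_words=1):
    """Append the n_words vocabulary words that most favor ham.
    Hypothetical selection rule, not the paper's method."""
    def ratio(w):
        return model["loglik"][1][w] - model["loglik"][0][w]
    # Sort by ratio, breaking ties alphabetically for determinism.
    chosen = sorted(model["vocab"], key=lambda w: (ratio(w), w))[:n_words]
    return text + " " + " ".join(chosen)

# Toy corpus (illustrative only).
docs = [
    "win free money now",
    "free prize claim now",
    "meeting schedule for project",
    "lunch with project team",
    "project report attached",
]
labels = [1, 1, 0, 0, 0]

model = train_nb(docs, labels)
original = "win free money now"
adversarial = append_attack(model, original, n_words=1)
before = spam_score(model, original)
after = spam_score(model, adversarial)
```

In this toy setting a single appended word lowers the spam log-odds without altering the original message text; whether the predicted label actually flips depends on the corpus and on how many words are appended.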