Dynamic Spectrum Anti-Jamming with Distributed Learning and Transfer Learning
Author affiliations: Key Laboratory of Dynamic Cognitive System of Electromagnetic Spectrum Space, Ministry of Industry and Information Technology, Nanjing University of Aeronautics and Astronautics, Nanjing 210016, China; The 723 Institute of CSSC, Yangzhou 225001, China; School of Electronic Information and Communications, Huazhong University of Science and Technology, Wuhan 430074, China; School of Electronic and Information Engineering, South China University of Technology, Guangzhou 510641, China
Publication: China Communications (中国通信, English edition)
Year, volume, issue: 2023, Vol. 20, No. 12
Pages: 52-65
Core indexing:
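Subject classification: 11 [Military Science]; 080904 [Engineering - Electromagnetic Field and Microwave Technology]; 12 [Management]; 0809 [Engineering - Electronic Science and Technology (degrees may be conferred in Engineering or Science)]; 08 [Engineering]; 110503 [Military Science - Military Communications]; 0810 [Engineering - Information and Communication Engineering]; 1105 [Military Science - Military Command]; 1104 [Military Science - Tactics]; 1201 [Management - Management Science and Engineering (degrees may be conferred in Management or Engineering)]; 082601 [Engineering - Weapon Systems and Utilization Engineering]; 081104 [Engineering - Pattern Recognition and Intelligent Systems]; 081105 [Engineering - Navigation, Guidance and Control]; 0826 [Engineering - Armament Science and Technology]; 081001 [Engineering - Communication and Information Systems]; 0835 [Engineering - Software Engineering]; 081002 [Engineering - Signal and Information Processing]; 0811 [Engineering - Control Science and Engineering]; 0812 [Engineering - Computer Science and Technology (degrees may be conferred in Engineering or Science)]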
Funding: Partially supported by the National Natural Science Foundation of China under Grants U2001210, 61901216, and 61827801, and by the Natural Science Foundation of Jiangsu Province under Grant BK20190400
Keywords: A3C; anti-jamming; reinforcement learning; spectrum; transfer learning; wireless system
Abstract: Physical-layer security issues in wireless systems have attracted great *** this paper, we investigate the spectrum anti-jamming (AJ) problem for data transmissions between *** fast-changing physical-layer jamming attacks in the time/frequency domain, frequency resources have to be configured for devices in advance with unknown jamming patterns (*** time-frequency distribution of the jamming signals) to avoid jamming signals emitted by malicious *** process can be formulated as a Markov decision process and solved by reinforcement learning (RL). Unfortunately, state-of-the-art RL methods may put pressure on a system that has limited computing *** a result, we propose a novel RL method that integrates the asynchronous advantage actor-critic (A3C) approach with the kernel method to learn a flexible frequency pre-configuration ***, in the presence of time-varying jamming patterns, the traditional AJ strategy cannot adapt to the dynamic interference *** handle this issue, we design a kernel-based feature transfer learning method to adjust the structure of the policy function *** results reveal that our proposed approach can significantly outperform various baselines in terms of the average normalized throughput and the convergence speed of policy learning.
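Illustrative note: the Markov decision process view mentioned in the abstract can be pictured with a toy example. The sketch below is not the paper's kernel-based A3C method; it swaps in plain tabular Q-learning, and it assumes a hypothetical sweep jammer that steps through four channels in a fixed order. The state is the channel jammed in the current slot, the action is the channel pre-configured for the next slot, and the reward is 1 when that channel turns out to be jam-free. All names and parameter values are assumptions made for illustration.

```python
import random

N_CHANNELS = 4  # hypothetical number of channels, chosen for illustration


def next_jammed(jammed):
    """Assumed sweep jammer: it moves to the next channel every slot."""
    return (jammed + 1) % N_CHANNELS


def train(steps=20000, alpha=0.1, gamma=0.5, epsilon=0.2, seed=0):
    """Tabular Q-learning; q[state][action] estimates the discounted return."""
    rng = random.Random(seed)
    q = [[0.0] * N_CHANNELS for _ in range(N_CHANNELS)]
    state = 0  # channel jammed in the current slot
    for _ in range(steps):
        # Epsilon-greedy choice of the channel to pre-configure for the next slot.
        if rng.random() < epsilon:
            action = rng.randrange(N_CHANNELS)
        else:
            action = max(range(N_CHANNELS), key=lambda a: q[state][a])
        new_state = next_jammed(state)
        # Reward 1 if the pre-configured channel is not jammed in the next slot.
        reward = 1.0 if action != new_state else 0.0
        target = reward + gamma * max(q[new_state])
        q[state][action] += alpha * (target - q[state][action])
        state = new_state
    return q


def greedy_throughput(q, slots=100):
    """Fraction of slots in which the greedy policy avoids the jammer."""
    state, clean = 0, 0
    for _ in range(slots):
        action = max(range(N_CHANNELS), key=lambda a: q[state][a])
        state = next_jammed(state)
        clean += int(action != state)
    return clean / slots


if __name__ == "__main__":
    q_table = train()
    print("normalized throughput:", greedy_throughput(q_table))
```

A table like this only works for a fixed jamming pattern on a handful of channels. The abstract's stated motivation for kernel-based policy functions and feature transfer is precisely the case where the pattern is unknown and changes over time.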