Deep reinforcement learning-based resource allocation for D2D communications in heterogeneous cellular networks
Author affiliations: School of Information Science and Engineering, Shandong Normal University, Jinan 250358, China; School of Information and Communication, Guilin University of Electronic Technology, Guilin 541004, China
Publication: Digital Communications and Networks (English edition)
Year, volume, issue: 2022, Vol. 8, No. 5
Pages: 834-842
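Subject classification: 0810 [Engineering: Information and Communication Engineering]; 080904 [Engineering: Electromagnetic Field and Microwave Technology]; 0808 [Engineering: Electrical Engineering]; 0809 [Engineering: Electronic Science and Technology (engineering or science degree may be awarded)]; 0839 [Engineering: Cyberspace Security]; 08 [Engineering]; 080402 [Engineering: Measurement and Testing Technology and Instruments]; 0804 [Engineering: Instrument Science and Technology]; 0835 [Engineering: Software Engineering]; 081001 [Engineering: Communication and Information Systems]; 0812 [Engineering: Computer Science and Technology (engineering or science degree may be awarded)]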
Funding: The work presented in this paper was supported in part by the National Natural Science Foundation of China (Nos. 61801278, 61972237 and 61901247), Shandong Provincial scientific research programs in colleges and universities (J18KA310), the Key Laboratory of Cognitive Radio and Information Processing, Ministry of Education (Guilin University of Electronic Technology) (CRKL190205), and the Shandong Provincial Natural Science Foundation of China (No. ZR2019MF017).
Keywords: Deep reinforcement learning; Heterogeneous cellular networks; Device-to-device communication; Millimeter wave communication; Resource allocation
Abstract: Device-to-Device (D2D) communication-enabled Heterogeneous Cellular Networks (HCNs) have emerged as a promising technology for satisfying the growing demands of smart mobile devices in fifth-generation mobile *** The introduction of Millimeter Wave (mm-wave) communications into D2D-enabled HCNs allows higher system capacity and user data rates to be ***, interference among cellular and D2D links remains severe due to spectrum *** In this paper, to guarantee user Quality of Service (QoS) requirements and effectively manage the interference among users, we investigate the joint optimization problem of mode selection and channel allocation in D2D-enabled HCNs with mm-wave and cellular *** optimization problem is formulated as the maximization of the system sum-rate under QoS constraints of both cellular and D2D users in *** To solve it, a distributed multi-agent deep Q-network algorithm is proposed, where the reward function is redefined according to the optimization *** In addition, to reduce signaling overhead, a partial information sharing strategy that does not observe global information is proposed for D2D agents to select the optimal mode and channel through *** results illustrate that the proposed joint optimization algorithm converges well and achieves better system performance than other existing schemes.
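The abstract does not give the reward definition or the network details, so the following is only a minimal sketch of how a sum-rate objective with QoS constraints might be turned into a reward signal for an agent that picks a (mode, channel) pair. The function names `shannon_rate`, `qos_reward` and `greedy_action`, the fixed penalty of -1.0, and the Mbit/s scaling are all illustrative assumptions and are not taken from the paper.

```python
import math


def shannon_rate(bandwidth_hz, sinr):
    """Achievable rate in bit/s from the Shannon formula B * log2(1 + SINR)."""
    return bandwidth_hz * math.log2(1.0 + sinr)


def qos_reward(rates, min_rates, penalty=-1.0, scale=1e6):
    """Hypothetical reward (an assumption, not the paper's definition).

    Returns the sum-rate in Mbit/s when every link meets its minimum
    rate, and a fixed penalty otherwise.
    """
    if all(r >= m for r, m in zip(rates, min_rates)):
        return sum(rates) / scale
    return penalty


def greedy_action(q_values):
    """Index of the largest Q-value; each index stands for one (mode, channel) pair."""
    return max(range(len(q_values)), key=q_values.__getitem__)
```

In this sketch each D2D agent would evaluate `greedy_action` on its own Q-value estimates and receive `qos_reward` as feedback; how the agents share partial information is not specified in the abstract and is left out here.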