Recently, with the increasing complexity of collaboration among multiplex Unmanned Aerial Vehicles (multi-UAVs) in dynamic task environments, multi-UAV systems have shown new characteristics: inter-coupling among multiplex groups and intra-correlation within groups. However, previous studies have often overlooked the structural impact of dynamic risks on agents across multiplex UAV groups, a critical issue for modern multi-UAV communication. To address this problem, we integrate the influence of dynamic risks on agents among multiplex UAV group structures into a multi-UAV task migration problem and formulate it as a partially observable Markov game. We then propose a Hybrid Attention Multi-agent Reinforcement Learning (HAMRL) algorithm, which uses attention structures to learn the dynamic characteristics of the task environment and integrates hybrid attention mechanisms to establish efficient intra- and inter-group communication aggregation for information extraction and group collaboration. Experimental results show that, in this comprehensive and challenging model, our algorithm significantly outperforms state-of-the-art algorithms in convergence speed and overall performance, owing to the rational design of its communication mechanisms.
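The abstract does not spell out how the hybrid attention is built, so the following is only a minimal sketch of plain scaled dot-product attention, the generic building block such communication aggregation usually starts from. It is not the HAMRL architecture. The function name `attention_aggregate` and the toy vectors are assumptions made for illustration: one agent's query is scored against the keys of its neighbours, and their messages (values) are averaged with the resulting softmax weights.

```python
import math


def attention_aggregate(query, keys, values):
    """Aggregate neighbour messages with scaled dot-product attention.

    query  : list of floats, the receiving agent's query vector
    keys   : list of key vectors, one per neighbour
    values : list of message vectors, one per neighbour
    Returns (aggregated_message, attention_weights).
    Hypothetical helper for illustration; not the HAMRL implementation.
    """
    d = len(query)
    # Similarity of the query to each neighbour key, scaled by sqrt(d).
    scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(d) for key in keys]
    # Numerically stable softmax over the neighbours.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    # Weighted sum of the neighbour messages, dimension by dimension.
    dim = len(values[0])
    aggregated = [sum(w * v[j] for w, v in zip(weights, values)) for j in range(dim)]
    return aggregated, weights
```

An intra-group and an inter-group variant could in principle reuse the same routine with different neighbour sets, but that split is an assumption here rather than something the abstract states.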
To guarantee the heterogeneous delay requirements of diverse vehicular services, it is necessary to design a fully cooperative policy for both Vehicle to Infrastructure (V2I) and Vehicle to Vehicle (V2V) *** paper investigates the reduction of the delay in edge information sharing for V2V links while satisfying the delay requirements of the V2I ***, a mean delay minimization problem and a maximum individual delay minimization problem are formulated to improve the global network performance and to ensure fairness for individual users, respectively. A multi-agent reinforcement learning framework is designed to solve these two problems, where a new reward function is proposed to evaluate the utilities of the two optimization objectives in a unified ***, a proximal policy optimization approach is proposed to enable each V2V user to learn its policy using the shared global network *** effectiveness of the proposed approach is finally validated by comparing the obtained results with those of other baseline approaches through extensive simulation experiments.
To support popular Internet of Things (IoT) applications such as virtual reality and mobile games, edge computing provides a front-end distributed computing archetype of centralized cloud computing with low latency and distributed data ***, it is challenging for multiple users to offload their computation tasks because they compete for spectrum and computation as well as Radio Access Technologies (RAT) *** this paper, we investigate the computation offloading mechanism of multiple selfish users with resource allocation in IoT edge computing networks by formulating it as a stochastic *** user is a learning agent that observes its local network environment to learn optimal decisions on either local computing or edge computing, with the goal of minimizing long-term system cost by choosing its transmit power level, RAT and sub-channel without knowing any information about the other *** users' decisions are coupled at the gateway, we define the reward function of each user by considering the aggregated effect of other ***, a multi-agent reinforcement learning framework is developed to solve the game with the proposed Independent Learners based multi-agent Q-learning (IL-based MA-Q) *** demonstrate that the proposed IL-based MA-Q algorithm is feasible for solving the formulated problem and is more energy efficient, without extra cost on channel estimation at the centralized ***, compared with the other three benchmark algorithms, it has better system cost performance and achieves distributed computation offloading.
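As a rough illustration of the independent-learners idea described above, the sketch below shows one tabular Q-learning agent that keeps its own Q-table and updates it only from its own observation and reward, with no access to other users' choices. It is a generic textbook Q-learning update, not the paper's IL-based MA-Q algorithm. The class name `IndependentQLearner`, the hyperparameter defaults, and the encoding of states and actions are all assumptions for this example; in the paper's setting an action would correspond to a choice of power level, RAT and sub-channel, and the reward would be a negative cost.

```python
import random
from collections import defaultdict


class IndependentQLearner:
    """One agent's tabular Q-learning, oblivious to the other agents.

    Hypothetical illustration; names and defaults are assumptions.
    """

    def __init__(self, n_actions, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
        self.n_actions = n_actions
        self.alpha = alpha      # learning rate
        self.gamma = gamma      # discount factor
        self.epsilon = epsilon  # exploration probability
        self.rng = random.Random(seed)
        # Q-table: state -> list of action values, created lazily at zero.
        self.q = defaultdict(lambda: [0.0] * n_actions)

    def act(self, state):
        """Epsilon-greedy action selection on the agent's own Q-table."""
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(self.n_actions)
        values = self.q[state]
        return max(range(self.n_actions), key=values.__getitem__)

    def update(self, state, action, reward, next_state):
        """Standard Q-learning update using only local information."""
        target = reward + self.gamma * max(self.q[next_state])
        self.q[state][action] += self.alpha * (target - self.q[state][action])
```

In a multi-user simulation, each user would hold its own `IndependentQLearner` instance; the coupling between users enters only through the reward each one receives from the environment.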
The smart grid uses demand side management technology to motivate energy users to cut demand during peak power consumption periods, which greatly improves the operating efficiency of the power grid. However, as the number of energy users participating in the smart grid continues to increase, the demand side management strategy of an individual agent is greatly affected by the dynamic strategies of other agents. In addition, existing demand side management methods, which need to obtain users' power consumption information, seriously threaten users' privacy. To address the dynamic issue in the multi-microgrid demand side management model, a novel multi-agent reinforcement learning method based on the centralized training and decentralized execution paradigm is presented to mitigate the damage to training performance caused by the instability of training experience. To protect users' privacy, we design a neural network with fixed parameters as the encryptor to transform the users' energy consumption information from low-dimensional to high-dimensional, and theoretically prove that the proposed encryptor-based privacy preserving method will not affect the convergence property of the reinforcement learning algorithm. We verify the effectiveness of the proposed demand side management scheme with real-world energy consumption data from Xi'an, Shaanxi, China. Simulation results show that the proposed method can effectively improve users' satisfaction while reducing the bill payment compared with traditional reinforcement learning (RL) methods (i.e., deep Q learning (DQN), deep deterministic policy gradient (DDPG), QMIX and multi-agent deep deterministic policy gradient (MADDPG)). The results also demonstrate that the proposed privacy protection scheme can effectively protect users' privacy while ensuring the performance of the algorithm.
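The encryptor idea above can be pictured with a small sketch: a single-layer network whose weights are drawn once and never trained, mapping a low-dimensional consumption vector to a higher-dimensional one. This is an assumption-laden toy, not the paper's network. The function name `make_fixed_encryptor`, the layer size, the `tanh` activation, and the seeded Gaussian initialisation are all hypothetical choices, and the sketch makes no privacy or convergence claim of its own.

```python
import math
import random


def make_fixed_encryptor(in_dim, out_dim, seed=0):
    """Build a frozen random projection from in_dim to out_dim features.

    Hypothetical illustration: weights are sampled once from a seeded
    generator and are never updated afterwards.
    """
    rng = random.Random(seed)
    weights = [[rng.gauss(0.0, 1.0) for _ in range(in_dim)] for _ in range(out_dim)]
    biases = [rng.gauss(0.0, 1.0) for _ in range(out_dim)]

    def encrypt(x):
        if len(x) != in_dim:
            raise ValueError("expected a vector of length %d" % in_dim)
        # Affine map followed by tanh, applied row by row.
        return [
            math.tanh(sum(w * v for w, v in zip(row, x)) + b)
            for row, b in zip(weights, biases)
        ]

    return encrypt
```

For example, `make_fixed_encryptor(3, 16)` returns a function that turns a 3-value consumption reading into 16 features; a learning agent would then be trained on those features instead of on the raw readings.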
The recent proliferation of Fifth-Generation (5G) networks and Sixth-Generation (6G) networks has given rise to Vehicular Crowd Sensing (VCS) systems which solve parking collisions by effectively incentivizing vehicle ***, instead of being an isolated module, the incentive mechanism usually interacts with other *** on this, we capture this synergy and propose a Collision-free Parking Recommendation (CPR), a novel VCS system framework that integrates an incentive mechanism, a non-cooperative VCS game, and a multi-agent reinforcement learning algorithm, to derive an optimal parking strategy in real ***, we utilize an LSTM method to predict parking areas roughly for recommendations *** incentive mechanism is designed to motivate vehicle participation by considering dynamically priced parking tasks and social network *** order to cope with stochastic parking collisions, its non-cooperative VCS game further analyzes the uncertain interactions between vehicles in parking *** its multi-agent reinforcement learning algorithm models the VCS campaign as a multi-agent Markov decision process that not only derives the optimal collision-free parking strategy for each vehicle independently, but also proves that the optimal parking strategy for each vehicle is ***, numerical results demonstrate that CPR can accomplish parking tasks at a 99.7% accuracy compared with other baselines, efficiently recommending parking spaces.
Reinforcement learning (RL) techniques are being studied to solve the Demand and Capacity Balancing (DCB) problems to fully exploit their computational performance. A locally generalised multi-agent reinforcement learning (MARL) for real-world DCB problems is *** proposed method can deploy trained agents directly to unseen scenarios in a specific Air Traffic Flow Management (ATFM) region to quickly obtain a satisfactory *** this method, agents of all flights in a scenario form a multi-agent decision-making system based on partial *** trained agent with the customised neural network can be deployed directly on the corresponding flight, allowing it to solve the DCB problem jointly. A cooperation coefficient is introduced in the reward function, which is used to adjust the agent's cooperation preference in a multi-agent system, thereby controlling the distribution of flight delay time allocation. A multi-iteration mechanism is designed for the DCB decision-making framework to deal with problems arising from non-stationarity in MARL and to ensure that all hotspots are *** based on large-scale high-complexity real-world scenarios are conducted to verify the effectiveness and efficiency of the *** a statistical point of view, it is proven that the proposed method is generalised within the scope of the flights and sectors of interest, and its optimisation performance outperforms the standard computer-assisted slot allocation and state-of-the-art RL-based DCB *** sensitivity analysis preliminarily reveals the effect of the cooperation coefficient on delay time allocation.
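The abstract does not give the form of the reward, so the following is only one plausible way a cooperation coefficient could enter it: a convex blend of an agent's own delay and the mean delay over all flights. The function name `cooperative_reward`, the linear blend, and the restriction of the coefficient to the interval [0, 1] are assumptions for this sketch and should not be read as the paper's definition.

```python
def cooperative_reward(own_delay, all_delays, coop):
    """Blend selfish and collective delay penalties.

    own_delay  : delay assigned to this agent's flight
    all_delays : delays of every flight in the scenario (including this one)
    coop       : assumed cooperation coefficient in [0, 1];
                 0 is fully selfish, 1 is fully cooperative
    Returns a reward (higher is better), so delays enter with a minus sign.
    Hypothetical form for illustration only.
    """
    if not 0.0 <= coop <= 1.0:
        raise ValueError("coop must lie in [0, 1]")
    mean_delay = sum(all_delays) / len(all_delays)
    return -((1.0 - coop) * own_delay + coop * mean_delay)
```

Under this assumed form, raising the coefficient makes an agent more willing to accept delay on its own flight when that lowers the average, which is one way the distribution of delay across flights could be steered.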
In multi-agent confrontation scenarios, a single jammer is constrained by its limited performance and is inefficient in practical applications. To cope with these issues, this paper investigates the multi-agent jamming problem in a multi-user scenario, where the coordination between the jammers is considered. Firstly, a multi-agent Markov decision process (MDP) framework is used to model and analyze the multi-agent jamming problem. Secondly, a collaborative multi-agent jamming algorithm (CMJA) based on reinforcement learning is proposed. Finally, an actual intelligent jamming system is designed and built on a software-defined radio (SDR) platform for simulation and platform verification. The simulation and platform verification results show that the proposed CMJA outperforms the independent Q-learning method and provides a better jamming effect.
This paper examines the difficulties of managing distributed power systems, notably due to the increasing use of renewable energy sources, and focuses on voltage control challenges exacerbated by their variable nature in modern power *** tackle the unique challenges of voltage control in distributed renewable energy networks, researchers are increasingly turning towards multi-agent reinforcement learning (MARL). However, MARL raises safety concerns due to the unpredictability in agent actions during their exploration *** unpredictability can lead to unsafe control *** mitigate these safety concerns in MARL-based voltage control, our study introduces a novel approach: Safety-Constrained Multi-Agent Reinforcement Learning (SC-MARL). This approach incorporates a specialized safety constraint module specifically designed for voltage control within the MARL *** module ensures that the MARL agents carry out voltage control actions *** experiments demonstrate that, in the 33-bus, 141-bus, and 322-bus power systems, employing SC-MARL for voltage control resulted in a reduction of the Voltage Out of Control Rate (%***) from 0.43, 0.24, and 2.95 to 0, 0.01, and 0.03, ***, the Reactive Power Loss (Q loss) decreased from 0.095, 0.547, and 0.017 to 0.062, 0.452, and 0.016 in the corresponding systems.
China's natural disaster situation is complex and severe, and large-scale emergencies cause substantial human and material losses. Recognizing the significance of aviation emergency rescue, the state provides strong support for its development. However, China's current aviation emergency rescue system is still under construction and encounters various challenges; one such challenge is to match the dynamically changing multi-point rescue demands with the limited availability of aircraft for dispatch. We propose a dynamic task assignment model and a trainable model framework for aviation emergency rescue based on multi-agent reinforcement learning. Combined with a targeted design, the scheduling matching problem is transformed into a stochastic game process from the rescue location perspective. Subsequently, an optimized strategy model with high robustness can be obtained by solving the training framework. Comparative experiments demonstrate that the proposed model achieves higher assignment benefits by considering the dynamic nature of rescue demands and the limited availability of rescue helicopter crews. Additionally, the model achieves higher task assignment rates and average time satisfaction by assigning tasks in a more efficient and timely manner. The results suggest that the proposed dynamic task assignment model is a promising approach for improving the efficiency of aviation emergency rescue.
In the rapidly evolving landscape of today's digital economy, Financial Technology (Fintech) emerges as a transformative force, propelled by the dynamic synergy between Artificial Intelligence (AI) and Algorithmic *** in-depth investigation delves into the intricacies of merging multi-agent reinforcement learning (MARL) and Explainable AI (XAI) within Fintech, aiming to refine Algorithmic Trading *** meticulous examination, we uncover the nuanced interactions of AI-driven agents as they collaborate and compete within the financial realm, employing sophisticated deep learning techniques to enhance the clarity and adaptability of trading *** AI-infused Fintech platforms harness collective intelligence to unearth trends, mitigate risks, and provide tailored financial guidance, fostering benefits for individuals and enterprises navigating the digital *** research holds the potential to revolutionize finance, opening doors to fresh avenues for investment and asset management in the digital ***, our statistical evaluation yields encouraging results, with metrics such as Accuracy = 0.85, Precision = 0.88, and F1 Score = 0.86, reaffirming the efficacy of our approach within Fintech and emphasizing its reliability and innovative prowess.