Deployment of Unmanned Aerial Vehicles in Next-Generation Wireless Communication Network Using Multi-Agent Reinforcement Learning
Pau, Giovanni
2024-01-01
Abstract
To address the challenges posed by a large number of disaster-affected users and the difficulty of scaling centralized algorithms when emergency communication services must be restored rapidly, the paper proposes a distributed intent-based optimization architecture built on multi-agent reinforcement learning. The approach is designed to handle differences in, and the dynamics of, user services. In the network feature layer, a distributed K-sums clustering algorithm accounts for variations in user services. Each UAV base station autonomously makes minimal adjustments to the local network structure based on user requirements, and selects the user features at the cluster center as the input state of the multi-agent reinforcement learning neural network. In the trajectory regulation layer, the paper introduces a multi-agent maximum entropy reinforcement learning algorithm (MASAC). Each UAV base station, acting as an intelligent node, controls its own flight trajectory within a “distributed training – distributed execution” framework. The paper incorporates techniques such as ensemble learning and curriculum learning to improve training stability and convergence speed. Simulation results show that the proposed distributed K-sums clustering algorithm outperforms the traditional K-means algorithm in load efficiency and cluster balance. In addition, the MASAC-based UAV base station trajectory control algorithm significantly reduces communication interruptions, improves network spectral efficiency, and outperforms existing reinforcement learning methods.
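For orientation, the maximum entropy objective that soft actor-critic style methods optimize can be sketched as follows. This is the standard form from the general reinforcement learning literature, not an equation taken from this record. The agent index $i$, the local observation $o_{i,t}$, the reward $r_i$, the discount factor $\gamma$, and the temperature $\alpha$ are notation assumed here for illustration only.

```latex
% Assumed notation (not from the record): agent i, policy \pi_i,
% local observation o_{i,t}, action a_{i,t}, reward r_i,
% discount factor \gamma, entropy temperature \alpha.
\begin{equation}
  J(\pi_i) \;=\; \mathbb{E}_{\tau \sim \pi}
  \left[ \sum_{t=0}^{T} \gamma^{t}
    \Bigl( r_i(s_t, a_t)
      + \alpha \, \mathcal{H}\bigl(\pi_i(\cdot \mid o_{i,t})\bigr) \Bigr)
  \right],
  \qquad
  \mathcal{H}\bigl(\pi_i(\cdot \mid o)\bigr)
  \;=\; -\,\mathbb{E}_{a \sim \pi_i(\cdot \mid o)}
  \bigl[ \log \pi_i(a \mid o) \bigr].
\end{equation}
```

The entropy term rewards each agent for keeping its policy stochastic, which encourages exploration. A larger $\alpha$ weights exploration more heavily relative to the task reward.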