Dynamic Clustering and Resource Allocation Using Deep Reinforcement Learning for Smart Duplex Networ

Abstract:

Ultra-dense networks (UDNs) with smart-duplex (SD), which allows base stations (BSs) to switch flexibly between half-duplex (HD) and full-duplex (FD) modes, are expected to support high-density transmissions. However, handling a large network centrally is costly, while distributed processing may suffer severe performance loss because of the complicated intercell interference in UDNs. This article aims to balance the system performance and clustering cost of SD UDNs by dividing all small cells into several clusters. A Markov decision process (MDP) problem is formulated to maximize the average weighted sum of network throughput and clustering cost over all clusters. To solve this problem approximately, we first adopt an affinity propagation method to determine the number of clusters and the center of each cluster. Then, by treating small cells as agents, we prove that the original MDP problem is equivalent to a multiagent MDP that maximizes the average reward of all small cells. Next, a multiagent deep reinforcement learning (DRL) algorithm is proposed to jointly perform dynamic clustering of noncenter small cells, resource allocation, and duplex mode selection. Simulation results show that SD has clear advantages over both HD and FD in UDNs, and that the proposed multiagent DRL outperforms other clustering schemes under the considered scenarios.
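As a rough illustration of the objective described above, the following minimal sketch computes a weighted sum of throughput and clustering cost per cluster and averages it over clusters for a single time step. The function names, the weight `cost_weight`, the sign convention (cost entering as a penalty), and the numeric values are all assumptions made for illustration; the abstract does not specify them, and averaging over time is omitted.

```python
from statistics import mean


def cluster_reward(throughput, clustering_cost, cost_weight):
    """Weighted sum for one cluster.

    Assumption: the clustering cost enters as a penalty, scaled by
    cost_weight. The abstract does not state the exact form.
    """
    return throughput - cost_weight * clustering_cost


def average_reward(clusters, cost_weight):
    """Average the per-cluster weighted sum over all clusters.

    `clusters` is a list of (throughput, clustering_cost) pairs
    for one time step (hypothetical data layout).
    """
    return mean(cluster_reward(t, c, cost_weight) for t, c in clusters)


# Hypothetical values, chosen only to make the arithmetic easy to follow.
clusters = [(12.0, 2.0), (8.0, 1.0), (10.0, 3.0)]
result = average_reward(clusters, cost_weight=0.5)
print(result)  # (11.0 + 7.5 + 8.5) / 3 = 9.0
```

A larger `cost_weight` in this sketch favors cheaper clusterings over higher throughput, which is the trade-off the clustering is meant to balance.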