Abstract:
To solve a decentralized radio resource management problem in a 5G vehicular network, we propose a novel resource allocation algorithm based on multiagent deep reinforcement learning (MARL). Each vehicle acts as an individual agent that selects a unique combination of transport block (TB) and transmission power to broadcast periodic packets. Each agent explores the environment and collects observations that are later used to find the best combination of TB and transmission power. We apply an actor-critic reinforcement learning technique to choose the optimal TB for each agent. To address the nonstationarity of the multiagent setting, we use centralized training, which allows all agents to share their observations through the critic networks. The information shared through the critic networks helps each agent learn the policies of the other agents. During decentralized execution, each agent uses only its actor network and its local observation to find the most appropriate TB for the given level of transmission power. During training, the actions taken by each actor are evaluated by the corresponding critic, which maps the given state to Q-values for all feasible actions. Our method achieves an 18% higher packet reception ratio than a spectrum allocation scheme based on a double DQN, and a 33% higher reward than the previous state-of-the-art method, which is also based on MARL.
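
The split between centralized training and decentralized execution described above can be illustrated with a minimal sketch. It is not the authors' implementation: the linear actor and critic, the agent count, the observation size, the number of TBs, and all class and variable names are assumptions chosen only to show which inputs each component sees. No learning update is included.

```python
import numpy as np

rng = np.random.default_rng(0)

N_AGENTS = 3  # hypothetical number of vehicles
OBS_DIM = 4   # hypothetical size of each local observation
N_TB = 5      # hypothetical number of selectable transport blocks


def softmax(x):
    """Numerically stable softmax over a 1-D array."""
    e = np.exp(x - np.max(x))
    return e / e.sum()


class Actor:
    """Maps one agent's local observation to a distribution over TB indices."""

    def __init__(self):
        self.w = rng.normal(scale=0.1, size=(OBS_DIM, N_TB))

    def policy(self, local_obs):
        return softmax(local_obs @ self.w)

    def act(self, local_obs):
        # Greedy choice, as one might use at execution time (an assumption).
        return int(np.argmax(self.policy(local_obs)))


class CentralCritic:
    """Maps the observations of all agents to Q-values over one agent's TBs."""

    def __init__(self):
        self.w = rng.normal(scale=0.1, size=(N_AGENTS * OBS_DIM, N_TB))

    def q_values(self, all_obs):
        return all_obs.reshape(-1) @ self.w


actors = [Actor() for _ in range(N_AGENTS)]
critics = [CentralCritic() for _ in range(N_AGENTS)]

# One row of (random, placeholder) observations per agent.
all_obs = rng.normal(size=(N_AGENTS, OBS_DIM))

# Centralized training: each critic sees every agent's observation.
q = [critics[i].q_values(all_obs) for i in range(N_AGENTS)]

# Decentralized execution: each agent uses only its actor and local observation.
actions = [actors[i].act(all_obs[i]) for i in range(N_AGENTS)]
```

In this sketch the critics are needed only while training, so at execution time each vehicle would carry just its own `Actor` and would not need the other agents' observations.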