PPO Based PDACB Traffic Control Scheme for Massive IoV Communications

Abstract:

Traffic control is a key means of alleviating congestion in machine-type communications (MTC) for the internet of vehicles (IoV). Many traffic control schemes have been studied, such as the access class barring (ACB) scheme and the back-off (BO) scheme. However, most existing studies do not consider the dynamics of traffic or the heterogeneous requirements of different IoV applications, both of which are significant for random access resource allocation. In this paper, we consider a hybrid scheme that combines the priority dynamic ACB (PDACB) scheme with the BO scheme. The IoV devices are classified according to their delay characteristics, with delay-sensitive devices assigned high priority. The objective is to maximize the number of successfully transmitted packets, subject to a success rate constraint, by adjusting the ACB factors. The proximal policy optimization (PPO) algorithm, a deep reinforcement learning (DRL) method, is used because it supports a continuous action space and can solve for the optimal ACB factors without estimating the backlog of nodes. Quick convergence is achieved by carefully designing the state space, action space, and reward. The access capability of the PDACB traffic control scheme is verified by simulations.
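To make the barring mechanism concrete, the following is a minimal sketch of one random-access slot under priority-specific ACB factors. It is an illustration under stated assumptions, not the scheme evaluated in this paper: the function name `simulate_slot`, the class labels, the one-factor-per-priority-class layout, and the rule that a preamble succeeds only when exactly one device selects it are all assumptions made for the example. The back-off of barred or collided devices and the PPO agent that would choose the factors are not modeled.

```python
import random
from collections import Counter


def simulate_slot(backlog, acb_factors, num_preambles, rng):
    """Simulate one random-access slot (hypothetical helper, illustration only).

    backlog:       dict mapping priority class -> number of waiting devices
    acb_factors:   dict mapping priority class -> barring factor in [0, 1]
    num_preambles: number of preambles available in the slot
    rng:           random.Random instance used for all draws
    Returns a dict mapping priority class -> number of successful devices.
    """
    choices = []  # (priority class, chosen preamble) for devices that pass ACB
    for cls, count in backlog.items():
        for _ in range(count):
            # ACB check: the device proceeds only if its uniform draw
            # falls below the factor assigned to its priority class.
            if rng.random() < acb_factors[cls]:
                choices.append((cls, rng.randrange(num_preambles)))

    # Assumed collision model: a preamble succeeds only if exactly
    # one device selected it in this slot.
    usage = Counter(preamble for _, preamble in choices)
    success = {cls: 0 for cls in backlog}
    for cls, preamble in choices:
        if usage[preamble] == 1:
            success[cls] += 1
    return success


if __name__ == "__main__":
    rng = random.Random(0)
    # Assumed example values: a larger factor for the delay-sensitive class.
    print(simulate_slot({"high": 20, "low": 80}, {"high": 0.9, "low": 0.2}, 54, rng))
```

In this sketch, a larger factor for the high-priority class lets delay-sensitive devices pass the barring check more often, while a smaller factor for the low-priority class limits how many of its devices contend for preambles in the same slot.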