Abstract:
Dynamic resource allocation (DRA) is a key technology for improving network performance in resource-limited multibeam satellite (MBS) systems. The aim is to find a policy that maximizes the expected long-term resource utilization. Existing iterative metaheuristic DRA optimization algorithms are impractical because of their high computational complexity. To address the problems of unknown dynamics and prohibitive computation, a deep reinforcement learning-based framework (DRLF) is proposed for DRA problems in MBS systems. A novel image-like tensor reformulation of the system environment is adopted to extract spatial and temporal traffic features. A use case of dynamic channel allocation in DRLF is simulated, and the results show the effectiveness of the proposed DRLF in time-varying scenarios.
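
To make the image-like tensor reformulation more concrete, the following is a minimal sketch under stated assumptions, not the encoding used in this work. It assumes that each beam maps to one cell of a 2D grid mirroring its position in the coverage area, that the cell value is the beam's traffic demand, and that successive time steps are stacked like image channels. The function name `build_state_tensor` and the inputs `demand_history`, `beam_cells`, and `grid_shape` are hypothetical names introduced for illustration only.

```python
from typing import Dict, List, Tuple

Grid = List[List[float]]


def build_state_tensor(
    demand_history: List[Dict[int, float]],
    beam_cells: Dict[int, Tuple[int, int]],
    grid_shape: Tuple[int, int],
) -> List[Grid]:
    """Stack per-beam traffic demand into a (time, rows, cols) nested list.

    Assumed (hypothetical) layout: one grid cell per beam, one frame per
    time step. Beams missing from a time step are treated as zero demand.
    """
    rows, cols = grid_shape
    tensor: List[Grid] = []
    for demand in demand_history:
        # Start from an empty frame, then write each beam's demand
        # into the cell that represents its position.
        frame = [[0.0] * cols for _ in range(rows)]
        for beam_id, (r, c) in beam_cells.items():
            frame[r][c] = demand.get(beam_id, 0.0)
        tensor.append(frame)
    return tensor


# Toy example: 4 beams on a 2x2 grid, observed over 3 time steps.
beam_cells = {0: (0, 0), 1: (0, 1), 2: (1, 0), 3: (1, 1)}
demand_history = [
    {0: 1.0, 1: 0.0, 2: 2.0, 3: 0.5},
    {0: 1.5, 1: 0.2, 2: 1.0, 3: 0.5},
    {0: 2.0, 1: 0.4, 2: 0.0, 3: 0.5},
]
state = build_state_tensor(demand_history, beam_cells, (2, 2))
```

In a layout of this kind, neighbouring cells carry the spatial structure of the traffic and the frame axis carries its temporal evolution, which is what lets a convolutional network treat the system state like a multi-channel image.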