Abstract:
Edge computing is promising to provide computational resources for connected vehicles. Resource demands for edge servers vary due to vehicle mobility. It is then challenging to reserve edge servers to meet variable demands. Existing schemes rely on statistical information of resource demands to determine edge server reservation. They are infeasible in practice, since the reservation based on statistics cannot adapt to time-varying demands. In this paper, a spatio-temporal reinforcement learning scheme called DeepReserve is developed to learn variable demands and then reserve edge servers accordingly. DeepReserve is adapted from the deep deterministic policy gradient algorithm with two major enhancements. First, by observing that the spatio-temporal correlation in vehicle traffic leads to the same property in resource demands of CVs, a convolutional LSTM network is employed to encode resource demands observed by edge servers for inference of future demands. Second, an action amender is designed to make sure an action does not violate spatio-temporal correlation. We also design a new training method, i.e., DR-Train, to stabilize the training procedure. DeepReserve is evaluated via experiments based on real-world datasets. Results show it achieves better performance than state-of-the-art approaches that require accurate demand information.