On Deep Recurrent Reinforcement Learning for Active Visual Tracking of Space Noncooperative Objects

Abstract:

Active tracking of space noncooperative objects that relies solely on a vision camera is of great significance for autonomous rendezvous and debris removal. Because the task is a Partially Observable Markov Decision Process (POMDP), this letter proposes a novel deep recurrent neural network architecture, named recurrent and attention module based active visual tracking (RAMAVT). It incorporates a Multi-Head Attention (MHA) module and a Squeeze-and-Excitation (SE) layer, which markedly improve the representational ability of the network at almost no extra computational cost. RAMAVT has been successfully applied to both value-based and policy gradient-based deep reinforcement learning algorithms, and it learns to drive the chasing spacecraft to follow an arbitrary space noncooperative object with high-frequency, near-optimal velocity control commands. Extensive experiments and robustness evaluations on the space non-cooperative object active tracking (SNCOAT) benchmark show that our method outperforms, and is more robust than, other state-of-the-art active visual trackers. In addition, an ablation study and an interpretability analysis demonstrate the validity and rationality of RAMAVT.
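To make the Squeeze-and-Excitation idea mentioned above concrete, the following is a minimal sketch of a generic SE layer in plain Python. It is not the RAMAVT implementation: the function names, the random weights, the 4-channel 3x3 feature map, and the reduction from 4 to 2 hidden units are all assumptions chosen for illustration. It shows the three generic steps: squeeze each channel to a scalar by global average pooling, pass the result through a small two-layer bottleneck ending in a sigmoid, and rescale each channel by its gate.

```python
import math
import random


def squeeze(feature_map):
    """Global average pooling: reduce each channel (H x W) to one scalar."""
    return [sum(sum(row) for row in ch) / (len(ch) * len(ch[0]))
            for ch in feature_map]


def linear(weights, bias, x):
    """Fully connected layer: weights is (out x in), x has length in."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def excite(z, w1, b1, w2, b2):
    """Bottleneck MLP: ReLU hidden layer, then sigmoid gates in (0, 1)."""
    hidden = [max(0.0, h) for h in linear(w1, b1, z)]
    return [1.0 / (1.0 + math.exp(-g)) for g in linear(w2, b2, hidden)]


def se_layer(feature_map, w1, b1, w2, b2):
    """Rescale every channel of the feature map by its learned gate."""
    gates = excite(squeeze(feature_map), w1, b1, w2, b2)
    return [[[g * v for v in row] for row in ch]
            for ch, g in zip(feature_map, gates)]


def random_matrix(rows, cols, rng):
    """Hypothetical stand-in for trained weights."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


# Illustrative sizes only (assumed, not taken from the paper).
rng = random.Random(0)
channels, reduced = 4, 2
fmap = [[[rng.random() for _ in range(3)] for _ in range(3)]
        for _ in range(channels)]
w1, b1 = random_matrix(reduced, channels, rng), [0.0] * reduced
w2, b2 = random_matrix(channels, reduced, rng), [0.0] * channels

out = se_layer(fmap, w1, b1, w2, b2)
```

The only additions beyond the feature map itself are one pooling pass and two small matrix-vector products per forward step, which is consistent with the abstract's remark that the extra computational cost is small.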