Abstract:
Multi-object tracking (MOT) is an essential task in computer vision. With the rapid development of deep learning in recent years, MOT has improved considerably. However, challenges remain, such as sensitivity to occlusion, instability under varying lighting conditions, and a lack of robustness to deformable objects, all of which cause incorrect temporal associations. To address these challenges, which are common to most existing trackers, this paper proposes a tracklet booster (TBooster) algorithm that corrects the association errors produced by existing trackers. TBooster corrects association errors in two steps: it first splits tracklets at potential ID-change positions and then connects multiple tracklets into one if they belong to the same object. Accordingly, TBooster consists of two components, i.e., Splitter and Connector. In Splitter, an architecture of stacked temporal dilated convolution blocks predicts the splitting positions, using a label smoothing strategy with adaptive Gaussian kernels. In Connector, a multi-head self-attention-based encoder produces tracklet embeddings, which are then used to connect tracklets into full tracks. We conduct extensive experiments on the MOT17 and MOT20 benchmark datasets and achieve promising results. Combined with the proposed tracklet booster, existing trackers achieve large improvements in IDF1 score, which demonstrates the effectiveness of TBooster.
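
To make the label smoothing idea in Splitter more concrete, the following is a minimal sketch of how per-frame split targets could be softened with a Gaussian kernel centered on each ID-change position. It is an illustration under stated assumptions, not the formulation used in the paper: the function name `smoothed_split_labels`, the parameters `scale` and `min_sigma`, and the rule that adapts the kernel width to the shorter neighbouring segment are all hypothetical choices made for this example.

```python
import math


def smoothed_split_labels(length, split_positions, scale=0.1, min_sigma=1.0):
    """Return soft per-frame split targets in [0, 1] for one tracklet.

    length          -- number of frames in the tracklet
    split_positions -- frame indices where the identity changes
    scale, min_sigma -- hypothetical knobs controlling the kernel width
    """
    labels = [0.0] * length
    positions = sorted(split_positions)
    # Tracklet ends act as outer boundaries for the first and last segment.
    boundaries = [0] + positions + [length - 1]
    for i, pos in enumerate(positions, start=1):
        # Assumed adaptation rule: the kernel widens with the shorter
        # neighbouring segment, so short segments get sharper targets.
        left = pos - boundaries[i - 1]
        right = boundaries[i + 1] - pos
        sigma = max(min_sigma, scale * min(left, right))
        for t in range(length):
            weight = math.exp(-((t - pos) ** 2) / (2.0 * sigma ** 2))
            # Overlapping kernels are merged by taking the maximum.
            labels[t] = max(labels[t], weight)
    return labels


labels = smoothed_split_labels(length=50, split_positions=[20])
```

In this sketch the target peaks at 1.0 on the annotated split frame and decays smoothly on both sides, so a prediction that is off by a frame or two is penalized less than one that is far away.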