Divergence-Driven Consistency Training for Semi-Supervised Facial Age Estimation

Abstract:

Facial age estimation has attracted considerable attention owing to its broad range of applications. However, reliable age estimation remains difficult because training data with accurate age labels are scarce. Exploiting unlabeled data with conventional semi-supervised methods appears to be a natural solution, but it yields limited performance gains while significantly increasing training time. To tackle these problems, we present a Divergence-driven Consistency Training (DCT) method that improves both efficiency and performance. Following the idea of pseudo-labeling and consistency regularization, we assign pseudo labels predicted by a teacher model to unlabeled samples and then train a student model on both labeled and unlabeled samples under consistency regularization. On top of this framework, we propose two main improvements. The first is an Efficient Sample Selection (ESS) strategy, which uses a Divergence Score to select effective samples from massive unlabeled images, reducing training time and improving efficiency. The second is an Identity Consistency (IC) regularization term, an additional loss that exploits the strong dependence of aging traits on identity. Moreover, we propose Local Prediction (LP), a plug-and-play component that captures local semantics. Extensive experiments on multiple age benchmarks, including CACD, Morph II, MIVIA, and Chalearn LAP 2015, show that DCT significantly outperforms state-of-the-art approaches.
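The divergence-based selection idea can be sketched as follows. This is a minimal illustration only: the abstract does not define the Divergence Score, so we assume here a symmetric KL divergence between the teacher's and student's predicted age distributions, and the function names (`divergence_score`, `select_effective_samples`) are illustrative, not from the paper.

```python
import numpy as np

def softmax(logits):
    # Turn raw scores over discrete age bins into a probability distribution.
    e = np.exp(logits - logits.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def divergence_score(p_teacher, p_student, eps=1e-12):
    # Symmetric KL divergence between teacher and student age distributions.
    # (Assumed definition; the paper's exact Divergence Score may differ.)
    kl_ts = np.sum(p_teacher * np.log((p_teacher + eps) / (p_student + eps)), axis=-1)
    kl_st = np.sum(p_student * np.log((p_student + eps) / (p_teacher + eps)), axis=-1)
    return 0.5 * (kl_ts + kl_st)

def select_effective_samples(teacher_logits, student_logits, k):
    # ESS-style selection: keep the k unlabeled samples on which the teacher
    # and student disagree most -- these carry the most training signal,
    # so the student is trained on far fewer unlabeled images.
    scores = divergence_score(softmax(teacher_logits), softmax(student_logits))
    return np.argsort(scores)[::-1][:k]

# Toy example: 4 unlabeled samples, 5 age bins.
rng = np.random.default_rng(0)
teacher = rng.normal(size=(4, 5))
student = teacher.copy()
student[2, 0] += 3.0       # make the student disagree strongly on sample 2
idx = select_effective_samples(teacher, student, k=2)
# Sample 2 has the highest divergence, so it is selected first.
```

In practice such a score would be computed once per epoch over the unlabeled pool, and only the selected subset passed to the consistency loss, which is where the efficiency gain over training on all unlabeled images comes from.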