Positional Encoding: Improving Class-Imbalanced Motorcycle Helmet Use Classification

Abstract:

Recent advances in the automated detection of motorcycle riders’ helmet use have enabled road safety actors to process large-scale video data efficiently and with high accuracy. To distinguish drivers from passengers in helmet use, the most straightforward approach is to train a multi-class classifier in which each class corresponds to a specific combination of rider positions and the helmet use of each rider. However, such a strategy results in a long-tailed data distribution, with critically few samples for a number of uncommon classes. In this paper, we propose a novel approach to address this limitation. Letting n be the maximum number of riders a motorcycle can hold, we encode the helmet use on a motorcycle as a vector of 2n bits, where the first n bits denote whether each encoded position is occupied by a rider, and the latter n bits denote whether the rider in the corresponding position wears a helmet. With this helmet use positional encoding, we propose a deep learning model that builds on an existing image classification architecture. The model simultaneously trains 2n binary classifiers, which yields more balanced training samples. The method is simple to implement and requires no hyperparameter tuning. Experimental results demonstrate that our approach outperforms state-of-the-art approaches by 1.9% in accuracy.
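
As an illustration of the 2n-bit encoding described above, the following minimal sketch builds and inverts such a vector in plain Python. The function names `encode_helmet_use` and `decode_helmet_use`, the choice of n = 3, the ordering of positions, and the convention of setting the helmet bit to 0 for an empty position are all assumptions made for this example; they are not taken from the paper.

```python
from typing import List, Optional, Sequence


def encode_helmet_use(riders: Sequence[Optional[bool]]) -> List[int]:
    """Encode one motorcycle as a 2n-bit vector (hypothetical helper).

    riders[i] is None if position i is empty, True if the rider in
    position i wears a helmet, and False if that rider does not.
    The first n bits mark occupied positions; the last n bits mark
    helmet use (assumed to be 0 for empty positions).
    """
    occupied = [0 if r is None else 1 for r in riders]
    helmet = [1 if r is True else 0 for r in riders]
    return occupied + helmet


def decode_helmet_use(bits: Sequence[int]) -> List[Optional[bool]]:
    """Invert encode_helmet_use: recover per-position rider status."""
    n = len(bits) // 2
    return [None if bits[i] == 0 else bool(bits[n + i]) for i in range(n)]


# Example with n = 3 (assumed): the first position has a helmeted rider,
# the second has a rider without a helmet, and the third is empty.
vector = encode_helmet_use([True, False, None])
print(vector)  # [1, 1, 0, 1, 0, 0]
```

Under this representation, each of the 2n bits can serve as the target of one binary classifier, so a rare combination of riders still contributes training signal to every bit it shares with more common combinations.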