Joint Hand Detection and Rotation Estimation Using CNN in Matlab

Joint Hand Detection and Rotation Estimation Using CNN in Matlab

Abstract:

Hand detection is essential for many hand related tasks, e.g., recovering hand pose and understanding gesture. However, hand detection in uncontrolled environments is challenging due to the flexibility of wrist joint and cluttered background. We propose a convolutional neural network (CNN), which formulates in-plane rotation explicitly to solve hand detection and rotation estimation jointly. Our network architecture adopts the backbone of faster R-CNN to generate rectangular region proposals and extract local features. The rotation network takes the feature as input and estimates an in-plane rotation which manages to align the hand, if any in the proposal, to the upward direction. A derotation layer is then designed to explicitly rotate the local spatial feature map according to the rotation network and feed aligned feature map for detection. Experiments show that our method outperforms the state-of-the-art detection models on widely-used benchmarks, such as Oxford and Egohands database. Further analysis show that rotation estimation and classification can mutually benefit each other.