Adaptive Cost Volume Representation for Unsupervised High Resolution Stereo Matching

Abstract:

Learning-based stereo matching methods have produced remarkable results in recent years. However, typical supervised methods depend on depth annotations that are costly and time-consuming to obtain. To mitigate this issue, this work proposes a multi-stage unsupervised stereo matching method based on a cascaded Siamese network. To improve the accuracy of disparity estimation, it makes the following contributions. First, sparse cost volumes are constructed to predict a coarse disparity, and an adaptive sampling strategy is developed to dynamically adjust the sampling interval and effectively narrow the disparity search range. Together, the sparse cost construction and the sampling strategy preserve the accuracy of disparity estimation under a limited memory budget. Second, geometric constraints with left and right semantic features are integrated into the loss function to learn the inherent matching correspondences. Third, the information entropy of the probability volume is used to measure the quality of the estimated disparity and serves as a weighting term that guides the photometric loss. Finally, a pixel-wise disparity refinement module is designed to achieve high-resolution disparity estimation at the final stage. Experimental results on datasets including SceneFlow and KITTI show the effectiveness and practicability of our method with limited memory consumption and running time.
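
To make the entropy-guided weighting concrete, the following is a minimal sketch in plain Python and not the implementation used in this work. It assumes that each pixel has a probability distribution over its disparity candidates, that confidence is taken as one minus the normalized Shannon entropy, and that the photometric loss is a confidence-weighted mean of per-pixel errors. The function names `normalized_entropy` and `entropy_weighted_photometric_loss` and the exact weighting form are hypothetical choices made for illustration.

```python
import math


def normalized_entropy(probs):
    """Shannon entropy of one pixel's disparity distribution, scaled to [0, 1].

    0 means a single sharp peak (confident match); 1 means a uniform
    distribution (ambiguous match).
    """
    n = len(probs)
    if n < 2:
        return 0.0
    entropy = -sum(p * math.log(p) for p in probs if p > 0.0)
    return entropy / math.log(n)


def entropy_weighted_photometric_loss(prob_volume, photometric_errors):
    """Confidence-weighted mean of per-pixel photometric errors.

    prob_volume: one disparity distribution per pixel.
    photometric_errors: one reconstruction error per pixel.
    Assumption: weight = 1 - normalized entropy, so ambiguous pixels
    contribute less to the loss.
    """
    weights = [1.0 - normalized_entropy(p) for p in prob_volume]
    total = sum(weights)
    if total == 0.0:
        return 0.0  # every pixel is maximally ambiguous
    weighted = sum(w * e for w, e in zip(weights, photometric_errors))
    return weighted / total


# Two pixels: a confident match and an ambiguous one.
prob_volume = [[1.0, 0.0, 0.0, 0.0], [0.25, 0.25, 0.25, 0.25]]
errors = [0.2, 0.9]
loss = entropy_weighted_photometric_loss(prob_volume, errors)
```

In this toy case the ambiguous pixel receives a weight close to zero, so the loss is dominated by the confident pixel. A practical model would compute the same quantities over tensors on the GPU.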