Abstract:
State-of-the-art (SOTA) methods for single image super-resolution often exploit guidance from a gradient prior. The fusion of gradient guidance is typically implemented by channel-wise concatenation followed by a convolutional layer. However, because convolutional kernels are shared across spatial positions, they cannot adaptively tune the effect of gradient guidance at each feature position. To resolve this problem, a novel network module is proposed that simulates the traditional Joint Trilateral Filter (JTF) by extending its domain from pixels to features. Moreover, to improve efficiency and flexibility, the JTF kernel-generating functions for image features and gradient features (e.g., the exponential functions in the traditional JTF) are explicitly learned, rather than individual kernel weights. Based on the proposed JTF modules, this paper follows the gradient-guided framework, which infers high-resolution (HR) image features and HR gradient features simultaneously in two parallel branches. Specifically, by treating image features and gradient features as cross guidance for each other, the proposed JTF modules adaptively adjust the fusion patterns of local features in a bi-directional manner. In this way, the quality of the image features and the gradient features is alternately enhanced. Compared with SOTA methods, the proposed JTF-SISR shows improvements across multiple upsampling scales and degradation modes on five synthetic datasets (Set5, Set14, B100, Urban100, and Manga109) and one real-world dataset (RealSRSet).
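To make the contrast concrete, below is a minimal PyTorch sketch, not the authors' implementation: all module names and the exact kernel-prediction form are assumptions inferred from the abstract. It contrasts the baseline concat-conv fusion, whose kernels are shared across spatial positions, with a JTF-style fusion whose per-position kernels are predicted from the target and guidance features by a learned function.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ConcatConvFusion(nn.Module):
    """Baseline fusion: channel-wise concat + conv with spatially shared kernels."""
    def __init__(self, channels):
        super().__init__()
        self.conv = nn.Conv2d(2 * channels, channels, kernel_size=3, padding=1)

    def forward(self, feat, guide):
        return self.conv(torch.cat([feat, guide], dim=1))

class JTFFusion(nn.Module):
    """Hypothetical JTF-style fusion: a small network predicts a k*k kernel at
    every spatial position from the target and guidance features, so the effect
    of the guidance can vary per position."""
    def __init__(self, channels, k=3):
        super().__init__()
        self.k = k
        # Learned kernel-generating function, replacing the fixed exponentials
        # of the traditional JTF (assumed architecture).
        self.kernel_net = nn.Sequential(
            nn.Conv2d(2 * channels, channels, 3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, k * k, 3, padding=1),
        )

    def forward(self, feat, guide):
        b, c, h, w = feat.shape
        # Per-position kernel weights conditioned on both inputs, normalized
        # over the k*k kernel taps.
        kernels = torch.softmax(
            self.kernel_net(torch.cat([feat, guide], dim=1)), dim=1
        )  # (b, k*k, h, w)
        # Gather k*k neighborhoods of the target features and apply the kernels.
        patches = F.unfold(feat, kernel_size=self.k, padding=self.k // 2)
        patches = patches.view(b, c, self.k * self.k, h, w)
        return (patches * kernels.unsqueeze(1)).sum(dim=2)  # (b, c, h, w)

# Bi-directional cross guidance, as described in the abstract: each branch
# filters its own features while the other branch serves as the guide.
img_fuse, grad_fuse = JTFFusion(64), JTFFusion(64)
img_feat = torch.randn(1, 64, 32, 32)
grad_feat = torch.randn(1, 64, 32, 32)
img_out = img_fuse(img_feat, grad_feat)    # gradient features guide image features
grad_out = grad_fuse(grad_feat, img_feat)  # image features guide gradient features
```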