Abstract:
Attention mechanisms improve hyperspectral image (HSI) classification accuracy by enhancing salient information. However, existing HSI attention models are driven by advances in computer vision, so they neither fully exploit the spectral–spatial structure prior of HSIs nor effectively refine features from a global perspective. In this article, we propose a unified attention paradigm (UAP) that defines the attention mechanism as a general three-stage process: optimizing feature representations, strengthening information interaction, and emphasizing meaningful information. Under this paradigm, we design a novel efficient spectral–spatial attention module (ESSAM) that adaptively adjusts feature responses along the spectral and spatial dimensions at an extremely low parameter cost. Specifically, we construct two blocks: a parameter-free spectral attention block that employs multiscale structured encodings and similarity calculations to perform global cross-channel interactions, and a memory-enhanced spatial attention block that captures the key semantics of images stored in a learnable memory unit and models global spatial relationships by constructing semantic-to-pixel dependencies. ESSAM takes full account of the spatial distribution and low-dimensional characteristics of HSIs, and offers better interpretability and lower complexity. We develop an efficient spectral–spatial attention network (ESSAN) based on a dense convolutional network and evaluate it on three real hyperspectral datasets. The experimental results demonstrate that the proposed ESSAM yields larger accuracy improvements than advanced attention models.
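To make the three-stage paradigm concrete, the following is a minimal sketch of a parameter-free channel reweighting step. It is not the ESSAM implementation. The function name `spectral_attention`, the (mean, standard deviation) channel descriptor, the cosine similarity, and the sigmoid gate are all illustrative assumptions standing in for the multiscale structured encodings and similarity calculations, which the abstract does not specify.

```python
import math


def spectral_attention(features):
    """Hypothetical parameter-free channel reweighting (illustration only).

    `features` is a list of C channels, each a flat list of pixel responses.
    Returns the reweighted channels and the per-channel gates.
    """
    # Stage 1 (optimize representation): encode each channel by its global
    # mean and standard deviation. Assumed stand-in for structured encodings.
    descriptors = []
    for channel in features:
        mean = sum(channel) / len(channel)
        var = sum((v - mean) ** 2 for v in channel) / len(channel)
        descriptors.append((mean, math.sqrt(var)))

    # Stage 2 (information interaction): average cosine similarity between
    # each channel descriptor and all descriptors, with no learned weights.
    def cosine(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        norm = math.hypot(*a) * math.hypot(*b)
        return dot / norm if norm > 0 else 0.0

    scores = [
        sum(cosine(d, other) for other in descriptors) / len(descriptors)
        for d in descriptors
    ]

    # Stage 3 (emphasis): squash scores to (0, 1) and rescale each channel.
    gates = [1.0 / (1.0 + math.exp(-s)) for s in scores]
    reweighted = [[g * v for v in channel] for g, channel in zip(gates, features)]
    return reweighted, gates
```

Calling `spectral_attention([[1.0, 2.0, 3.0], [2.0, 4.0, 6.0]])` returns two channels of the same shape as the input, each scaled by a gate between 0 and 1. Because no weights are learned, the step adds no parameters, which is the property the sketch is meant to illustrate.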