An Aggregated Convolutional Transformer Based on Slices and Channels for Multivariate Time Series Classification

Abstract:

Convolutional neural networks (CNNs) have achieved remarkable success thanks to their excellent local feature extraction ability. The Transformer has likewise developed markedly in recent years, achieving excellent representation of global features and attracting considerable attention. For multivariate time series classification, most previous networks have relied on convolutional and long short-term memory (LSTM) structures. This paper proposes a combination of Transformer-encoder and convolutional structures, which we refer to as the Multivariate time series classification Convolutional Transformer Network (MCTNet). The complementary strengths of convolution and self-attention are exploited to capture latent deep features of multivariate time series more accurately. Because the Transformer is considered data-hungry, we pair it with the inductive bias of the CNN to address this problem: early features are extracted through convolutional layers, and the both squeeze-and-excitation convolution encoder (BC-Encoder) structure is proposed. Attentional prototype learning is also used to mitigate the limited-label problem. Moreover, a new network design that focuses on slices and channels is proposed, countering the assumption that using a Transformer necessarily requires many parameters. Experimental results on 26 datasets from the well-known UEA multivariate time series archive show that our model outperforms most state-of-the-art models.
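
As a rough illustration of this hybrid design, the sketch below combines a convolutional stem, a squeeze-and-excitation block, and a Transformer encoder for multivariate time series classification in PyTorch. It is a minimal sketch under our own assumptions: the layer names, shapes, and hyperparameters are illustrative, not the paper's exact BC-Encoder or MCTNet configuration.

```python
# Minimal sketch (NOT the authors' exact BC-Encoder/MCTNet): a convolutional
# stem with a squeeze-and-excitation (SE) block feeding a Transformer encoder.
# All hyperparameters below are illustrative assumptions.
import torch
import torch.nn as nn


class SEBlock(nn.Module):
    """Squeeze-and-excitation: reweight channels using global context."""
    def __init__(self, channels: int, reduction: int = 4):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),
        )

    def forward(self, x):                      # x: (batch, channels, time)
        w = self.fc(x.mean(dim=-1))            # squeeze: average over time
        return x * w.unsqueeze(-1)             # excite: per-channel scaling


class ConvTransformerClassifier(nn.Module):
    """Conv stem extracts early local features; the Transformer encoder
    models global dependencies; a linear head classifies the pooled output."""
    def __init__(self, in_channels: int, n_classes: int,
                 d_model: int = 64, n_heads: int = 4, n_layers: int = 2):
        super().__init__()
        self.stem = nn.Sequential(
            nn.Conv1d(in_channels, d_model, kernel_size=7, padding=3),
            nn.BatchNorm1d(d_model),
            nn.ReLU(inplace=True),
            SEBlock(d_model),
        )
        layer = nn.TransformerEncoderLayer(
            d_model=d_model, nhead=n_heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=n_layers)
        self.head = nn.Linear(d_model, n_classes)

    def forward(self, x):                      # x: (batch, channels, time)
        z = self.stem(x).transpose(1, 2)       # -> (batch, time, d_model)
        z = self.encoder(z)                    # global self-attention
        return self.head(z.mean(dim=1))        # mean-pool over time


# Usage: a batch of 8 series with 6 variables, 128 time steps, 5 classes.
model = ConvTransformerClassifier(in_channels=6, n_classes=5)
logits = model(torch.randn(8, 6, 128))         # -> (8, 5)
```

The design choice mirrors the abstract's argument: the convolutional stem supplies the locality inductive bias that makes the data-hungry Transformer trainable on smaller datasets, while the SE block reweights variable channels before global self-attention is applied.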