Abstract:
This article proposes a method for automatically classifying videos into human emotional categories by imitating the human neural process in which audiovisual stimulation elicits emotion-specific electroencephalography (EEG) characteristics. In the proposed method, emotional features are first extracted from EEG signals recorded while a subject watches a video, using a sample-attention-based deep neural network encoder. Next, the direct mapping between these emotional EEG features and the audiovisual features extracted from the corresponding video content is learned by a deep belief network. In practical application, the EEG features corresponding to an input video are generated automatically by the learned mapping, without measuring human EEG signals, and are then fed to a segment-attention-based deep neural network decoder for emotion classification of the video. The experimental results demonstrate that the proposed method significantly outperforms existing methods, with an average accuracy of about 95% in classifying videos into four emotional classes. For automated emotional video classification, our approach based on artificial emotional EEG features achieves performance comparable to models that use directly measured EEG and can be generalized to various audiovisual data sets.
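To make the three-stage pipeline concrete, the following is a minimal sketch under illustrative assumptions: it uses PyTorch, the module names (`EEGEncoder`, `AVToEEGMapper`, `EmotionDecoder`) and dimensions are hypothetical, the sample- and segment-attention mechanisms are reduced to simple soft-attention pooling, and the deep belief network is approximated by a plain MLP rather than stacked RBMs. It is not the authors' implementation, only an outline of the data flow.

```python
# Hypothetical sketch of the pipeline: EEG encoder (training only),
# audiovisual-to-EEG mapper, and segment-attention emotion decoder.
import torch
import torch.nn as nn


class AttentionPool(nn.Module):
    """Soft attention pooling over a sequence axis (samples or segments)."""
    def __init__(self, dim):
        super().__init__()
        self.score = nn.Linear(dim, 1)

    def forward(self, x):                          # x: (batch, seq, dim)
        w = torch.softmax(self.score(x), dim=1)    # attention weights over seq
        return (w * x).sum(dim=1)                  # (batch, dim)


class EEGEncoder(nn.Module):
    """Sample-attention encoder: EEG samples -> emotional EEG feature.
    Used during training to produce targets for the mapper."""
    def __init__(self, eeg_dim, feat_dim):
        super().__init__()
        self.proj = nn.Linear(eeg_dim, feat_dim)
        self.pool = AttentionPool(feat_dim)

    def forward(self, eeg):                        # eeg: (batch, n_samples, eeg_dim)
        return self.pool(torch.relu(self.proj(eeg)))


class AVToEEGMapper(nn.Module):
    """Stand-in for the deep belief network that maps audiovisual features
    to artificial emotional EEG features (plain MLP for illustration)."""
    def __init__(self, av_dim, feat_dim):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(av_dim, 256), nn.ReLU(),
            nn.Linear(256, feat_dim),
        )

    def forward(self, av):                         # av: (batch, av_dim)
        return self.net(av)


class EmotionDecoder(nn.Module):
    """Segment-attention decoder: per-segment EEG features -> emotion class."""
    def __init__(self, feat_dim, n_classes=4):
        super().__init__()
        self.pool = AttentionPool(feat_dim)
        self.cls = nn.Linear(feat_dim, n_classes)

    def forward(self, feats):                      # feats: (batch, n_segments, feat_dim)
        return self.cls(self.pool(feats))


# Inference on a new video: no EEG is measured; the mapper generates
# artificial emotional EEG features from audiovisual features alone.
mapper = AVToEEGMapper(av_dim=128, feat_dim=64)
decoder = EmotionDecoder(feat_dim=64, n_classes=4)
av_segments = torch.randn(1, 10, 128)              # 10 video segments of AV features
fake_eeg = mapper(av_segments.flatten(0, 1)).view(1, 10, 64)
logits = decoder(fake_eeg)                          # (1, 4) emotion class scores
```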