Abstract:
Sentiment analysis is the task of detecting people's opinions in text using natural language processing techniques. It is critical in helping businesses improve their strategy and better understand customer feedback on their products. Recently, researchers have shown that deep learning models, namely convolutional neural networks (CNNs), recurrent neural networks (RNNs), and contextualized transformer-based word embeddings, give promising results for extracting sentiment from text. However, each approach has limitations. A bidirectional RNN processes the sequence in both directions to better capture long-term dependencies, whereas a CNN has the benefit of extracting high-level features but may not model sequential correlations efficiently. In addition, transformer-based word embeddings require considerable computational resources to fine-tune and can overfit on small datasets. To address these issues, we propose in this work a combination of different RNN models [long short-term memory (LSTM), bidirectional LSTM, and gated recurrent unit (GRU)] and a CNN, using different word embeddings (MultiFiT, XLNet, and CamemBERT). The experimental results show that the combination of GRU and CNN with XLNet clearly improves upon the state of the art on three French datasets: 1) French Amazon Customer Reviews, 2) AlloCiné Dataset, and 3) French Twitter Sentiment Analysis, with accuracies of 96.5%, 90.1%, and 89.6%, respectively.