A More Robust Model to Answer Noisy Questions in KBQA in Python

Abstract:

In practical applications, the raw input to a Knowledge-Based Question Answering (KBQA) system varies in form, expression, and source. The input that actually reaches the system may therefore contain errors introduced by noise in the raw data and by processes such as transmission, transformation, and translation. It is thus important to evaluate and improve the robustness of KBQA models to noisy questions. In this paper, we generate 29 datasets of noisy questions from the original SimpleQuestions dataset and propose a KBQA model that is more robust to such questions. Compared with traditional methods, our main contributions are a method for generating datasets of different noisy questions with which to evaluate the robustness of a KBQA model, and a KBQA model that incorporates incremental learning and a Masked Language Model (MLM) into the question answering process. As a result, our model is less affected by different kinds of noise in questions and achieves higher accuracy on the noisy datasets. Experimental results show that our model achieves an average accuracy of 78.1% on these datasets and outperforms the baseline BERT-based model by an average margin of 5.0% at a similar training cost. Further experiments show that our model is compatible with other pre-trained models such as ALBERT and ELECTRA.
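To make the evaluation setup concrete, the following is a minimal Python sketch of how noisy variants of a question set might be generated and how per-dataset accuracies might be averaged. It is an illustration under stated assumptions, not the procedure used in the paper: the two noise types (`char_swap`, `word_drop`), the function names, and the sample accuracies are all hypothetical, and the abstract does not specify which 29 noise types are used.

```python
import random


def swap_adjacent_chars(question, rng):
    """Swap two adjacent characters at a random position (typo-style noise)."""
    if len(question) < 2:
        return question
    i = rng.randrange(len(question) - 1)
    chars = list(question)
    chars[i], chars[i + 1] = chars[i + 1], chars[i]
    return "".join(chars)


def drop_random_word(question, rng):
    """Delete one randomly chosen word (omission-style noise)."""
    words = question.split()
    if len(words) < 2:
        return question
    del words[rng.randrange(len(words))]
    return " ".join(words)


# Hypothetical noise types; the paper's 29 types are not listed in the abstract.
NOISE_FUNCTIONS = {
    "char_swap": swap_adjacent_chars,
    "word_drop": drop_random_word,
}


def make_noisy_datasets(questions, seed=0):
    """Build one noisy copy of the question list per noise type."""
    rng = random.Random(seed)  # fixed seed keeps the datasets reproducible
    return {
        name: [noise(q, rng) for q in questions]
        for name, noise in NOISE_FUNCTIONS.items()
    }


def average_accuracy(per_dataset_accuracy):
    """Unweighted mean of the accuracies measured on each noisy dataset."""
    return sum(per_dataset_accuracy.values()) / len(per_dataset_accuracy)


if __name__ == "__main__":
    datasets = make_noisy_datasets(["who wrote hamlet", "where was einstein born"])
    for name, noisy_questions in datasets.items():
        print(name, noisy_questions)
    # Made-up accuracies, only to show the averaging step.
    print(average_accuracy({"char_swap": 0.5, "word_drop": 1.0}))
```

Each noise function takes the random generator as an argument so that the same seed always yields the same noisy datasets, which keeps comparisons between models fair. The unweighted mean treats every noisy dataset equally; whether the paper weights datasets differently is not stated in the abstract.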