Slovak Dataset for Multilingual Question Answering


admin


Feb 3, 2024 - 17:02


Abstract:

SK-QuAD is the first manually annotated question answering dataset in Slovak. It consists of more than 91k factual questions and answers from various fields. Each answer is marked in the corresponding paragraph. The dataset also contains negative examples in the form of “unanswered questions” and “plausible answers”. It is published free of charge for scientific use. We aim to contribute to the creation of Slovak or multilingual systems that generate an answer to a question posed in natural language. The paper provides an overview of existing question answering datasets, describes the annotation process, and statistically analyzes the created content. The dataset expands the possibilities for training and evaluating multilingual language models. Experiments show that models trained on the dataset achieve state-of-the-art results for Slovak and improve question answering for other languages in zero-shot learning. We compare the effect of machine-translated data with that of manually annotated data. Additional data improve modeling for low-resource languages.
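To make the annotation scheme concrete, the following is a minimal sketch of how one answerable and one unanswerable record might be represented. The abstract does not specify a file layout, so the SQuAD 2.0-like structure, the class and field names (`Span`, `Question`, `answer_start`, `plausible_answers`), and the English example text are all assumptions made for illustration, not content taken from the dataset.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Span:
    """A marked piece of text inside a paragraph (hypothetical layout)."""
    text: str
    answer_start: int  # character offset of the span in the paragraph


@dataclass
class Question:
    """One annotated question (field names are assumptions, not the real schema)."""
    question: str
    answers: List[Span] = field(default_factory=list)
    plausible_answers: List[Span] = field(default_factory=list)

    @property
    def is_unanswerable(self) -> bool:
        # A negative example carries no true answer, only plausible ones.
        return not self.answers


def span_matches(paragraph: str, span: Span) -> bool:
    """Check that the marked span really occurs at the stated offset."""
    end = span.answer_start + len(span.text)
    return paragraph[span.answer_start:end] == span.text


# Invented example paragraph, not taken from the dataset.
paragraph = "Bratislava is the capital of Slovakia. The city lies on the Danube river."

answerable = Question(
    question="What is the capital of Slovakia?",
    answers=[Span("Bratislava", paragraph.index("Bratislava"))],
)

# The paragraph says nothing about Košice, so the question has no answer;
# "Danube" only looks like one, which makes it a plausible answer.
unanswerable = Question(
    question="Which river flows through Košice?",
    plausible_answers=[Span("Danube", paragraph.index("Danube"))],
)
```

Under this assumed layout, a question answering model would be expected to return the marked span for the first question and to abstain on the second, even though the paragraph contains a span of the right type.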




