UIT-ViCoV19QA: A Dataset for COVID-19 Community-based Question Answering on Vietnamese Language (2209.06668v1)

Published 14 Sep 2022 in cs.CL

Abstract: Over the two years from 2020 to 2021, COVID-19 overwhelmed disease prevention measures in many countries, including Vietnam, and negatively affected many aspects of human life and society. Misleading information and fake news about the pandemic circulating in the community are also serious problems. We therefore present UIT-ViCoV19QA, the first Vietnamese community-based question answering dataset for developing COVID-19 question answering systems. The dataset comprises 4,500 question-answer pairs collected from trusted medical sources, with at least one answer and at most four unique paraphrased answers per question. Along with the dataset, we set up various deep learning models as baselines to assess the quality of the dataset and to establish benchmark results for further research, using commonly used metrics such as BLEU, METEOR, and ROUGE-L. We also illustrate the positive effects of having multiple paraphrased answers in experiments on these models, especially on the Transformer, a dominant architecture in the field.
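
To make the role of multiple paraphrased answers concrete, here is a minimal sketch of scoring one generated answer against several reference answers with ROUGE-L. It is not the authors' evaluation code. The function names, the whitespace tokenization, the balanced F1 form of the score, and the choice to keep the best score across references are all assumptions made for illustration, and the English sentences are invented examples, not dataset entries.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l_f1(candidate, reference):
    """ROUGE-L as a balanced F1 over LCS precision and recall.

    Assumption: lowercased whitespace tokenization; a real Vietnamese
    pipeline would likely use a proper word segmenter.
    """
    cand = candidate.lower().split()
    ref = reference.lower().split()
    if not cand or not ref:
        return 0.0
    lcs = lcs_length(cand, ref)
    if lcs == 0:
        return 0.0
    precision = lcs / len(cand)
    recall = lcs / len(ref)
    return 2 * precision * recall / (precision + recall)


def multi_ref_rouge_l(candidate, references):
    """Score against each reference and keep the best match.

    Assumption: max over references; `references` must be non-empty.
    """
    return max(rouge_l_f1(candidate, ref) for ref in references)


# Hypothetical example: one generated answer, two paraphrased references.
generated = "wash your hands often with soap"
paraphrases = [
    "wash hands with soap and water",
    "clean your hands frequently",
]
score = multi_ref_rouge_l(generated, paraphrases)
```

With more paraphrased references, a correct answer that is worded differently has a better chance of overlapping with at least one of them, which is one plausible reason multiple references can help evaluation.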

Authors (4)
  1. Triet Minh Thai (2 papers)
  2. Ngan Ha-Thao Chu (1 paper)
  3. Anh Tuan Vo (1 paper)
  4. Son T. Luu (26 papers)
Citations (3)
