DOI: 10.1145/3587259.3627548

PoQuAD - The Polish Question Answering Dataset - Description and Analysis

Published: 05 December 2023

Abstract

This paper showcases PoQuAD, a SQuAD-like contribution to building Question Answering tools for Polish. It largely follows the usual Machine Reading Comprehension format, but adds a few innovations, the key ones being a lower density of annotation, an abstractive answer layer, and the inclusion of polar (yes/no) questions. Additionally, just as in SQuAD 2.0, 'impossible questions' are included to ensure that models can learn to abstain from answering when the context provides insufficient information. The dataset consists of 70k question-answer pairs with contexts drawn from Polish Wikipedia. A linguistic analysis of the data is provided and discussed, alongside experiments on the baseline performance of Question Answering models. The baseline results are compared against human and few-shot GPT baselines, and analyzed with respect to factors that might affect the difficulty of the task. Both human and machine performance are slightly lower than the figures reported for other similar datasets; it is argued that this is because the alterations with respect to SQuAD make the task more challenging. Additionally, the subtask of recognizing impossible questions remains difficult for Machine Reading Comprehension models. Finally, the robustness of the models is estimated with a selection of adversarial attacks.
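As a rough illustration of the format described above, the sketch below shows what a single PoQuAD-style record might look like and how an off-the-shelf extractive baseline could be queried with the Hugging Face transformers pipeline. The field names (notably is_impossible and generative_answer) and the example record are assumptions modelled on the SQuAD 2.0 schema and on the abstractive answer layer mentioned in the abstract, not the dataset's official layout; the checkpoint is simply a publicly available multilingual BERT fine-tuned on Polish-translated SQuAD 2.0 data.

```python
# Illustrative sketch only: the record below is made up and the field names
# are assumptions modelled on the SQuAD 2.0 schema; PoQuAD's actual JSON
# layout may differ.
from transformers import pipeline

example = {
    # "Maria Skłodowska-Curie was a Polish physicist and chemist,
    #  a two-time Nobel Prize laureate."
    "context": (
        "Maria Skłodowska-Curie była polską fizyczką i chemiczką, "
        "dwukrotną laureatką Nagrody Nobla."
    ),
    "question": "Kim była Maria Skłodowska-Curie?",  # "Who was Maria Skłodowska-Curie?"
    "is_impossible": False,  # SQuAD 2.0-style flag for unanswerable questions
    "answers": {"text": ["polską fizyczką i chemiczką"], "answer_start": [28]},
    # Hypothetical field name for the abstractive answer layer.
    "generative_answer": "Była polską fizyczką i chemiczką.",
}

# A publicly available multilingual BERT fine-tuned on Polish-translated
# SQuAD 2.0, used here only as an off-the-shelf extractive baseline.
qa = pipeline(
    "question-answering",
    model="henryk/bert-base-multilingual-cased-finetuned-polish-squad2",
)

prediction = qa(
    question=example["question"],
    context=example["context"],
    handle_impossible_answer=True,  # allow an empty answer, i.e. abstaining
)
print(prediction["answer"], round(prediction["score"], 3))
```

Evaluating the abstractive answer layer would instead call for a generative (sequence-to-sequence) model and overlap-based metrics rather than exact span matching.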




Published In

K-CAP '23: Proceedings of the 12th Knowledge Capture Conference 2023
December 2023
270 pages
ISBN: 9798400701412
DOI: 10.1145/3587259
Editors: Brent Venable, Daniel Garijo, Brian Jalaian


Publisher

Association for Computing Machinery

New York, NY, United States



Author Tags

  1. Machine Reading Comprehension
  2. Natural Language Processing
  3. Question Answering

Qualifiers

  • Research-article
  • Research
  • Refereed limited


Conference

K-CAP '23: Knowledge Capture Conference 2023
December 5-7, 2023
Pensacola, FL, USA

Acceptance Rates

Overall Acceptance Rate: 55 of 198 submissions, 28%
