DOI: 10.1145/3587259.3627548

PoQuAD - The Polish Question Answering Dataset - Description and Analysis

Published: 05 December 2023

Abstract

This paper showcases PoQuAD, a SQuAD-like contribution to building Question Answering tools for Polish. It largely follows the usual Machine Reading Comprehension format, but adds a few innovations, the key ones being a lower density of annotation, an abstractive answer layer, and the inclusion of polar (yes/no) questions. Additionally, just as in SQuAD 2.0, 'impossible questions' are included to ensure that models can learn to abstain from answering when the context provides insufficient information. The dataset consists of 70k question-answer pairs with contexts drawn from Polish Wikipedia. A linguistic analysis of the data is provided and discussed, alongside experiments on the baseline performance of Question Answering models. The baseline results are compared against human and few-shot GPT baselines, and analyzed with respect to factors that might affect the difficulty of the task. Both human and machine performance are slightly lower than the figures reported for other similar datasets; it is argued that this is because the alterations with respect to SQuAD make the task more challenging. Additionally, the subtask of recognizing impossible questions remains difficult for Machine Reading Comprehension models. Finally, the robustness of the models is estimated with a selection of adversarial attacks.
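As a rough illustration of the format described above, the sketch below shows what a single PoQuAD-style record might look like and how an off-the-shelf extractive baseline could be queried with the Hugging Face transformers pipeline. The field names (notably is_impossible and generative_answer) and the example record are assumptions modelled on the SQuAD 2.0 schema and on the abstractive answer layer mentioned in the abstract, not the dataset's official layout; the checkpoint is simply a publicly available multilingual BERT fine-tuned on Polish-translated SQuAD 2.0 data.

```python
# Illustrative sketch only: the record below is made up and the field names
# are assumptions modelled on the SQuAD 2.0 schema; PoQuAD's actual JSON
# layout may differ.
from transformers import pipeline

example = {
    # "Maria Skłodowska-Curie was a Polish physicist and chemist,
    #  a two-time Nobel Prize laureate."
    "context": (
        "Maria Skłodowska-Curie była polską fizyczką i chemiczką, "
        "dwukrotną laureatką Nagrody Nobla."
    ),
    "question": "Kim była Maria Skłodowska-Curie?",  # "Who was Maria Skłodowska-Curie?"
    "is_impossible": False,  # SQuAD 2.0-style flag for unanswerable questions
    "answers": {"text": ["polską fizyczką i chemiczką"], "answer_start": [28]},
    # Hypothetical field name for the abstractive answer layer.
    "generative_answer": "Była polską fizyczką i chemiczką.",
}

# A publicly available multilingual BERT fine-tuned on Polish-translated
# SQuAD 2.0, used here only as an off-the-shelf extractive baseline.
qa = pipeline(
    "question-answering",
    model="henryk/bert-base-multilingual-cased-finetuned-polish-squad2",
)

prediction = qa(
    question=example["question"],
    context=example["context"],
    handle_impossible_answer=True,  # allow an empty answer, i.e. abstaining
)
print(prediction["answer"], round(prediction["score"], 3))
```

Evaluating the abstractive answer layer would instead call for a generative (sequence-to-sequence) model and overlap-based metrics rather than exact span matching.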




Published In

K-CAP '23: Proceedings of the 12th Knowledge Capture Conference 2023
December 2023
270 pages
ISBN: 9798400701412
DOI: 10.1145/3587259
Editors: Brent Venable, Daniel Garijo, Brian Jalaian


Publisher

Association for Computing Machinery

New York, NY, United States



Author Tags

  1. Machine Reading Comprehension
  2. Natural Language Processing
  3. Question Answering

Qualifiers

  • Research-article
  • Research
  • Refereed limited


Conference

K-CAP '23: Knowledge Capture Conference 2023
December 5-7, 2023
Pensacola, FL, USA

Acceptance Rates

Overall Acceptance Rate: 55 of 198 submissions, 28%
