Multilingual and cross-lingual document classification: A meta-learning approach

van der Heijden, Niels; Yannakoudakis, Helen; Mishra, Pushkar; Shutova, Ekaterina

arXiv:2101.11302 (cs)

[Submitted on 27 Jan 2021 (v1), last revised 24 Apr 2021 (this version, v2)]

Abstract: The great majority of languages in the world are considered under-resourced for the successful application of deep learning methods. In this work, we propose a meta-learning approach to document classification in limited-resource settings and demonstrate its effectiveness in two scenarios: few-shot, cross-lingual adaptation to previously unseen languages; and multilingual joint training when limited target-language data is available during training. We conduct a systematic comparison of several meta-learning methods, investigate multiple settings in terms of data availability, and show that meta-learning thrives in settings with a heterogeneous task distribution. We propose a simple yet effective adjustment to existing meta-learning methods that allows for better and more stable learning, and we set a new state of the art on several languages while performing on par on others, using only a small amount of labeled data.

Comments:	11 pages, 1 figure
Subjects:	Computation and Language (cs.CL)
Cite as:	arXiv:2101.11302 [cs.CL]
	(or arXiv:2101.11302v2 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2101.11302
Journal reference:	Association for Computational Linguistics, Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume, 2021, 1966--1976

Submission history

From: Niels van der Heijden
[v1] Wed, 27 Jan 2021 10:22:56 UTC (97 KB)
[v2] Sat, 24 Apr 2021 10:24:38 UTC (97 KB)
