The Fine Line between Linguistic Generalization and Failure in Seq2Seq-Attention Models

Weber, Noah; Shekhar, Leena; Balasubramanian, Niranjan

arXiv:1805.01445 (cs)

[Submitted on 3 May 2018 (v1), last revised 8 May 2018 (this version, v2)]


Abstract: Seq2Seq-based neural architectures have become the go-to choice for sequence-to-sequence language tasks. Despite their excellent performance on these tasks, recent work has noted that these models usually do not fully capture the linguistic structure required to generalize beyond the dense sections of the data distribution \cite{ettinger2017towards}, and as such are likely to fail on samples from the tail of the distribution (such as inputs that are noisy \citep{belkinovnmtbreak} or of different lengths \citep{bentivoglinmtlength}). In this paper, we look at a model's ability to generalize on a simple symbol rewriting task with a clearly defined structure. We find that the model's ability to generalize this structure beyond the training distribution depends greatly on the chosen random seed, even when performance on the standard test set remains the same. This suggests that a model's ability to capture generalizable structure is highly sensitive to the random seed. Moreover, this sensitivity may not be apparent when the model is evaluated only on standard test sets.

Comments:	Workshop on New Forms of Generalization in Deep Learning and NLP (NAACL 2018), revised to update some references
Subjects:	Computation and Language (cs.CL)
Cite as:	arXiv:1805.01445 [cs.CL]
	(or arXiv:1805.01445v2 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.1805.01445

Submission history

From: Noah Weber
[v1] Thu, 3 May 2018 17:45:33 UTC (31 KB)
[v2] Tue, 8 May 2018 18:19:33 UTC (31 KB)