Gating Mechanisms for Combining Character and Word-level Word Representations: an Empirical Study

Jorge Balazs, Yutaka Matsuo


Abstract
In this paper we study how different ways of combining character and word-level representations affect the quality of both final word and sentence representations. We provide strong empirical evidence that modeling characters improves the learned representations at the word and sentence levels, and that doing so is particularly useful when representing less frequent words. We further show that a feature-wise sigmoid gating mechanism is a robust method for creating representations that encode semantic similarity, as it performed reasonably well in several word similarity datasets. Finally, our findings suggest that properly capturing semantic similarity at the word level does not consistently yield improved performance in downstream sentence-level tasks.
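As an illustration of the feature-wise sigmoid gating named in the abstract, the sketch below mixes a word-level vector and a character-level vector using one gate value per feature. This page does not give the formulation, so the details are assumptions: the gate is computed here from the word-level vector through a hypothetical weight matrix `weights` and bias `bias`, and the function name `feature_wise_gate` is illustrative only. It should be read as a minimal sketch, not as the authors' implementation.

```python
import math


def sigmoid(x: float) -> float:
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def feature_wise_gate(word_vec, char_vec, weights, bias):
    """Mix two d-dimensional vectors with one sigmoid gate per feature.

    Assumption: the gate is computed from the word-level vector as
    g = sigmoid(W @ word_vec + b), and the output is
    g * char_vec + (1 - g) * word_vec, applied element-wise.
    """
    d = len(word_vec)
    if not (len(char_vec) == d and len(weights) == d and len(bias) == d):
        raise ValueError("all inputs must share the same dimension d")

    mixed = []
    for i in range(d):
        # Pre-activation for feature i: row i of W dotted with word_vec, plus b[i].
        pre = sum(weights[i][j] * word_vec[j] for j in range(d)) + bias[i]
        g = sigmoid(pre)
        # Convex combination of the two representations for this feature.
        mixed.append(g * char_vec[i] + (1.0 - g) * word_vec[i])
    return mixed


if __name__ == "__main__":
    word_vec = [1.0, 2.0]
    char_vec = [3.0, 6.0]
    zero_w = [[0.0, 0.0], [0.0, 0.0]]
    zero_b = [0.0, 0.0]
    # With zero weights and bias every gate is 0.5, so the result is the mean.
    print(feature_wise_gate(word_vec, char_vec, zero_w, zero_b))
```

Because each gate lies strictly between 0 and 1, every output feature falls between the corresponding word-level and character-level values. A scalar gate would instead apply a single mixing weight to all features at once.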
Anthology ID:
N19-3016
Volume:
Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Student Research Workshop
Month:
June
Year:
2019
Address:
Minneapolis, Minnesota
Editors:
Sudipta Kar, Farah Nadeem, Laura Burdick, Greg Durrett, Na-Rae Han
Venue:
NAACL
Publisher:
Association for Computational Linguistics
Pages:
110–124
URL:
https://aclanthology.org/N19-3016
DOI:
10.18653/v1/N19-3016
Cite (ACL):
Jorge Balazs and Yutaka Matsuo. 2019. Gating Mechanisms for Combining Character and Word-level Word Representations: an Empirical Study. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Student Research Workshop, pages 110–124, Minneapolis, Minnesota. Association for Computational Linguistics.
Cite (Informal):
Gating Mechanisms for Combining Character and Word-level Word Representations: an Empirical Study (Balazs & Matsuo, NAACL 2019)
PDF:
https://aclanthology.org/N19-3016.pdf
Poster:
 N19-3016.Poster.pdf
Code
 jabalazs/gating
Data
GLUE, MPQA Opinion Corpus, MultiNLI, SICK, SNLI, SST, SST-2, SST-5, SentEval