Learning Zero-Shot Multifaceted Visually Grounded Word Embeddings via Multi-Task Training

Shahmohammadi, Hassan; Lensch, Hendrik P. A.; Baayen, R. Harald

arXiv:2104.07500 (cs)

[Submitted on 15 Apr 2021 (v1), last revised 13 Sep 2021 (this version, v2)]

Abstract: Language grounding aims at linking the symbolic representation of language (e.g., words) to the rich perceptual knowledge of the outside world. The general approach is to embed both textual and visual information into a common space (the grounded space) confined by an explicit relationship between the two modalities. We argue that this approach sacrifices the abstract knowledge obtained from linguistic co-occurrence statistics in the process of acquiring perceptual information. The focus of this paper is to solve this issue by grounding the word embeddings implicitly. Rather than learning two mappings into a joint space, our approach integrates the modalities by determining a reversible grounded mapping between the textual and the grounded space by means of multi-task learning. Evaluations on intrinsic and extrinsic tasks show that our embeddings are highly beneficial for both abstract and concrete words. They are strongly correlated with human judgments and outperform previous work on a wide range of benchmarks. Our grounded embeddings are publicly available.

Comments:	To be published in the 25th Conference on Computational Natural Language Learning (CoNLL 2021)
Subjects:	Computation and Language (cs.CL)
Cite as:	arXiv:2104.07500 [cs.CL]
	(or arXiv:2104.07500v2 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2104.07500

Submission history

From: Hassan Shahmohammadi
[v1] Thu, 15 Apr 2021 14:49:11 UTC (1,388 KB)
[v2] Mon, 13 Sep 2021 19:59:45 UTC (6,325 KB)
