LEWIS: Latent Embeddings for Word Images and their Semantics

Gordo, Albert; Almazan, Jon; Murray, Naila; Perronnin, Florent

arXiv:1509.06243 (cs)

[Submitted on 21 Sep 2015]

Abstract: The goal of this work is to bring semantics into the tasks of text recognition and retrieval in natural images. Although text recognition and retrieval have received a lot of attention in recent years, previous works have focused on recognizing or retrieving exactly the same word used as a query, without taking the semantics into consideration.
In this paper, we ask the following question: can we predict semantic concepts directly from a word image, without explicitly trying to transcribe the word image or its characters at any point? To this end, we propose a convolutional neural network (CNN) with a weighted ranking loss objective that ensures that the concepts relevant to the query image are ranked ahead of those that are not relevant. This can also be interpreted as learning a Euclidean space where word images and concepts are jointly embedded. The model is learned in an end-to-end manner, from image pixels to semantic concepts, using a dataset of synthetically generated word images and concepts mined from a lexical database (WordNet). Our results show that, despite the complexity of the task, word images and concepts can indeed be associated with a high degree of accuracy.
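
To make the ranking idea concrete, the following is a minimal sketch of a weighted ranking loss over a joint embedding space. It is not the formulation from the paper: the scoring function (negative squared Euclidean distance), the harmonic rank weighting, the margin value, and every name (`weighted_ranking_loss`, `rank_weight`, `image_vec`, `concept_vecs`) are assumptions chosen for illustration. In the described model the image embedding would come from the CNN; here it is just a list of floats.

```python
def score(image_vec, concept_vec):
    """Assumed compatibility score: negative squared Euclidean distance."""
    return -sum((a - b) ** 2 for a, b in zip(image_vec, concept_vec))


def rank_weight(k):
    """Assumed weighting: harmonic sum 1 + 1/2 + ... + 1/k.

    It grows with the number k of irrelevant concepts that violate the
    margin, so badly ranked relevant concepts are penalized more.
    """
    return sum(1.0 / i for i in range(1, k + 1))


def weighted_ranking_loss(image_vec, concept_vecs, relevant, margin=1.0):
    """Hinge-style ranking loss for one word image.

    relevant is the set of indices into concept_vecs that are relevant
    to the image. The loss is zero when every relevant concept scores
    at least `margin` higher than every irrelevant one.
    """
    scores = [score(image_vec, c) for c in concept_vecs]
    negatives = [i for i in range(len(scores)) if i not in relevant]
    loss = 0.0
    for p in sorted(relevant):
        # Margin violation of each irrelevant concept against relevant p.
        hinges = [max(0.0, margin - scores[p] + scores[n]) for n in negatives]
        violations = sum(1 for h in hinges if h > 0)
        if violations == 0:
            continue
        # Average violation, scaled by the rank-dependent weight.
        loss += rank_weight(violations) * sum(hinges) / violations
    return loss


# Toy example: one 2-D image embedding and three concept embeddings.
image = [1.0, 0.0]
concepts = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
well_ranked = weighted_ranking_loss(image, concepts, relevant={0})
badly_ranked = weighted_ranking_loss(image, concepts, relevant={2})
```

In the toy example, the loss is zero when the relevant concept is the one closest to the image embedding, and positive when the relevant concept is the farthest one. Training would minimize this quantity with respect to both the image embedding network and the concept embeddings.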

Comments:	Accepted for publication at the International Conference on Computer Vision (ICCV) 2015
Subjects:	Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:1509.06243 [cs.CV]
	(or arXiv:1509.06243v1 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.1509.06243

Submission history

From: Albert Gordo
[v1] Mon, 21 Sep 2015 14:32:43 UTC (4,088 KB)
