research-article

Illustrate Your Story: Enriching Text with Images

Authors:

Sreyasi Nag Chowdhury,

William Cheng,

Gerard de Melo,

Simon Razniewski,

Gerhard Weikum

WSDM '20: Proceedings of the 13th International Conference on Web Search and Data Mining

Pages 849 - 852

https://doi.org/10.1145/3336191.3371866

Published: 22 January 2020

Abstract

Human perception is known to be predominantly visual. As modern web infrastructure made it easier to store media, web content shifted from text-only documents to documents that combine text and images. The many blog posts, news articles, and social media posts on the Internet today are examples of such multimodal stories. Manually aligning images and text in a story is time-consuming and labor-intensive. We present a web application that automatically selects relevant images from an album and places them in suitable contexts within a body of text. The application solves a global optimization problem that maximizes the coherence between text paragraphs and image descriptors, and it lets users explore the underlying image descriptors and similarity metrics. Experiments show that our method aligns images with text with high semantic fit and to the satisfaction of users.
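
The idea of placing images by optimizing a global coherence score can be illustrated with a minimal sketch. This is not the application's actual solver: the abstract does not specify the objective or the similarity metric, so the sketch assumes Jaccard overlap between paragraph words and image tags as a stand-in for coherence, and it searches all assignments exhaustively, which only works for small inputs. The function names (`tokenize`, `jaccard`, `align_images`) and the sample data are hypothetical.

```python
from itertools import permutations


def tokenize(text):
    """Lowercase a paragraph, split on whitespace, and strip basic punctuation."""
    return {w.strip(".,;:!?\"'()").lower() for w in text.split()} - {""}


def jaccard(a, b):
    """Jaccard overlap between two token sets (assumed similarity metric)."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def align_images(paragraphs, image_tags):
    """Assign each image to a distinct paragraph, maximizing summed similarity.

    Exhaustive search over all injective image-to-paragraph assignments.
    Assumes there are at least as many paragraphs as images.
    """
    para_tokens = [tokenize(p) for p in paragraphs]
    names = list(image_tags)
    best_score, best_assignment = -1.0, {}
    for slots in permutations(range(len(paragraphs)), len(names)):
        score = sum(
            jaccard(image_tags[name], para_tokens[slot])
            for name, slot in zip(names, slots)
        )
        if score > best_score:
            best_score, best_assignment = score, dict(zip(names, slots))
    return best_assignment, best_score


# Hypothetical story and album, with tags standing in for image descriptors.
paragraphs = [
    "We hiked up the mountain trail at sunrise.",
    "Dinner was fresh pasta at a small restaurant.",
    "The beach was empty in the evening.",
]
image_tags = {
    "img_beach.jpg": {"beach", "sea", "sand"},
    "img_food.jpg": {"pasta", "restaurant", "plate"},
}
assignment, score = align_images(paragraphs, image_tags)
print(assignment, round(score, 3))
```

The sketch scores whole assignments rather than picking the best paragraph for each image independently, which is what makes the optimization global: two images cannot claim the same paragraph. It places every image it is given and leaves out the selection step (choosing which album images to use at all).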

Index Terms

1. Applied computing
  1. Document management and text processing
    1. Document preparation
      1. Multi / mixed media creation
2. Information systems
  1. Information systems applications
    1. Multimedia information systems
      1. Multimedia content creation
  2. World Wide Web
    1. Web mining
      1. Data extraction and integration

Published In

WSDM '20: Proceedings of the 13th International Conference on Web Search and Data Mining

January 2020

950 pages

ISBN:9781450368223

DOI:10.1145/3336191

General Chairs: James Caverlee (Texas A&M University), Xia "Ben" Hu (Texas A&M University)

Program Chairs: Mounia Lalmas (Spotify), Wei Wang (University of California, Los Angeles)

Publisher

Association for Computing Machinery

New York, NY, United States

Conference

WSDM '20: The Thirteenth ACM International Conference on Web Search and Data Mining

February 3 - 7, 2020

Houston, TX, USA

Acceptance Rates

Overall Acceptance Rate 498 of 2,863 submissions, 17%

Article Metrics

Total Citations: 4
Total Downloads: 224
Downloads (last 12 months): 11
Downloads (last 6 weeks): 0

Reflects downloads up to 23 Dec 2024
