DOI: 10.1145/3460426.3463611
Research article · Open access

Image-to-Image Transfer Makes Chaos to Order

Published: 01 September 2021

Abstract

GAN-based image-to-image transfer tools have achieved remarkable results in image generation. However, most research efforts focus on changing style features, e.g., color and texture, while spatial features, e.g., the locations of objects in the input and output images, remain consistent. If these tools are employed to translate locations, such as rearranging objects from a chaotic scene into an orderly scene (i.e., chaos to order), can they still work well? We investigate this problem of image-to-image location transfer and reach a preliminary conclusion: it is hard to manipulate the spatial features of objects in raw images automatically. In this paper, we propose a novel framework called LT-GAN to address this issue. Specifically, we design a multi-stage generation structure in which location translation is performed on semantic labels, which serve as a bridge to improve the automatic manipulation of the spatial features of raw images. Experimental results demonstrate the effectiveness of the proposed multi-stage generation strategy. In addition, we introduce a Color Histogram Loss that measures the similarity of the color distributions of a chaotic scene image and the corresponding orderly scene image. Combining feature extraction with the Color Histogram Loss significantly improves the quality of the orderly scene images generated by the final stage of LT-GAN. Moreover, to overcome the limitations of public datasets for image-to-image transfer tasks, we construct a new dataset named M2C for this new application scenario of location transfer, comprising more than 15,000 paired images and their corresponding semantic labels. The dataset is available at https://drive.google.com/open?id=1amr9ga9wvhnIzeZ48OHbLapHGqOb4-Up
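The abstract does not give the exact formulation of the Color Histogram Loss used in LT-GAN. As one plausible instantiation of "similarity of color distribution between a chaotic scene image and the corresponding orderly scene image", the sketch below compares normalized per-channel color histograms with an L1 distance; the function name, bin count, and distance choice are illustrative assumptions, not the paper's definition:

```python
import numpy as np

def color_histogram_loss(img_a, img_b, bins=64):
    """Illustrative color-histogram loss between two H x W x 3 uint8 images.

    Computes an L1 distance between the normalized per-channel color
    histograms, averaged over the R, G, B channels. A value of 0 means
    identical color distributions (regardless of object locations).
    """
    loss = 0.0
    for c in range(3):  # R, G, B channels
        ha, _ = np.histogram(img_a[..., c], bins=bins, range=(0, 255))
        hb, _ = np.histogram(img_b[..., c], bins=bins, range=(0, 255))
        ha = ha / ha.sum()  # normalize counts to probability distributions
        hb = hb / hb.sum()
        loss += np.abs(ha - hb).sum()
    return loss / 3.0

# A chaotic scene and a rearranged (orderly) version of the same pixels
# share the same color distribution, so the loss is zero.
rng = np.random.default_rng(0)
chaotic = rng.integers(0, 256, size=(32, 32, 3), dtype=np.uint8)
orderly = np.sort(chaotic, axis=0)  # rearranged pixels, same colors
print(color_histogram_loss(chaotic, orderly))  # → 0.0
```

Because histograms discard spatial layout, such a loss constrains the generated orderly image to reuse the colors of the chaotic input while leaving the generator free to move objects, which matches the location-transfer goal described above.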


Published In

ICMR '21: Proceedings of the 2021 International Conference on Multimedia Retrieval
August 2021
715 pages
ISBN:9781450384636
DOI:10.1145/3460426

Publisher

Association for Computing Machinery

New York, NY, United States



Author Tags

  1. datasets
  2. generative adversarial networks
  3. image-to-image
  4. location transfer

Conference: ICMR '21
Overall Acceptance Rate: 254 of 830 submissions, 31%
