Social-sensed Image Aesthetics Assessment

Published: 31 December 2020

Abstract

Image aesthetics assessment aims to give computers the ability to judge the aesthetic value of images, and its potential has been recognized in a variety of applications. Most previous studies assess aesthetics purely from image content. However, because aesthetic perception is a human cognitive activity, users’ perception of an image should be considered when judging its aesthetic quality. In this article, we regard users’ social behavior as a reflection of their perception of images and use these additional clues to improve image aesthetics assessment. Specifically, we first merge the raw social interactions between users and images into clusters that serve as the social labels of images, so that the collective social behavioral information associated with an image is represented in a structured and compact space. Then, we develop a novel deep multi-task network that jointly learns social labels in different modalities from social images, and we apply it to common web images. In this manner, our approach generalizes readily to web images that carry no social behavioral information. Finally, we introduce a high-level fusion sub-network into the aesthetics model, which balances the social and visual representations of images for aesthetics assessment. Experimental results on two benchmark datasets verify the effectiveness of our approach and highlight the benefits of different types of social behavioral information for image aesthetics assessment.
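
The first step described above, turning raw user-image interactions into compact social labels, can be illustrated with a minimal sketch. The abstract does not name the clustering algorithm or the interaction types, so everything here is an assumption made for illustration: plain k-means over binary "favorite" vectors, the hypothetical helper names `kmeans` and `social_labels`, and the toy data. Each image's social label is taken to be the distribution of its favoriting users over the user clusters.

```python
from collections import Counter


def kmeans(points, k, iters=10):
    """Plain Lloyd's k-means. Assumption: deterministic init from the
    first k points, which is only adequate for a toy example."""
    centroids = [list(p) for p in points[:k]]
    assign = [0] * len(points)
    for _ in range(iters):
        # Assignment step: nearest centroid by squared Euclidean distance.
        assign = [
            min(range(k),
                key=lambda c: sum((x - y) ** 2 for x, y in zip(p, centroids[c])))
            for p in points
        ]
        # Update step: each centroid becomes the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assign) if a == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return assign


def social_labels(favorites, images, k):
    """Map each image to a distribution over k user clusters.

    favorites: dict of user id -> set of image ids the user favorited
    (a hypothetical stand-in for the raw social interactions).
    """
    users = sorted(favorites)
    # Represent every user as a binary vector over the image set.
    vectors = [[1.0 if img in favorites[u] else 0.0 for img in images]
               for u in users]
    cluster_of = dict(zip(users, kmeans(vectors, k)))
    labels = {}
    for img in images:
        counts = Counter(cluster_of[u] for u in users if img in favorites[u])
        total = sum(counts.values())
        labels[img] = [counts[c] / total if total else 0.0 for c in range(k)]
    return labels


# Toy data, invented for this sketch.
favorites = {
    "alice": {"sunset", "beach"},
    "bob": {"cat", "dog"},
    "carol": {"sunset", "beach"},
    "dave": {"cat", "dog", "sunset"},
}
images = ["sunset", "beach", "cat", "dog"]
labels = social_labels(favorites, images, k=2)
for img in images:
    print(img, labels[img])
```

In this toy setting, an image favorited by users from a single cluster gets a one-hot label, while an image favorited across clusters gets a mixed distribution. Label vectors of this kind are the sort of compact target a multi-task network could be trained to predict from pixels alone, which is what would allow the approach to extend to images with no interaction data.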

Published In

ACM Transactions on Multimedia Computing, Communications, and Applications, Volume 16, Issue 3s
Special Issue on Privacy and Security in Evolving Internet of Multimedia Things and Regular Papers
October 2020
190 pages
ISSN:1551-6857
EISSN:1551-6865
DOI:10.1145/3444536
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 31 December 2020
Accepted: 01 July 2020
Revised: 01 April 2020
Received: 01 December 2019
Published in TOMM Volume 16, Issue 3s

Author Tags

  1. Image aesthetics assessment
  2. multi-task learning
  3. social sense
  4. user perception modeling

Qualifiers

  • Research-article
  • Research
  • Refereed

Funding Sources

  • National Natural Science Foundation of China
  • Fostering Project of Dominant Discipline and Talent Team of Shandong Province Higher Education Institutions
  • National Key R&D Program of China
  • Natural Science Foundation of Shandong Province
