DOI: 10.1145/3664647.3680972
research-article

Semantics-Aware Image Aesthetics Assessment using Tag Matching and Contrastive Ranking

Published: 28 October 2024

Abstract

The perception of image aesthetics is built upon the understanding of semantic content. However, evaluating the aesthetic quality of images with diverse semantic backgrounds remains challenging in image aesthetics assessment (IAA). To address this problem, this paper presents a semantics-aware image aesthetics assessment approach, which first analyzes the semantic content of images and then models the aesthetic distinctions among images from two perspectives: aesthetic attribute and aesthetic level. Concretely, we propose two strategies, dubbed tag matching and contrastive ranking, to extract knowledge pertaining to image aesthetics. Tag matching identifies the semantic category and the dominant aesthetic attributes based on predefined tag libraries. Contrastive ranking is designed to uncover the comparative relationships among images with different aesthetic levels but similar semantic backgrounds. During contrastive ranking, the long-tailed distribution of aesthetic data is also addressed through balanced sampling and traversal contrastive learning. Extensive experiments and comparisons on three benchmark IAA databases demonstrate the superior performance of the proposed model in terms of both prediction accuracy and alleviation of the long-tailed effect. The code will be made public at https://github.com/yzc-ippl/TMCR.
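The following is a minimal illustrative sketch of the three ideas named in the abstract, not the authors' implementation. It assumes that images and tags are already embedded in a shared space (for example by a vision-language model), that tag matching can be approximated by cosine similarity against a tag library, and that the comparative relationship between aesthetic levels can be approximated by a pairwise margin ranking loss, which is swapped in here in place of the paper's contrastive objective. All function names, tag names, and the margin value are hypothetical.

```python
import numpy as np


def match_tags(image_emb, tag_embs, tag_names, top_k=1):
    """Return the top_k tag names ranked by cosine similarity to the image.

    Assumption: image_emb (d,) and tag_embs (n, d) come from the same
    embedding space; how they are produced is outside this sketch.
    """
    img = image_emb / np.linalg.norm(image_emb)
    tags = tag_embs / np.linalg.norm(tag_embs, axis=1, keepdims=True)
    sims = tags @ img
    order = np.argsort(-sims)[:top_k]
    return [tag_names[i] for i in order]


def balanced_sample(levels, per_level, rng):
    """Draw per_level indices from every aesthetic level.

    Rare levels are sampled with replacement so that each level
    contributes equally to a batch (a simple stand-in for the
    balanced sampling mentioned in the abstract).
    """
    levels = np.asarray(levels)
    picks = []
    for lv in np.unique(levels):
        idx = np.flatnonzero(levels == lv)
        picks.append(rng.choice(idx, size=per_level, replace=len(idx) < per_level))
    return np.concatenate(picks)


def pairwise_ranking_loss(pred, target, margin=0.1):
    """Mean hinge loss over all ordered pairs whose target scores differ.

    A pair contributes zero loss when the predicted scores are ordered
    like the targets and separated by at least `margin`.
    """
    pred = np.asarray(pred, dtype=float)
    target = np.asarray(target, dtype=float)
    diff_pred = pred[:, None] - pred[None, :]
    sign = np.sign(target[:, None] - target[None, :])
    mask = sign != 0
    if not mask.any():
        return 0.0
    return float(np.maximum(0.0, margin - sign * diff_pred)[mask].mean())


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Hypothetical two-entry semantic tag library.
    names = ["landscape", "portrait"]
    tag_embs = np.array([[1.0, 0.0], [0.0, 1.0]])
    print(match_tags(np.array([0.9, 0.1]), tag_embs, names))
    # Long-tailed levels: four images at level 0, one at level 1.
    print(balanced_sample([0, 0, 0, 0, 1], per_level=2, rng=rng))
    print(pairwise_ranking_loss([0.9, 0.1], [8.0, 3.0]))
```

In this sketch the ranking loss would be computed only among images that share a matched semantic tag, mirroring the abstract's restriction to images with similar semantic backgrounds; that grouping step is left out for brevity.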

    Published In

    MM '24: Proceedings of the 32nd ACM International Conference on Multimedia
    October 2024
    11719 pages
    ISBN:9798400706868
    DOI:10.1145/3664647
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. clip
    2. contrastive learning
    3. image aesthetics assessment
    4. semantic and aesthetic perception

    Qualifiers

    • Research-article

    Funding Sources

    • the OPPO Research Fund
    • the National Natural Science Foundation of China

    Conference

    MM '24
    MM '24: The 32nd ACM International Conference on Multimedia
    October 28 - November 1, 2024
    Melbourne VIC, Australia

    Acceptance Rates

    MM '24 Paper Acceptance Rate 1,150 of 4,385 submissions, 26%;
    Overall Acceptance Rate 2,145 of 8,556 submissions, 25%
