Research Article

Image Quality Assessment Using Contrastive Learning

Published: 01 January 2022

Abstract

We consider the problem of obtaining image quality representations in a self-supervised manner. We use prediction of distortion type and degree as an auxiliary task to learn features from an unlabeled image dataset containing a mixture of synthetic and realistic distortions. We then train a deep Convolutional Neural Network (CNN) using a contrastive pairwise objective to solve the auxiliary problem. We refer to the proposed training framework and resulting deep IQA model as the CONTRastive Image QUality Evaluator (CONTRIQUE). During evaluation, the CNN weights are frozen and a linear regressor maps the learned representations to quality scores in a No-Reference (NR) setting. We show through extensive experiments that CONTRIQUE achieves competitive performance when compared to state-of-the-art NR image quality models, even without any additional fine-tuning of the CNN backbone. The learned representations are highly robust and generalize well across images afflicted by either synthetic or authentic distortions. Our results suggest that powerful quality representations with perceptual relevance can be obtained without requiring large labeled subjective image quality datasets. The implementations used in this paper are available at https://github.com/pavancm/CONTRIQUE.
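As a concrete illustration of the two stages the abstract describes, here is a minimal sketch, not the authors' released implementation (see the repository above for that). Stage 1 trains a CNN encoder with a pairwise contrastive objective in which images sharing a distortion type and degree act as positives; Stage 2 freezes the encoder and fits a linear regressor to subjective quality scores. The loss variant (a supervised-contrastive-style formulation), the ridge regressor, and names such as `contrastive_loss`, `distortion_labels`, and `mos` are illustrative assumptions; the paper's actual objective, augmentations, and multiscale processing differ in detail.

```python
# Illustrative sketch only -- not the authors' released CONTRIQUE code.
import torch
import torch.nn.functional as F
import torchvision.models as models
from sklearn.linear_model import Ridge

def contrastive_loss(z, labels, temperature=0.1):
    """Pairwise contrastive objective: embeddings that share a distortion
    type/degree label are treated as positives, all other pairs as
    negatives (a supervised-contrastive-style stand-in for the paper's
    objective)."""
    z = F.normalize(z, dim=1)                        # unit-norm embeddings
    sim = z @ z.t() / temperature                    # cosine similarities
    self_mask = torch.eye(len(z), dtype=torch.bool, device=z.device)
    sim.masked_fill_(self_mask, float('-inf'))       # exclude self-pairs
    pos = (labels.unsqueeze(0) == labels.unsqueeze(1)) & ~self_mask
    log_prob = sim - torch.logsumexp(sim, dim=1, keepdim=True)
    # negative log-likelihood averaged over all positive pairs in the batch
    return -log_prob[pos].sum() / pos.sum().clamp(min=1)

# Stage 1: encoder + projection head trained with the contrastive objective.
encoder = models.resnet50(weights=None)
encoder.fc = torch.nn.Identity()                     # expose 2048-d features
head = torch.nn.Linear(2048, 128)                    # projection head

images = torch.randn(8, 3, 224, 224)                 # dummy unlabeled batch
distortion_labels = torch.randint(0, 4, (8,))        # dummy distortion classes
loss = contrastive_loss(head(encoder(images)), distortion_labels)
loss.backward()                                      # gradients for one step

# Stage 2: freeze the CNN and map its representations to quality scores
# with a linear regressor (ridge regression here), as in the NR setting.
encoder.eval()
with torch.no_grad():
    feats = encoder(images).numpy()                  # frozen representations
mos = torch.rand(8).numpy()                          # dummy MOS labels
regressor = Ridge(alpha=1.0).fit(feats, mos)
predicted_quality = regressor.predict(feats)
```

Because the backbone stays frozen, only the final linear map is learned per IQA database, which is what lets the same representations be evaluated across both synthetic and authentic distortion datasets.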

Published In

IEEE Transactions on Image Processing, Volume 31, 2022

Publisher

IEEE Press
