DOI: 10.1145/3664647.3680907

Highly Efficient No-reference 4K Video Quality Assessment with Full-Pixel Covering Sampling and Training Strategy

Published: 28 October 2024

Abstract

Deep Video Quality Assessment (VQA) methods have demonstrated impressive performance. No-reference (NR) VQA methods, in particular, are vital in situations where reference videos are restricted or unavailable. However, as more streaming video is produced in ultra-high definition (e.g., 4K) to enrich the viewing experience, current deep VQA methods incur unacceptable computational costs. Moreover, the resizing, cropping, and local sampling techniques these methods employ can discard details and content of the original 4K videos, degrading quality assessment. In this paper, we propose a highly efficient NR 4K VQA method. First, a novel data sampling and training strategy tackles the problem of excessive resolution: it allows a Swin Transformer-based VQA model to train and infer on the full pixel data of 4K videos using standard consumer-grade GPUs without sacrificing content or detail. Second, a weighting and scoring scheme mimics human subjective perception by accounting for the distinct impact of each sub-region of a 4K frame on overall perceived quality. Third, we incorporate the frequency-domain information of video frames to better capture the details that affect video quality, further improving the model's generalizability. To our knowledge, this is the first method designed specifically for the NR 4K VQA task. Thorough empirical studies show that it not only significantly outperforms existing methods on a specialized 4K VQA dataset but also achieves state-of-the-art performance on multiple open-source NR video quality datasets.
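
As a concrete illustration of the three ideas above, the following minimal PyTorch sketch tiles a full 3840x2160 frame into non-overlapping sub-regions so that every pixel is covered exactly once (no resizing or cropping), scores each sub-region, combines the scores with softmax weights as a stand-in for the perception-based weighting scheme, and appends a simple FFT-based frequency feature. The tile size, the tiny convolutional scorer (used here in place of the Swin Transformer backbone), and the weighting head are illustrative assumptions, not the authors' implementation.

    # Minimal sketch (not the paper's code) of full-pixel tiling, weighted scoring,
    # and a frequency-domain feature, under the assumptions stated above.
    import torch
    import torch.nn as nn

    TILE = 240  # hypothetical tile size; a 3840 x 2160 frame splits into 16 x 9 = 144 tiles

    def tile_frame(frame: torch.Tensor) -> torch.Tensor:
        """Split a (C, H, W) frame into non-overlapping (N, C, TILE, TILE) tiles so that
        every pixel is covered exactly once (no resizing, cropping, or sparse sampling)."""
        c, h, w = frame.shape
        assert h % TILE == 0 and w % TILE == 0, "sketch assumes resolution divisible by TILE"
        tiles = frame.unfold(1, TILE, TILE).unfold(2, TILE, TILE)  # (C, H/T, W/T, T, T)
        return tiles.permute(1, 2, 0, 3, 4).reshape(-1, c, TILE, TILE)

    def frequency_feature(tiles: torch.Tensor) -> torch.Tensor:
        """One scalar per tile: mean log-magnitude of the 2-D FFT of the grayscale tile,
        a crude proxy for how much high-frequency detail the tile contains."""
        spectrum = torch.fft.fft2(tiles.mean(dim=1))                       # (N, T, T)
        return torch.log1p(spectrum.abs()).mean(dim=(1, 2)).unsqueeze(1)   # (N, 1)

    class TinyTileScorer(nn.Module):
        """Toy stand-in for the Swin Transformer backbone: maps each tile plus its
        frequency feature to a quality score and an importance weight, then aggregates."""

        def __init__(self):
            super().__init__()
            self.features = nn.Sequential(
                nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
                nn.AdaptiveAvgPool2d(1), nn.Flatten())
            self.head = nn.Linear(16 + 1, 2)                    # -> [score, weight logit]

        def forward(self, tiles: torch.Tensor) -> torch.Tensor:
            x = torch.cat([self.features(tiles), frequency_feature(tiles)], dim=1)
            scores, weight_logits = self.head(x).unbind(dim=1)
            weights = torch.softmax(weight_logits, dim=0)       # per-sub-region importance
            return (weights * scores).sum()                     # weighted frame-level score

    frame = torch.rand(3, 2160, 3840)           # one full-resolution 4K frame
    model = TinyTileScorer()
    print(model(tile_frame(frame)).item())      # predicted quality for the whole frame

The point of the tiling is that each sub-region is small enough to fit comfortably on a consumer-grade GPU, while the union of tiles still covers every pixel of the 4K frame.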

      Published In

      MM '24: Proceedings of the 32nd ACM International Conference on Multimedia
      October 2024
      11719 pages
      ISBN:9798400706868
      DOI:10.1145/3664647

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Author Tags

1. 4K video quality assessment
2. 4K video sampling strategy
      3. network training
      4. transformer

      Qualifiers

      • Research-article

      Conference

MM '24: The 32nd ACM International Conference on Multimedia
October 28 - November 1, 2024
Melbourne, VIC, Australia

      Acceptance Rates

      MM '24 Paper Acceptance Rate 1,150 of 4,385 submissions, 26%;
      Overall Acceptance Rate 2,145 of 8,556 submissions, 25%
