research-article

A Super-Resolution Flexible Video Coding Solution for Improving Live Streaming Quality

Published: 01 January 2023

Abstract

With the growing popularity of live video streaming, ensuring high video quality has become one of the most significant challenges faced by live streaming platforms. Insufficient uplink bandwidth is an important factor in live video transmission: it limits bitrate, increases latency, and consequently degrades streaming quality. This paper proposes a novel flexible super-resolution-based video coding and uploading framework (FlexSRVC) that improves the quality of live video streaming under limited uplink bandwidth. FlexSRVC includes a flexible video coding scheme that compresses high-resolution key and non-key video frames to a lower bitrate in order to reduce the upload delay. A new flexible bitrate adaptation algorithm is also proposed; it dynamically selects the number of frames to be compressed and the compression ratio by jointly considering uplink network conditions and available cloud computing resources. Trace-driven emulations demonstrate that FlexSRVC provides the same quality while reducing the required bandwidth by up to 25% compared to the original encoding method (H.264). In uplink bandwidth-constrained conditions, FlexSRVC improves users' QoE by at least 50% compared to a super-resolution-based method that reconstructs all video frames.
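The joint selection described in the abstract can be pictured with a small sketch. This is not the FlexSRVC algorithm, whose details are not given here; it is a toy illustration under loud assumptions: the function `pick_setting`, the `Choice` record, the group size, the candidate scale factors, and the cost model (a frame downscaled by a factor s costs 1/s² of its full bitrate) are all hypothetical. It only shows the idea of choosing how many frames to compress and by how much, subject to both an uplink budget and a cloud reconstruction budget.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class Choice:
    compressed_frames: int   # frames per group uploaded at reduced resolution
    scale: int               # downscaling factor applied to those frames
    bitrate_kbps: float      # estimated upload bitrate for the group


def pick_setting(uplink_kbps, sr_capacity, full_kbps=4000.0,
                 group_size=8, scales=(1, 2, 4)):
    """Return the least aggressive setting that fits both budgets, or None.

    uplink_kbps : estimated uplink throughput (hypothetical input)
    sr_capacity : frames per group the cloud can super-resolve in time
    Assumed toy cost model: a frame downscaled by s costs 1/s**2 of its
    full-resolution bitrate.
    """
    per_frame = full_kbps / group_size
    feasible = []
    for n, s in product(range(group_size + 1), scales):
        if n > sr_capacity:
            continue  # cloud cannot reconstruct this many frames
        if (n == 0) != (s == 1):
            continue  # scale 1 means "no compression"; keep the pair consistent
        rate = (group_size - n) * per_frame + n * per_frame / (s * s)
        if rate <= uplink_kbps:
            feasible.append(Choice(n, s, rate))
    if not feasible:
        return None
    # Prefer fewer reconstructed frames, then milder downscaling.
    return min(feasible, key=lambda c: (c.compressed_frames, c.scale))
```

With ample uplink the sketch leaves every frame untouched; as the uplink shrinks it compresses more frames, and it returns `None` when the cloud budget is too small to make any setting fit.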


Published In

IEEE Transactions on Multimedia, Volume 25 (2023), 8932 pages

Publisher

IEEE Press
