DOI: 10.1145/3083165.3083180
research-article

Fixation Prediction for 360° Video Streaming in Head-Mounted Virtual Reality

Published: 20 June 2017

Abstract

We study the problem of predicting the Fields-of-View (FoVs) of viewers watching 360° videos on commodity Head-Mounted Displays (HMDs). Existing solutions either use the viewer's current orientation to approximate future FoVs, or extrapolate future FoVs from historical orientations using dead-reckoning algorithms. In this paper, we develop fixation prediction networks that concurrently leverage sensor- and content-related features to predict future viewer fixation, which sets them apart from the solutions in the literature. The sensor-related features include HMD orientations, while the content-related features include image saliency maps and motion maps. We build a testbed for streaming 360° video to HMDs, and recruit twenty-five viewers to watch ten 360° videos. We then train and validate two design alternatives of our proposed networks, which allows us to identify the better-performing design and its best parameter settings. Trace-driven simulation results show the merits of our proposed fixation prediction networks over the existing solutions, including: (i) lower consumed bandwidth, (ii) shorter initial buffering time, and (iii) short running time.
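The two families of existing solutions named in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the sample format `(time_s, yaw_deg, pitch_deg)`, the function names, and the constant-angular-velocity model used for dead reckoning are all assumptions made here for illustration.

```python
def wrap_deg(angle):
    """Wrap an angle in degrees into [-180, 180)."""
    return (angle + 180.0) % 360.0 - 180.0


def predict_current(samples, horizon_s):
    """Baseline 1 (assumed form): the viewer keeps the latest orientation."""
    _, yaw, pitch = samples[-1]
    return yaw, pitch


def predict_dead_reckoning(samples, horizon_s):
    """Baseline 2 (assumed form): linear extrapolation from the last two samples.

    Each sample is a hypothetical (time_s, yaw_deg, pitch_deg) tuple.
    """
    (t0, yaw0, pitch0), (t1, yaw1, pitch1) = samples[-2], samples[-1]
    dt = t1 - t0
    # Take the shortest angular path so a 179 -> -179 step counts as +2 degrees.
    yaw_rate = wrap_deg(yaw1 - yaw0) / dt
    pitch_rate = (pitch1 - pitch0) / dt
    yaw = wrap_deg(yaw1 + yaw_rate * horizon_s)
    # Pitch cannot leave the [-90, 90] range, so clamp instead of wrapping.
    pitch = max(-90.0, min(90.0, pitch1 + pitch_rate * horizon_s))
    return yaw, pitch


if __name__ == "__main__":
    trace = [(0.0, 170.0, 0.0), (0.5, 175.0, 2.0)]
    print(predict_current(trace, 1.0))         # (175.0, 2.0)
    print(predict_dead_reckoning(trace, 1.0))  # (-175.0, 6.0)
```

Neither baseline looks at the video itself. The abstract's proposed networks differ in that they also take content-related features (saliency and motion maps) as input.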


Published In

NOSSDAV'17: Proceedings of the 27th Workshop on Network and Operating Systems Support for Digital Audio and Video
June 2017
105 pages
ISBN:9781450350037
DOI:10.1145/3083165

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. 360° video
  2. HMD
  3. prediction
  4. virtual reality

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Conference

MMSys'17
MMSys'17: Multimedia Systems Conference 2017
June 20 - 23, 2017
Taipei, Taiwan

Acceptance Rates

NOSSDAV'17 Paper Acceptance Rate 15 of 40 submissions, 38%;
Overall Acceptance Rate 118 of 363 submissions, 33%
