Look at Me! Correcting Eye Gaze in Live Video Communication

Published: 05 June 2019

Abstract

Although live video communication is widely used, it is generally less engaging than face-to-face communication because of limitations on social, emotional, and haptic feedback. Missing eye contact is one such problem, caused by the physical offset between the screen and the camera on a device. Manipulating video frames to correct eye gaze is one solution. In this article, we introduce a system that rotates the eyeballs of a local participant before the video frame is sent to the remote side. It adopts a warping-based convolutional neural network to relocate pixels in the eye regions. To improve visual quality, we minimize the L2 distance between the ground truths and the warped eyes. We also present several newly designed loss functions to help network training. These loss functions are designed to preserve the shape of eye structures and to minimize color changes around the periphery of the eye regions. To evaluate the presented network and loss functions, we compared results generated by our system and by the state-of-the-art method, DeepWarp, both objectively and subjectively on two datasets. The experimental results demonstrated the effectiveness of our system. In addition, we showed that our system can perform eye-gaze correction in real time on a consumer-level laptop. Because of its quality and efficiency, gaze correction by postprocessing through this system is a feasible solution to the problem of missing eye contact in video communication.
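The abstract does not spell out the network's details, so the following is only a minimal sketch of the two ideas it names: relocating pixels by warping, and scoring the result with an L2 distance. It assumes the network's output can be treated as a per-pixel flow field of (dx, dy) offsets; the function names `warp_bilinear` and `l2_loss`, the flow convention, and the use of the mean squared difference as the L2 measure are all illustrative assumptions, not the authors' implementation.

```python
import numpy as np


def warp_bilinear(image, flow):
    """Resample a float image of shape (H, W, C) using a flow of shape (H, W, 2).

    Assumed convention: flow[y, x] = (dx, dy), and the output pixel at (x, y)
    is read from the source position (x + dx, y + dy) with bilinear
    interpolation. Positions outside the image are clamped to the border.
    """
    h, w = image.shape[:2]
    ys, xs = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")
    sx = np.clip(xs + flow[..., 0], 0, w - 1)
    sy = np.clip(ys + flow[..., 1], 0, h - 1)
    x0 = np.floor(sx).astype(int)
    y0 = np.floor(sy).astype(int)
    x1 = np.minimum(x0 + 1, w - 1)
    y1 = np.minimum(y0 + 1, h - 1)
    wx = (sx - x0)[..., None]  # horizontal interpolation weight
    wy = (sy - y0)[..., None]  # vertical interpolation weight
    top = image[y0, x0] * (1 - wx) + image[y0, x1] * wx
    bottom = image[y1, x0] * (1 - wx) + image[y1, x1] * wx
    return top * (1 - wy) + bottom * wy


def l2_loss(warped, target):
    """Mean squared difference between a warped eye patch and its ground truth."""
    return float(np.mean((warped - target) ** 2))


# Hypothetical toy data standing in for an eye patch.
rng = np.random.default_rng(0)
eye = rng.random((4, 6, 3))

# A zero flow leaves the patch unchanged, so the loss against itself is zero.
identity = warp_bilinear(eye, np.zeros((4, 6, 2)))

# A flow of dx = 1 makes every output pixel read from its right-hand neighbor.
shift = np.zeros((4, 6, 2))
shift[..., 0] = 1.0
shifted = warp_bilinear(eye, shift)
```

In a trainable setting the flow would be predicted by the network and the warp would need to be differentiable; the NumPy version here only illustrates the resampling and the loss.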

References

[1]
T. Banerjee. Webinar & Webcast Market Size, Trends & Analysis--Forecasts To 2025. Retrieved from https://medium.com/@banerjee.treesha/webinar-webcast-market-size-trends-analysis-forecasts-to-2025-1877a838ce39.
[2]
P. S. N. Lee, L. Leung, V. Lo, C. Xiong, and T. Wu. 2011. Internet communication versus face-to-face interaction in quality of life. Soc. Indicat. Res. 100, 3 (01 Feb. 2011), 375--389.
[3]
The Late Late Show with James Corden. 2017. Harry Styles video chats with James Corden. Retrieved from https://www.youtube.com/watch?v=H7ZjRna4ZK4.
[4]
Y. Ganin, D. Kononenko, D. Sungatullina, and V. Lempitsky. 2016. DeepWarp: Photorealistic Image Resynthesis for Gaze Manipulation. Springer International Publishing, 311--326.
[5]
G. Huang, Z. Liu, and K. Q. Weinberger. 2017. Densely connected convolutional networks. In Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR’17). 2261--2269. arxiv:1608.06993 http://arxiv.org/abs/1608.06993
[6]
R. Yang and Z. Zhang. 2001. Eye Gaze Correction with Stereovision for Video-Teleconferencing. Technical Report. Microsoft. Retrieved from https://www.microsoft.com/en-us/research/publication/eye-gaze-correction-with-stereovision-for-video-teleconferencing/.
[7]
A. Criminisi, J. Shotton, A. Blake, and P. H. S. Torr. 2003. Gaze manipulation for one-to-one teleconferencing. In Proceedings 9th IEEE International Conference on Computer Vision, Vol. 1. 191--198.
[8]
L. Wolf, Z. Freund, and S. Avidan. 2010. An eye for an eye: A single camera gaze-replacement method. In Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition. 817--824.
[9]
F. Solina and R. Ravnik. 2011. Fixing missing eye-contact in video conferencing systems. In Proceedings of the 33rd International Conference on Information Technology Interfaces (ITI’11). 233--236.
[10]
J. Gemmell, K. Toyama, C. L. Zitnick, T. Kang, and S. Seitz. 2000. Gaze awareness for video-conferencing: A software approach. IEEE Multimedia 7, 4 (2000), 26--35.
[11]
D. Giger, J. C. Bazin, C. Kuster, T. Popa, and M. Gross. 2014. Gaze correction with a single webcam. In Proceedings of the 2014 IEEE International Conference on Multimedia and Expo (ICME’14). 1--6.
[12]
A. Jaklič, F. Solina, and L. Šajn. 2017. User interface for a better eye contact in videoconferencing. Displays 46 (2017), 25--36.
[13]
L. S. Bohannon, A. M. Herbert, J. B. Pelz, and E. M. Rantanen. 2013. Eye contact and video-mediated communication: A review. Displays 34, 2 (2013), 177--185.
[14]
E. T. Baek and Y. S. Ho. 2017. Gaze correction using feature-based view morphing and performance evaluation. Signal Image Vid. Process. 11, 1 (2017), 187--194.
[15]
G. Doherty-Sneddon, A. Anderson, C. O’Malley, S. Langton, S. Garrod, and V. Bruce. 1997. Face-to-face and video-mediated communication: A comparison of dialogue structure and task performance. J. Exp. Psychol. Appl. 3, 2 (1997), 105--125.
[16]
E. M. Tapia, S. S. Intille, J. R. Rebula, and S. Stoddard. 2003. Concept and partial prototype video: Ubiquitous video communication with the perception of eye contact. In Proceedings of the UBICOMP 2003 Video Program.
[17]
A. Jones, M. Lang, G. Fyffe, X. Yu, J. Busch, I. McDowall, M. Bolas, and P. Debevec. 2009. Achieving eye contact in a one-to-many 3D video teleconferencing system. ACM Trans. Graph. 28, 3 (2009), 64:1--64:8.
[18]
B. M. Rappoport, C. J. Stringer, F. R. Rothkopf, J. C. Franklin, J. P. Ternus, J. C. Hoenig, R. P. Howarth, S. A. Myers, and S. B. Lynch. 2016. Devices and methods for providing access to internal component. United States Patent US20160358543A1, 2016.
[19]
T. Ogita, S. Takanashi, and S. Takatsuka. 2012. Sensor-equipped display apparatus and electronic apparatus. United States Patent US20120069042A1, 2012.
[20]
M. Dumont, S. Rogmans, S. Maesen, and P. Bekaert. 2009. Optimized two-party video chat with restored eye contact using graphics hardware. In e-Business and Telecommunications, Joaquim Filipe and Mohammad S. Obaidat (Eds.). Springer, Berlin, 358--372.
[21]
C. Kuster, T. Popa, J. C. Bazin, C. Gotsman, and M. Gross. 2012. Gaze correction for home video conferencing. ACM Trans. Graph. 31, 6 (2012), 174:1--174:6.
[22]
D. Weiner and N. Kiryati. 2003. Virtual gaze redirection in face images. In Proceedings of the 12th International Conference on Image Analysis and Processing. 76--81.
[23]
Y. Qin, K. C. Lien, M. Turk, and T. Höllerer. 2015. Eye Gaze Correction with a Single Webcam Based on Eye-Replacement. Springer International Publishing, Cham, 599--609.
[24]
Z. Shu, E. Shechtman, D. Samaras, and S. Hadap. 2016. EyeOpener: Editing eyes in the wild. ACM Trans. Graph. 36, 1 (2016).
[25]
E. Wood, T. Baltrušaitis, L. P. Morency, P. Robinson, and A. Bulling. 2018. GazeDirector: Fully articulated eye gaze redirection in video. Eurographics 37, 2 (2018), 217--225.
[26]
D. A. Forsyth and J. Ponce. 2002. Computer Vision: A Modern Approach. Prentice Hall Professional.
[27]
N. A. Dodgson. 2004. Variation and extrema of human interpupillary distance. In Stereoscopic Displays and Virtual Reality Systems XI, Andrew J. Woods, John O. Merritt, Stephen A. Benton, and Mark T. Bolas (Eds.), Vol. 5291. SPIE, 19--22.
[28]
D. E. King. 2009. Dlib-ml: A machine learning toolkit. J. Mach. Learn. Res. 10 (2009), 1755--1758. https://dl.acm.org/citation.cfm?id=1755843
[29]
V. Kazemi and J. Sullivan. 2014. One millisecond face alignment with an ensemble of regression trees. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 1867--1874.
[30]
B. Xu, N. Wang, T. Chen, and M. Li. 2015. Empirical evaluation of rectified activations in convolutional network. In Proceedings of the ICML Deep Learning Workshop (2015). 06--11. arxiv:1505.00853 http://arxiv.org/abs/1505.00853
[31]
S. Ioffe and C. Szegedy. 2015. Batch normalization: Accelerating deep network training by reducing internal covariate shift. In Proceedings of the 32nd International Conference on Machine Learning, Vol. 37. 448--456. http://dl.acm.org/citation.cfm?id=3045118.3045167
[32]
M. Abadi, A. Agarwal, P. Barham, E. Brevdo, Z. Chen, C. Citro, G. S. Corrado, A. Davis, J. Dean, M. Devin, S. Ghemawat, I. Goodfellow, A. Harp, G. Irving, M. Isard, Y. Jia, R. Jozefowicz, L. Kaiser, M. Kudlur, J. Levenberg, D. Mané, R. Monga, S. Moore, D. Murray, C. Olah, M. Schuster, J. Shlens, B. Steiner, I. Sutskever, K. Talwar, P. Tucker, V. Vanhoucke, V. Vasudevan, F. Viégas, O. Vinyals, P. Warden, M. Wattenberg, M. Wicke, Y. Yu, and X. Zheng. 2015. TensorFlow: Large-Scale Machine Learning on Heterogeneous Systems. Retrieved from https://www.tensorflow.org/.
[33]
D. P. Kingma and J. Ba. 2015. Adam: A method for stochastic optimization. In Proceedings of the International Conference on Learning Representations (2015). arxiv:1412.6980 http://arxiv.org/abs/1412.6980
[34]
B. A. Smith, Q. Yin, S. K. Feiner, and S. K. Nayar. 2013. Gaze locking: Passive eye contact detection for human--object interaction. In Proceedings of the ACM Symposium on User Interface Software and Technology (UIST’13). 271--280.

Published In

ACM Transactions on Multimedia Computing, Communications, and Applications, Volume 15, Issue 2
May 2019
375 pages
ISSN:1551-6857
EISSN:1551-6865
DOI:10.1145/3339884
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 05 June 2019
Accepted: 01 February 2019
Revised: 01 January 2019
Received: 01 September 2018
Published in TOMM Volume 15, Issue 2

Author Tags

  1. Eye contact
  2. convolutional neural network
  3. gaze correction
  4. image processing
  5. live video communication

Qualifiers

  • Research-article
  • Research
  • Refereed
