Enhancing the AR Experience with Machine Learning Services

Authors: Michael Englert, Marcel Klomann, Yvonne Jung

Web3D '19: Proceedings of the 24th International Conference on 3D Web Technology

Pages 1 - 9

https://doi.org/10.1145/3329714.3338134

Published: 26 July 2019

Abstract

In this paper, we present and evaluate a web service that offers cloud-based machine learning services to improve Augmented Reality (AR) applications on mobile and web clients, with special regard to tracking quality and the registration of complex scenes that require an application-specific coordinate frame. Specifically, our service aims to reduce the camera drift that still occurs in modern AR frameworks and helps with the initial camera alignment in a known scene by estimating the absolute camera pose using a configurable context-based image segmentation in combination with an adaptive image classification. We demonstrate real-world applications that use our web service and evaluate the performance and accuracy of the underlying image segmentation and camera pose estimation. We also discuss the initial configuration, the semi-automatic process of generating training data, and the training of the machine learning models for the corresponding tasks.
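To illustrate the idea of aligning an AR session with an application-specific coordinate frame, the following minimal sketch shows how one absolute camera pose estimate could be combined with the pose reported by the local AR tracker. It is not the authors' implementation: the pose representation (a 3x3 rotation plus a translation, mapping camera coordinates into a frame), the function names, and the example numbers are all assumptions made for illustration, and the call to the web service itself is not shown.

```python
import math

# A pose is a pair (R, t): R is a 3x3 rotation (list of rows), t a 3-vector.
# Assumed convention: a pose maps camera coordinates into some frame.

def compose(a, b):
    """Return the rigid transform that applies b first, then a."""
    (ra, ta), (rb, tb) = a, b
    r = [[sum(ra[i][k] * rb[k][j] for k in range(3)) for j in range(3)]
         for i in range(3)]
    t = [sum(ra[i][k] * tb[k] for k in range(3)) + ta[i] for i in range(3)]
    return r, t

def invert(p):
    """Return the inverse rigid transform: x = R^T (y - t)."""
    r, t = p
    rt = [[r[j][i] for j in range(3)] for i in range(3)]
    ti = [-sum(rt[i][k] * t[k] for k in range(3)) for i in range(3)]
    return rt, ti

def yaw_pose(angle_deg, t):
    """Hypothetical helper: pose rotated about the vertical (y) axis."""
    a = math.radians(angle_deg)
    c, s = math.cos(a), math.sin(a)
    return [[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]], list(t)

def session_to_app(absolute_pose, tracked_pose):
    """Transform from the AR session frame to the application frame.

    absolute_pose: camera pose in the application frame (e.g. estimated remotely).
    tracked_pose:  camera pose in the AR session frame for the same image.
    """
    return compose(absolute_pose, invert(tracked_pose))

def poses_close(p, q, tol=1e-9):
    """Compare two poses entry by entry within a tolerance."""
    (rp, tp), (rq, tq) = p, q
    rot_ok = all(abs(rp[i][j] - rq[i][j]) <= tol
                 for i in range(3) for j in range(3))
    return rot_ok and all(abs(tp[i] - tq[i]) <= tol for i in range(3))

if __name__ == "__main__":
    # Made-up numbers: the tracker pose and an absolute estimate for one frame.
    tracked = yaw_pose(10.0, (0.5, 1.5, 0.0))
    absolute = yaw_pose(100.0, (1.0, 1.5, 1.5))
    alignment = session_to_app(absolute, tracked)
    # Later tracker poses can be mapped into the application frame.
    later = yaw_pose(30.0, (1.0, 1.5, -1.0))
    print(compose(alignment, later))
```

Under these assumptions, recomputing the alignment whenever a fresh absolute estimate arrives is one way accumulated drift could be compensated: the local tracker keeps providing frame-to-frame motion, while the absolute estimate re-anchors it to the known scene.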

Published In

Web3D '19: Proceedings of the 24th International Conference on 3D Web Technology

July 2019, 131 pages

ISBN: 9781450367981

DOI: 10.1145/3329714

Conference Chairs: Nicholas F. Polys (Virginia Tech, U.S.A.), Mike McCann (Monterey Bay Aquarium Research Institute, U.S.A.)

Program Chairs: Feng Liu (Mercer University, U.S.A.), Andreas Plesch (Harvard University, U.S.A.)

Copyright © 2019 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States


Badges

Best Paper

Qualifiers

Research-article
Research
Refereed limited

Conference

Web3D '19: The 24th International Conference on 3D Web Technology

July 26 - 28, 2019

Los Angeles, CA, USA

Acceptance Rates

Overall Acceptance Rate 27 of 71 submissions, 38%
