BundleFusion: Real-Time Globally Consistent 3D Reconstruction Using On-the-Fly Surface Reintegration

Published: 01 May 2017

Abstract

Real-time, high-quality 3D scanning of large-scale scenes is key to mixed reality and robotic applications. However, scalability brings challenges of drift in pose estimation, introducing significant errors in the accumulated model. Approaches often require hours of offline processing to globally correct model errors. Recent online methods demonstrate compelling results but suffer from (1) needing minutes to perform online correction, preventing true real-time use; (2) brittle frame-to-frame (or frame-to-model) pose estimation, resulting in many tracking failures; or (3) supporting only unstructured point-based representations, which limit scan quality and applicability. We systematically address these issues with a novel, real-time, end-to-end reconstruction framework. At its core is a robust pose estimation strategy, optimizing per frame for a global set of camera poses by considering the complete history of RGB-D input with an efficient hierarchical approach. We remove the heavy reliance on temporal tracking and continually localize to the globally optimized frames instead. We contribute a parallelizable optimization framework, which employs correspondences based on sparse features and dense geometric and photometric matching. Our approach estimates globally optimized (i.e., bundle adjusted) poses in real time, supports robust tracking with recovery from gross tracking failures (i.e., relocalization), and re-estimates the 3D model in real time to ensure global consistency, all within a single framework. Our approach outperforms state-of-the-art online systems with quality on par with offline methods, but with unprecedented speed and scan completeness. Our framework leads to a comprehensive online scanning solution for large indoor environments, enabling ease of use and high-quality results.
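
The abstract does not spell out how the 3D model is re-estimated when poses change. As a minimal sketch only, assume each voxel stores a weighted running average of signed-distance samples, a common choice in volumetric fusion. Under that assumption, a frame whose pose has been corrected can be removed from the average and then added back with the distance observed under its new pose. The class name `Voxel` and the methods `integrate` and `deintegrate` are hypothetical names chosen for illustration; they are not taken from the paper or its software.

```python
from dataclasses import dataclass


@dataclass
class Voxel:
    """Hypothetical voxel holding a weighted running average of distances."""

    sdf: float = 0.0     # current averaged signed distance
    weight: float = 0.0  # accumulated integration weight

    def integrate(self, d: float, w: float = 1.0) -> None:
        """Fold a distance sample d with weight w into the running average."""
        total = self.weight + w
        self.sdf = (self.sdf * self.weight + d * w) / total
        self.weight = total

    def deintegrate(self, d: float, w: float = 1.0) -> None:
        """Remove a previously integrated sample (the inverse of integrate)."""
        total = self.weight - w
        if total <= 1e-9:
            # Nothing left in this voxel: reset it to the empty state.
            self.sdf, self.weight = 0.0, 0.0
            return
        self.sdf = (self.sdf * self.weight - d * w) / total
        self.weight = total
```

For example, if a voxel has integrated samples 0.10 and 0.20 and the first frame's pose is later corrected so that its sample becomes 0.14, calling `deintegrate(0.10)` followed by `integrate(0.14)` leaves the voxel at the average of 0.20 and 0.14 without reprocessing any other frame.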

Supplementary Material

Supplemental movie, appendix, image, and software files for BundleFusion: Real-Time Globally Consistent 3D Reconstruction Using On-the-Fly Surface Reintegration (tog-13.jpg, dai.zip, tog-13.mp4)

    Published In

    ACM Transactions on Graphics, Volume 36, Issue 3
    June 2017
    165 pages
    ISSN: 0730-0301
    EISSN: 1557-7368
    DOI: 10.1145/3087678

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 01 May 2017
    Accepted: 01 January 2017
    Revised: 01 December 2016
    Received: 01 April 2016
    Published in TOG Volume 36, Issue 3

    Author Tags

    1. RGB-D
    2. global consistency
    3. real-time
    4. scalable
    5. scan

    Qualifiers

    • Research-article
    • Research
    • Refereed
