research-article

Autoscanning for coupled scene reconstruction and proactive object analysis

Published: 02 November 2015

Abstract

Detailed scanning of indoor scenes is tedious for humans. We propose autonomous scene scanning by a robot to relieve humans of this laborious task. In an autonomous setting, detailed scene acquisition is inevitably coupled with scene analysis at the required level of detail. We develop a framework for object-level scene reconstruction coupled with object-centric scene analysis. As a result, autoscanning and reconstruction become object-aware, guided by the object analysis. The analysis, in turn, improves gradually as object-wise data fidelity increases. To realize this framework, we drive the robot to execute an iterative analyze-and-validate algorithm that interleaves object analysis with guided validation.
The object analysis incorporates online learning into a robust graph-cut based segmentation framework, achieving a global update of object-level segmentation based on the knowledge gained from robot-operated local validation. Based on the current analysis, the robot performs proactive validation over the scene with physical push and scan refinement, aiming at reducing the uncertainty of both object-level segmentation and object-wise reconstruction. We propose a joint entropy to measure such uncertainty based on segmentation confidence and reconstruction quality, and formulate the selection of validation actions as a maximum information gain problem. The output of our system is a reconstructed scene with both object extraction and object-wise geometry fidelity.
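
The abstract frames the choice of validation action as a maximum information gain problem over an entropy-based uncertainty measure. The exact joint entropy is not given here, so the following is only a minimal sketch under simplifying assumptions: each candidate action tests one binary hypothesis (for example, "these two patches belong to the same object"), uncertainty is plain Shannon entropy, and the names `binary_entropy`, `expected_information_gain`, `select_action`, and the candidate fields are hypothetical, not the authors' formulation.

```python
import math


def binary_entropy(p):
    """Shannon entropy (in bits) of a binary variable with P(true) = p."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -(p * math.log2(p) + (1.0 - p) * math.log2(1.0 - p))


def expected_information_gain(prior, outcomes):
    """Expected entropy reduction from performing one action.

    prior    -- current probability that the hypothesis is true
    outcomes -- list of (outcome_probability, posterior) pairs, where
                posterior is the hypothesis probability after observing
                that outcome (assumed values, for illustration only)
    """
    expected_posterior_entropy = sum(
        weight * binary_entropy(posterior) for weight, posterior in outcomes
    )
    return binary_entropy(prior) - expected_posterior_entropy


def select_action(candidates):
    """Return the candidate with the largest expected information gain."""
    return max(
        candidates,
        key=lambda c: expected_information_gain(c["prior"], c["outcomes"]),
    )


# Two hypothetical candidates: a push on an ambiguous object boundary and a
# rescan of a region whose segmentation is already fairly confident.
candidates = [
    {"name": "push", "prior": 0.5, "outcomes": [(0.5, 0.95), (0.5, 0.05)]},
    {"name": "rescan", "prior": 0.9, "outcomes": [(0.5, 1.0), (0.5, 0.8)]},
]

best = select_action(candidates)
print(best["name"])
```

In this toy setup the push is preferred because its hypothesis starts at maximal uncertainty and either outcome is nearly decisive, whereas the rescan can only slightly sharpen an already confident estimate.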

Supplementary Material

ZIP File (a177-xu.zip)
Supplemental files.

Published In

ACM Transactions on Graphics, Volume 34, Issue 6
November 2015
944 pages
ISSN: 0730-0301
EISSN: 1557-7368
DOI: 10.1145/2816795
Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. autonomous scanning
  2. object-aware reconstruction
  3. proactive analysis
  4. scene reconstruction
