Research Article

Depth-map completion for large indoor scene reconstruction

Published: 01 March 2020

Highlights

Propose a new depth-completion algorithm for MVS depth-maps.
Use occlusion boundaries to solve the depth discontinuity problem (see the sketch after this list).
Propose an iterative filtering and completion method for large indoor scene reconstruction.
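
A plausible reading of the occlusion-boundary highlight is that hole filling should never propagate depth across a boundary, so completed surfaces stay crisp at depth discontinuities. The following is a minimal sketch of that idea, assuming a precomputed boolean boundary mask (in the paper's setting this would come from a learned occlusion-boundary detector); the nearest-neighbor propagation rule below is an illustrative assumption, not the authors' exact scheme.

import numpy as np

def boundary_aware_fill(depth, boundary, max_iters=100):
    """Propagate valid depths into holes (zeros), never across boundary pixels.

    depth:    2D float array, 0 marks missing depth.
    boundary: 2D bool array, True marks occlusion-boundary pixels.
    Image-border wrap-around from np.roll is ignored for brevity.
    """
    d = depth.copy()
    for _ in range(max_iters):
        holes = (d == 0) & ~boundary  # unfilled pixels that may be completed
        if not holes.any():
            break
        changed = False
        # 4-neighbor propagation: copy depth from a valid, non-boundary neighbor.
        for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            shifted = np.roll(d, (dy, dx), axis=(0, 1))
            src_ok = np.roll((d > 0) & ~boundary, (dy, dx), axis=(0, 1))
            fill = holes & (d == 0) & src_ok
            if fill.any():
                d[fill] = shifted[fill]
                changed = True
        if not changed:  # remaining holes are sealed off by boundaries
            break
    return d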

Abstract

Traditional Multi-View Stereo (MVS) algorithms often struggle with large-scale indoor scene reconstruction because photo-consistency measurements are unreliable in weakly textured regions, which are common in indoor scenes. To address this limitation, in this paper we propose a point cloud completion strategy that combines learning-based depth-map completion with geometry-based consistency filtering to fill large missing areas in depth-maps. The proposed method takes nonuniform, noisy MVS depth-maps as input and completes each depth-map individually. In the completion process, we first complete each depth-map using a learning-based method, and then filter it by validating its depth consistency against neighboring depth-maps. The depth-map completion and geometric filtering steps are performed iteratively until the number of depth points converges. Experiments on large-scale indoor scenes and benchmark MVS datasets demonstrate the effectiveness of the proposed method.
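The abstract describes an alternating loop: complete each depth-map with a learned model, then discard points that fail cross-view depth consistency, and repeat until the total point count stabilizes. Below is a minimal sketch of that loop, assuming hypothetical stand-ins: learned_complete replaces the trained completion network (here it median-fills holes so the sketch runs end to end), and reproject_depth replaces camera-calibrated warping of a neighbor's depth-map into the reference view (here an identity mapping). The relative-error threshold, vote count, and convergence tolerance are illustrative assumptions, not the paper's parameters.

import numpy as np

def learned_complete(depth):
    """Stand-in for a learning-based depth completion network.

    Fills holes (zeros) with the median of valid depths so the sketch runs;
    a real system would run a trained CNN here.
    """
    filled = depth.copy()
    valid = depth > 0
    if valid.any():
        filled[~valid] = np.median(depth[valid])
    return filled

def reproject_depth(depth_src, depth_ref):
    """Stand-in for warping a neighboring depth-map into the reference view
    with the calibrated cameras; the sketch assumes already-aligned views."""
    return depth_src

def consistency_filter(depth_ref, neighbor_depths, rel_thresh=0.01, min_views=2):
    """Keep a pixel only if enough neighboring views agree on its depth."""
    votes = np.zeros(depth_ref.shape, dtype=int)
    for d_nb in neighbor_depths:
        warped = reproject_depth(d_nb, depth_ref)
        ok = (warped > 0) & (depth_ref > 0)
        agree = ok & (np.abs(warped - depth_ref) <= rel_thresh * depth_ref)
        votes += agree.astype(int)
    out = depth_ref.copy()
    out[votes < min_views] = 0.0  # drop unsupported or inconsistent points
    return out

def iterative_completion(depth_maps, max_iters=10, tol=0.001):
    """Alternate completion and filtering until the point count converges."""
    prev_count = sum(int((d > 0).sum()) for d in depth_maps)
    for _ in range(max_iters):
        completed = [learned_complete(d) for d in depth_maps]
        depth_maps = [
            consistency_filter(d, completed[:i] + completed[i + 1:])
            for i, d in enumerate(completed)
        ]
        count = sum(int((d > 0).sum()) for d in depth_maps)
        if abs(count - prev_count) <= tol * max(prev_count, 1):
            break  # number of depth points has converged
        prev_count = count
    return depth_maps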



Published In

Pattern Recognition, Volume 99, Issue C, March 2020, 162 pages

Publisher

Elsevier Science Inc., United States


Author Tags

1. Depth completion
2. MVS
3. 3D reconstruction
4. Point cloud
