Multi-View Scene Capture by Surfel Sampling: From Video Streams to Non-Rigid 3D Motion, Shape and Reflectance

Published: 01 September 2002

Abstract

In this paper we study the problem of recovering the 3D shape, reflectance, and non-rigid motion properties of a dynamic 3D scene. Because these properties are completely unknown and because the scene's shape and motion may be non-smooth, our approach uses multiple views to build a piecewise-continuous geometric and radiometric representation of the scene's trace in space-time. A basic primitive of this representation is the dynamic surfel, which (1) encodes the instantaneous local shape, reflectance, and motion of a small and bounded region in the scene, and (2) enables accurate prediction of the region's dynamic appearance under known illumination conditions. We show that complete surfel-based reconstructions can be created by repeatedly applying an algorithm called Surfel Sampling that combines sampling and parameter estimation to fit a single surfel to a small, bounded region of space-time. Experimental results with the Phong reflectance model and complex real scenes (clothing, shiny objects, skin) illustrate our method's ability to explain pixels and pixel variations in terms of their underlying causes—shape, reflectance, motion, illumination, and visibility.
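As a rough illustration of what a dynamic surfel might carry, the sketch below bundles a position, a normal, a velocity, and Phong coefficients, and predicts the intensity a camera would see under a known point light. The abstract gives no implementation details, so this is not the authors' parameterization or algorithm. All names here (`Surfel`, `phong_intensity`, `k_d`, `k_s`, `shininess`) are hypothetical, the motion model is reduced to pure translation, and the shading is the textbook Phong formula without an ambient term.

```python
import math
from dataclasses import dataclass
from typing import Tuple

Vec = Tuple[float, float, float]


def sub(a: Vec, b: Vec) -> Vec:
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def scale(a: Vec, s: float) -> Vec:
    return (a[0] * s, a[1] * s, a[2] * s)


def dot(a: Vec, b: Vec) -> float:
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def normalize(a: Vec) -> Vec:
    length = math.sqrt(dot(a, a))
    return scale(a, 1.0 / length)


@dataclass
class Surfel:
    """Hypothetical container: local shape, motion, and reflectance."""
    position: Vec      # center of the small surface patch
    normal: Vec        # patch orientation
    velocity: Vec      # assumed constant 3D velocity (translation only)
    k_d: float         # Phong diffuse coefficient
    k_s: float         # Phong specular coefficient
    shininess: float   # Phong specular exponent

    def at_time(self, dt: float) -> "Surfel":
        """Return the surfel translated by velocity * dt."""
        p = self.position
        v = self.velocity
        moved = (p[0] + dt * v[0], p[1] + dt * v[1], p[2] + dt * v[2])
        return Surfel(moved, self.normal, self.velocity,
                      self.k_d, self.k_s, self.shininess)


def phong_intensity(s: Surfel, light_pos: Vec, camera_pos: Vec,
                    light_intensity: float = 1.0) -> float:
    """Predict observed intensity with the textbook Phong model."""
    n = normalize(s.normal)
    l = normalize(sub(light_pos, s.position))
    v = normalize(sub(camera_pos, s.position))
    n_dot_l = dot(n, l)
    if n_dot_l <= 0.0:
        return 0.0  # light is behind the patch
    r = sub(scale(n, 2.0 * n_dot_l), l)  # mirror direction of the light
    specular = max(dot(r, v), 0.0) ** s.shininess
    return light_intensity * (s.k_d * n_dot_l + s.k_s * specular)
```

A fitting procedure in this spirit would compare such predicted intensities against the pixels observed in each view and adjust the surfel parameters to reduce the difference. The sampling and estimation steps themselves are left out, since the abstract does not describe them.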


Published In

International Journal of Computer Vision, Volume 49, Issue 2-3
September-October 2002
146 pages

Publisher

Kluwer Academic Publishers

United States

Author Tags

  1. 3D motion capture
  2. 3D reconstruction
  3. Phong reflectance model
  4. deformation analysis
  5. direct estimation methods
  6. illumination modeling
  7. image warping
  8. motion analysis
  9. multi-view motion estimation
  10. multi-view stereo
  11. multiple-view geometry
  12. reflectance modeling
  13. space carving
  14. stereoscopic vision

Qualifiers

  • Article
