PSDR-Room: Single Photo to Scene using Differentiable Rendering

Authors:

Thibault Groueix,

Valentin Deschaintre,

Shuang Zhao

SA '23: SIGGRAPH Asia 2023 Conference Papers

Article No.: 28, Pages 1 - 11

https://doi.org/10.1145/3610548.3618165

Published: 11 December 2023

Abstract

A 3D digital scene contains many components: lights, materials, and geometries, which interact to produce the desired appearance. Staging such a scene is time-consuming and requires both artistic and technical skills. In this work, we propose PSDR-Room, a system that optimizes lighting as well as the pose and materials of individual objects to match a target image of a room scene, with minimal user input. To this end, we leverage a recent path-space differentiable rendering approach that provides unbiased gradients of the rendering with respect to geometry, lighting, and procedural materials, allowing us to optimize all of these components using gradient descent to visually match the appearance of the input photo. We use recent single-image scene understanding methods to initialize the optimization and to search for appropriate 3D models and materials. We evaluate our method on real photographs of indoor scenes and demonstrate the editability of the resulting scene components.
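
To make the optimization idea in the abstract concrete, the following is a minimal sketch of matching a target image by gradient descent through a renderer. It is not the PSDR-Room implementation: the "scene" is a hypothetical 1D row of diffuse floor pixels lit by a single point light, the gradients are derived by hand rather than by path-space differentiable rendering, and all names (`render`, `loss_and_grads`, `fit`, `HEIGHT`, `XS`) and the learning rate are assumptions chosen for illustration.

```python
# Toy illustration only: NOT the PSDR-Room implementation.
# A diffuse floor is lit by a point light at a fixed, known height. We recover
# the floor albedo and the light's horizontal position from a target "photo".

HEIGHT = 1.0                               # light height (assumed known)
XS = [-2.0 + 0.1 * i for i in range(41)]   # 1D row of floor pixel positions


def render(albedo, light_x):
    """Shade each pixel as albedo * cos(theta) / r^2 = albedo * h / r^3."""
    return [albedo * HEIGHT / ((x - light_x) ** 2 + HEIGHT ** 2) ** 1.5
            for x in XS]


def loss_and_grads(albedo, light_x, target):
    """Mean squared error and its analytic gradients for both parameters."""
    n = len(XS)
    loss = d_albedo = d_light = 0.0
    for x, t in zip(XS, target):
        r2 = (x - light_x) ** 2 + HEIGHT ** 2
        residual = albedo * HEIGHT / r2 ** 1.5 - t
        loss += residual ** 2 / n
        # d(pixel)/d(albedo) = h / r^3
        d_albedo += 2.0 * residual * (HEIGHT / r2 ** 1.5) / n
        # d(pixel)/d(light_x) = 3 * albedo * h * (x - light_x) / r^5
        d_light += (2.0 * residual
                    * 3.0 * albedo * HEIGHT * (x - light_x) / r2 ** 2.5) / n
    return loss, d_albedo, d_light


def fit(target, albedo=0.5, light_x=0.0, lr=0.5, steps=2000):
    """Plain gradient descent from a rough initial guess."""
    for _ in range(steps):
        _, g_albedo, g_light = loss_and_grads(albedo, light_x, target)
        albedo -= lr * g_albedo
        light_x -= lr * g_light
    return albedo, light_x


# Synthetic target rendered with ground-truth albedo 0.8 and light at x = 0.5.
target = render(0.8, 0.5)
albedo, light_x = fit(target)
print(albedo, light_x)
```

In this noise-free toy the recovered values should approach the ground-truth albedo and light position. A real photograph adds noise, occlusion, and global illumination, which is where unbiased gradients from a physically based differentiable renderer and a good initialization become important.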

Index Terms

PSDR-Room: Single Photo to Scene using Differentiable Rendering
1. Computing methodologies
  1. Computer graphics
    1. Rendering

Information & Contributors

Information

Published In

SA '23: SIGGRAPH Asia 2023 Conference Papers

December 2023

1113 pages

ISBN:9798400703157

DOI:10.1145/3610548

Copyright © 2023 Owner/Author.

This work is licensed under a Creative Commons Attribution International 4.0 License.

Sponsors

SIGGRAPH: ACM Special Interest Group on Computer Graphics and Interactive Techniques

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 11 December 2023

Qualifiers

Research-article
Research
Refereed limited

Funding Sources

NSF 1900927

Conference

SA '23

Sponsor:

SIGGRAPH

SA '23: SIGGRAPH Asia 2023

December 12 - 15, 2023

Sydney, NSW, Australia

Acceptance Rates

Overall Acceptance Rate 178 of 869 submissions, 20%
