
Modeling multi-style portrait relief from a single photograph

Published: 27 February 2024

Abstract

This paper extends the method of Zhang et al. (2023) to produce not only portrait bas-reliefs from single photographs, but also high-depth reliefs with plausible depth ordering. We cast this task as a style-aware photo-to-depth translation problem: the input is a photograph conditioned by a style vector, and the output is a portrait relief in the desired depth style. To construct ground-truth data for network training, we first propose an optimization-based method to synthesize high-depth reliefs from 3D portraits. We then train a normal-to-depth network to learn the mapping from normal maps to relief depths, and use the trained network to generate high-depth relief samples from the normal maps provided by Zhang et al. (2023). Because each normal map is pixel-wise aligned with a photograph, we can establish correspondences between photographs and high-depth reliefs. Taking the bas-reliefs of Zhang et al. (2023), the new high-depth reliefs, and their mixtures as target ground truths, we finally train an encoder-decoder network to achieve style-aware relief modeling. Specifically, the network adopts a U-shaped architecture built from Swin Transformer blocks that process hierarchical deep features. Extensive experiments demonstrate the effectiveness of the proposed method, and comparisons with previous works verify its flexibility and state-of-the-art performance.
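The style-conditioning idea described above (a photograph paired with a style vector selecting bas-relief, high-depth, or a mixture) can be illustrated with a toy sketch. The snippet below is not the authors' network; it only shows one common way to inject a style code into an image-to-depth translator, by tiling the vector spatially and concatenating it with the input channels. The names `condition_on_style` and `toy_photo_to_depth` are hypothetical, and a per-pixel channel mix stands in for the real encoder-decoder.

```python
import numpy as np

def condition_on_style(photo, style_vec):
    """Tile a style vector over the image plane and concatenate it
    to the photo's channels: (H, W, C) + (S,) -> (H, W, C + S)."""
    h, w, _ = photo.shape
    style_map = np.broadcast_to(style_vec, (h, w, style_vec.shape[0]))
    return np.concatenate([photo, style_map], axis=-1)

def toy_photo_to_depth(photo, style_vec, weights):
    """A per-pixel channel mix (a 1x1-convolution stand-in for the
    encoder-decoder) mapping the conditioned input to a depth map."""
    x = condition_on_style(photo, style_vec)
    return x @ weights  # (H, W, C+S) @ (C+S,) -> (H, W)

rng = np.random.default_rng(0)
photo = rng.random((64, 64, 3))   # RGB photograph
style = np.array([1.0, 0.0])      # e.g. one-hot: bas-relief vs. high-depth
w = rng.random(5)                 # 3 photo channels + 2 style dims
depth = toy_photo_to_depth(photo, style, w)
print(depth.shape)  # (64, 64)
```

Changing `style` while holding `photo` fixed changes the predicted depth map, which is the behavior the style vector provides in the full network.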


References

[1] Zhang Y.-W., Wu J., Ji Z., Wei M., Zhang C., Computer-assisted relief modelling: A comprehensive survey, Comput. Graph. Forum 38 (2019) 521–534.
[2] Wu J., Martin R.R., Rosin P.L., Sun X.-F., Lai Y.-K., Liu Y.-H., Wallraven C., Use of non-photorealistic rendering and photometric stereo in making bas-reliefs from photographs, Graph. Models 76 (4) (2014) 202–213.
[3] Alexa M., Matusik W., Reliefs as images, ACM Trans. Graph. 29 (4) (2010).
[4] Yang Z., Chen B., Zheng Y., Chen X., Zhou K., Human bas-relief generation from a single photograph, IEEE Trans. Vis. Comput. Graphics 28 (12) (2022) 4558–4569.
[5] Zhang Y.-W., Zhang C., Wang W., Chen Y., Ji Z., Liu H., Portrait relief modeling from a single image, IEEE Trans. Vis. Comput. Graphics 26 (8) (2019) 2659–2670.
[6] Zhang Y.-W., Luo P., Zhou H., Ji Z., Liu H., Chen Y., Zhang C., Neural modeling of portrait bas-relief from a single photograph, IEEE Trans. Vis. Comput. Graphics 29 (12) (2023) 5008–5019.
[7] Liu Z., Lin Y., Cao Y., Hu H., Wei Y., Zhang Z., Lin S., Guo B., Swin Transformer: Hierarchical vision transformer using shifted windows, in: Proceedings of the IEEE/CVF International Conference on Computer Vision, 2021, pp. 10012–10022.
[8] Cignoni P., Montani C., Scopigno R., Computer-assisted generation of bas- and high-reliefs, J. Graphics Tools 2 (3) (1997) 15–28.
[9] Weyrich T., Deng J., Barnes C., Rusinkiewicz S., Finkelstein A., Digital bas-relief from 3D scenes, ACM Trans. Graph. 26 (3) (2007).
[10] Schüller C., Panozzo D., Sorkine-Hornung O., Appearance-mimicking surfaces, ACM Trans. Graph. 33 (6) (2014).
[11] Zhang Y.-W., Zhou Y.-Q., Li X.-L., Liu H., Zhang L.-L., Bas-relief generation and shape editing through gradient-based mesh deformation, IEEE Trans. Vis. Comput. Graphics 21 (3) (2014) 328–338.
[12] Ji Z., Ma W., Sun X., Bas-relief modeling from normal images with intuitive styles, IEEE Trans. Vis. Comput. Graphics 20 (5) (2013) 675–685.
[13] Zhang Y.-W., Qin B.-B., Chen Y., Ji Z., Zhang C., Portrait relief generation from 3D object, Graph. Models 102 (2019) 10–18.
[14] Ji Z., Feng W., Sun X., Qin F., Wang Y., Zhang Y.-W., Ma W., ReliefNet: Fast bas-relief generation from 3D scenes, Comput. Aided Des. 130 (2021).
[15] Su W., Du D., Yang X., Zhou S., Fu H., Interactive sketch-based normal map generation with deep neural networks, in: Proceedings of the ACM on Computer Graphics and Interactive Techniques, Vol. 1, No. 1, ACM, New York, NY, USA, 2018.
[16] Zhang Y.-W., Wang J., Wang W., Chen Y., Liu H., Ji Z., Zhang C., Neural modelling of flower bas-relief from 2D line drawing, Comput. Graph. Forum 40 (2021) 288–303.
[17] Hudon M., Grogan M., Smolic A., et al., Deep normal estimation for automatic shading of hand-drawn characters, in: Proceedings of the European Conference on Computer Vision (ECCV) Workshops, 2018, pp. 246–262.
[18] Zhang Y.-W., Wang J., Long W., Liu H., Zhang C., Chen Y., A fast solution for Chinese calligraphy relief modeling from 2D handwriting image, Vis. Comput. 36 (10) (2020) 2241–2250.
[19] Park T., Liu M.-Y., Wang T.-C., Zhu J.-Y., Semantic image synthesis with spatially-adaptive normalization, in: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2019, pp. 2337–2346.
[20] Choi Y., Uh Y., Yoo J., Ha J.-W., StarGAN v2: Diverse image synthesis for multiple domains, in: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020, pp. 8188–8197.
[21] Xiao T., Liu Y., Zhou B., Jiang Y., Sun J., Unified perceptual parsing for scene understanding, in: Proceedings of the European Conference on Computer Vision, ECCV, 2018, pp. 418–434.
[22] Yu C., Shao Y., Gao C., Sang N., CondNet: Conditional classifier for scene segmentation, IEEE Signal Process. Lett. 28 (2021) 758–762.
[23] Tomei M., Cornia M., Baraldi L., Cucchiara R., Art2Real: Unfolding the reality of artworks via semantically-aware image-to-image translation, in: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2019, pp. 5849–5859.
[24] Chong M.J., Forsyth D., JoJoGAN: One shot face stylization, 2021, arXiv preprint arXiv:2112.11641.
[25] Liu G., Reda F.A., Shih K.J., Wang T.-C., Tao A., Catanzaro B., Image inpainting for irregular holes using partial convolutions, in: Proceedings of the European Conference on Computer Vision, ECCV, 2018, pp. 85–100.
[26] Song Y., Sohl-Dickstein J., Kingma D.P., Kumar A., Ermon S., Poole B., Score-based generative modeling through stochastic differential equations, in: International Conference on Learning Representations, ICLR, 2021.
[27] Luo X., Xie Y., Zhang Y., Qu Y., Li C., Fu Y., LatticeNet: Towards lightweight image super-resolution with lattice block, in: European Conference on Computer Vision, Springer, 2020, pp. 272–289.
[28] Zhang K., Liang J., Van Gool L., Timofte R., Designing a practical degradation model for deep blind image super-resolution, in: Proceedings of the IEEE/CVF International Conference on Computer Vision, 2021, pp. 4791–4800.
[29] Vaswani A., Shazeer N., Parmar N., Uszkoreit J., Jones L., Gomez A.N., Kaiser Ł., Polosukhin I., Attention is all you need, Adv. Neural Inf. Process. Syst. 30 (2017).
[30] Cao H., Wang Y., Chen J., Jiang D., Zhang X., Tian Q., Wang M., Swin-Unet: Unet-like pure transformer for medical image segmentation, in: ECCV Medical Computer Vision Workshop, 2021.
[31] Zhang B., Gu S., Zhang B., Bao J., Chen D., Wen F., Wang Y., Guo B., StyleSwin: Transformer-based GAN for high-resolution image generation, in: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022, pp. 11304–11314.
[32] Liang J., Cao J., Sun G., Zhang K., Van Gool L., Timofte R., SwinIR: Image restoration using Swin Transformer, in: Proceedings of the IEEE/CVF International Conference on Computer Vision Workshops, 2021, pp. 1833–1844.
[33] Li F., Zhang H., Liu S., Zhang L., Ni L.M., Shum H.-Y., et al., Mask DINO: Towards a unified transformer-based framework for object detection and segmentation, 2022, arXiv preprint arXiv:2206.02777.
[34] He K., Zhang X., Ren S., Sun J., Deep residual learning for image recognition, in: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2016, pp. 770–778.
[35] Ronneberger O., Fischer P., Brox T., U-Net: Convolutional networks for biomedical image segmentation, in: International Conference on Medical Image Computing and Computer-Assisted Intervention, Springer, 2015, pp. 234–241.
[36] Kingma D.P., Ba J., Adam: A method for stochastic optimization, 2014, arXiv preprint arXiv:1412.6980.
[37] Fan C.-M., Liu T.-J., Liu K.-H., SUNet: Swin Transformer UNet for image denoising, in: IEEE International Symposium on Circuits and Systems, ISCAS, 2022.
[38] Wang W., Xie E., Li X., Fan D.-P., Song K., Liang D., Lu T., Luo P., Shao L., PVTv2: Improved baselines with pyramid vision transformer, Comput. Vis. Media 8 (3) (2022).
[39] Karras T., Laine S., Aittala M., Hellsten J., Lehtinen J., Aila T., Analyzing and improving the image quality of StyleGAN, in: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020, pp. 8110–8119.
[40] Blinn J.F., Models of light reflection for computer synthesized pictures, in: Proceedings of the 4th Annual Conference on Computer Graphics and Interactive Techniques, 1977, pp. 192–198.
[41] Chai M., Luo L., Sunkavalli K., Carr N., Hadap S., Zhou K., High-quality hair modeling from a single portrait photo, ACM Trans. Graph. 34 (6) (2015) 1–10.
[42] Ke Z., Li K., Zhou Y., Wu Q., Mao X., Yan Q., Lau R.W., Is a green screen really necessary for real-time portrait matting?, 2020, arXiv preprint arXiv:2011.11961.

Cited By

• (2024) Reconstructing, Understanding, and Analyzing Relief Type Cultural Heritage from a Single Old Photo, in: Proceedings of the 32nd ACM International Conference on Multimedia, pp. 7724–7733, https://doi.org/10.1145/3664647.3681612. Online publication date: 28-Oct-2024.


Information

Published In: Graphical Models, Volume 130, Issue C, Dec 2023, 104 pages.

Publisher: Academic Press Professional, Inc., United States.

Author Tags

1. Portrait relief
2. Multi-style modeling
3. Swin Transformer
4. Photo-to-depth translation

Qualifiers

• Research-article