Geometry Sampling-Based Adaption to DCGAN for 3D Face Generation †
Abstract
:1. Introduction
- First, we propose two new sampling methods to represent 3D faces as a structured matrix, which enables us to generate 3D shapes with deep networks. Specifically, they are the geometry sampling method and the depth map-like sampling method.
- Second, we present a straight-forward unsupervised 3D face generative model, which does not require any pre-processing steps such as the extraction of facial feature points or pre-computing the correspondences.
2. Related Work
3. Geometry Sampling and 3D Face Generation
3.1. Geometry Sampling Based on the Intersection
- Iso-geodesic curves. Given the detected nose tip , an iso-geodesic curve contains a sequence of the vertices on the face surface that have the same geodesic distance to the nose-tip. We denote an iso-geodesic curve as follows:
- Radial curves. We first align the 3D face model to the plane. Then, we provide the nose tip , a radial curve contains a sequence of the vertices whose projections on the plane have the same angle as the axis, i.e., . We denote a radial curve as follows:
- Geometry sampling based on the intersections. Our objective with the geometry sampling step is to sample the vertices on a 3D face and save them into a squared matrix of the size , where R denotes the total number of the iso-geodesic curves. In order to achieve averaged sampling, we compute the iso-geodesic curves with the linearly increased d, i.e.,
- (a)
- First, starting from the detected nose tip , we assign it to the center of the sampling matrix, i.e., .
- (b)
- Then, we compute the r-th iso-geodesic curve, and radial curves .
- (c)
- After that, by computing the intersections between the k-th iso-geodesic curve and the newly computed radial curves, we obtain intersected vertices in order, which can be stored into the k-th ring within the sampling matrix .
- (d)
- By repeating the steps (b)-(c) until , we can obtain the full sampling matrix M.
3.2. Geometric Sampling of the Depth-like Map
3.3. 3D Face Generation via DCGAN
- Apply the transposed convolutions for G and the stride convolutions instead of the pooling layers.
- Apply the fully Convolutional Networks instead of the fully connected hidden layers.
- Apply the batch normalization [40] in both G and D.
4. Results and Discussion
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
- Richardson, E.; Sela, M.; Kimmel, R. 3d face reconstruction by learning from synthetic data. In Proceedings of the 2016 Fourth International Conference on 3D Vision (3DV), Stanford, CA, USA, 25–28 October 2016; pp. 460–469. [Google Scholar]
- Gilani, S.Z.; Mian, A. Learning from millions of 3d scans for large-scale 3d face recognition. In Proceedings of the IEEE Conference of Computer Vision and Pattern Recognition (CVPR), Salt Lake City, UT, USA, 18–23 June 2018; pp. 1896–1905. [Google Scholar]
- Luo, G.; Zhao, X.; Tong, Y.; Chen, Q.; Zhu, Z.; Lei, H.; Lin, J. Geometry Sampling for 3D Face Generation via DCGAN. In Proceedings of the 2020 International Joint Conference on Neural Networks (IJCNN), Glasgow, UK, 19–24 July 2020; pp. 1–7. [Google Scholar]
- Luan, S.; Chen, C.; Zhang, B.; Han, J.; Liu, J. Gabor convolutional networks. IEEE Trans. Image Process. 2018, 27, 4357–4366. [Google Scholar]
- Minaee, S.; Liang, X.; Yan, S. Modern augmented reality: Applications, trends, and future directions. arXiv 2022, arXiv:2202.09450. [Google Scholar]
- Dünser, A.; Hornecker, E. Lessons from an ar book study. In Proceedings of the 1st International Conference on Tangible and Embedded Interaction, Yokohama, Japan, 20–23 March 2007; pp. 179–182. [Google Scholar]
- Chen, C.-H.; Lee, I.-J.; Lin, L.-Y. Augmented reality-based self-facial modeling to promote the emotional expression and social skills of adolescents with autism spectrum disorders. Res. Dev. Disabil. 2015, 36, 396–403. [Google Scholar] [CrossRef] [PubMed]
- Avinash, P.; Sharma, M. Predicting forward & backward facial depth maps from a single rgb image for mobile 3d ar application. In Proceedings of the 2019 International Conference on 3D Immersion (IC3D), Brussels, Belgium, 12–12 December 2019; pp. 1–8. [Google Scholar]
- Izadi, S.; Kim, D.; Hilliges, O.; Molyneaux, D.; Newcombe, R.; Kohli, P.; Shotton, J.; Hodges, S.; Freeman, D.; Davison, A.; et al. Kinectfusion: Real-time 3d reconstruction and interaction using a moving depth camera. In Proceedings of the 24th Annual ACM Symposium on User Interface Software and Technology, Santa Barbara, CA, USA, 16–19 October 2011; pp. 559–568. [Google Scholar]
- Song, S.; Lichtenberg, S.P.; Xiao, J. Sun rgb-d: A rgb-d scene understanding benchmark suite. In Proceedings of the 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Boston, MA, USA, 7–12 June 2015; pp. 567–576. [Google Scholar]
- Handa, A.; Whelan, T.; McDonald, J.; Davison, A.J. A benchmark for rgb-d visual odometry, 3d reconstruction and slam. In Proceedings of the 2014 IEEE International Conference on Robotics and Automation (ICRA), Hong Kong, China, 31 May–7 June 2014; pp. 1524–1531. [Google Scholar]
- Kemelmacher-Shlizerman, I.; Basri, R. 3d face reconstruction from a single image using a single reference face shape. IEEE Trans. Pattern Anal. Mach. Intell. 2011, 33, 394–4053. [Google Scholar] [PubMed]
- Zhu, Y.; Mottaghi, R.; Kolve, E.; Lim, J.J.; Gupta, A.; Fei-Fei, L.; Farhadi, A. Target-driven visual navigation in indoor scenes using deep reinforcement learning. In Proceedings of the 2017 IEEE International Conference on Robotics and Automation (ICRA), Singapore, 29 May–3 June 2017; pp. 3357–3364. [Google Scholar]
- Choy, C.B.; Xu, D.; Gwak, J.; Chen, K.; Savarese, S. 3d-r2n2: A unified approach for single and multi-view 3d object reconstruction. In Proceedings of the European Conference on Computer Vision, Amsterdam, The Netherlands, 8–16 October 2016; pp. 628–644. [Google Scholar]
- Fan, H.; Su, H.; Guibas, L. A point set generation network for 3d object reconstruction from a single image. In Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA, 21–26 July 2017; pp. 2463–2471. [Google Scholar]
- Garrido, P.; Zollhfer, M.; Casas, D.; Valgaerts, L.; Varanasi, K.; Prez, P.; Theobalt, C. Reconstruction of personalized 3d face rigs from monocular video. ACM Trans. Graph. 2016, 35, 28. [Google Scholar]
- Deng, Y.; Yang, J.; Xu, S.; Chen, D.; Jia, Y.; Tong, X. Accurate 3d face reconstruction with weakly-supervised learning: From single image to image set. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA, 15–20 June 2019. [Google Scholar]
- Karras, T.; Laine, S.; Aittala, M.; Hellsten, J.; Lehtinen, J.; Aila, T. Analyzing and improving the image quality of stylegan. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA, 13–19 June 2020; pp. 8110–8119. [Google Scholar]
- Lu, Y.; Tai, Y.-W.; Tang, C.-K. Attribute-guided face generation using conditional cyclegan. In Proceedings of the European Conference on Computer Vision (ECCV), Munich, Germany, 8–14 September 2018; pp. 282–297. [Google Scholar]
- Karras, T.; Laine, S.; Aila, T. A style-based generator architecture for generative adversarial networks. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA, 15–20 June 2019; pp. 4401–4410. [Google Scholar]
- Soltanpour, S.; Boufama, B.; Wu, Q.J. A survey of local feature methods for 3d face recognition. Pattern Recognit. 2017, 72, 391–406. [Google Scholar]
- Yamauchi, H.; Gumhold, S.; Zayer, R.; Seidel, H.-P. Mesh segmentation driven by gaussian curvature. Vis. Comput. 2005, 21, 659–668. [Google Scholar]
- Drira, H.; Amor, B.B.; Daoudi, M.; Srivastava, A. Pose and expression-invariant 3d face recognition using elastic radial curves. In Proceedings of the British Machine Vision Conference, Virtual Event, 7–10 September 2020; pp. 1–11. [Google Scholar]
- Lipman, Y.; Funkhouser, T. Möbius voting for surface correspondence. ACM Trans. Graph. (TOG) 2009, 28, 72. [Google Scholar]
- Wang, C.C.; Tang, K.; Yeung, B.M. Freeform surface flattening based on fitting a woven mesh model. Comput.-Aided Des. 2005, 37, 799–814. [Google Scholar]
- Briceño, H.M.; Sander, P.V.; McMillan, L.; Gortler, S.; Hoppe, H. Geometry videos: A new representation for 3d animations. In Proceedings of the ACM SIGGRAPH/Eurographics Symposium on Computer Animation, San Diego, CA, USA, 26–27 July 2003. pp. 136–146. [Google Scholar]
- Xia, J.; He, Y.; Quynh, D.; Chen, X.; Hoi, S.C. Modeling 3d facial expressions using geometry videos. In Proceedings of the 18th ACM international conference on Multimedia, Firenze, Italy, 25–29 October 2010; pp. 591–600. [Google Scholar]
- Li, T.; Bolkart, T.; Black, M.J.; Li, H.; Romero, J. Learning a model of facial shape and expression from 4d scans. Acm Trans. Graph. (TOG) 2017, 36, 194. [Google Scholar] [CrossRef]
- Chen, L.; Ye, J.; Jiang, L.; Ma, C.; Cheng, Z.; Zhang, X. Synthesizing cloth wrinkles by cnn-based geometry image superresolution. Comput. Animat. Virtual Worlds 2018, 29, e1810. [Google Scholar]
- Fabius, O.; van Amersfoort, J.R. Variational recurrent auto-encoders. arXiv 2014, arXiv:1412.6581. [Google Scholar]
- Kingma, D.P.; Welling, M. Auto-encoding variational bayes. arXiv 2013, arXiv:1312.6114. [Google Scholar]
- Goodfellow, I.; Pouget-Abadie, J.; Mirza, M.; Xu, B.; Warde-Farley, D.; Ozair, S.; Courville, A.; Bengio, Y. Generative adversarial nets. In Proceedings of the 27th International Conference on Neural Information Processing, Montreal, QC, Canada, 8–13 December 2014; pp. 2672–2680. [Google Scholar]
- Lin, S.; Ji, R.; Yan, C.; Zhang, B.; Cao, L.; Ye, Q.; Huang, F.; Doermann, D. Towards optimal structured cnn pruning via generative adversarial learning. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA, 15–20 June 2019; pp. 2790–2799. [Google Scholar]
- Radford, A.; Metz, L.; Chintala, S. Unsupervised representation learning with deep convolutional generative adversarial networks. CoRR 2015, arXiv:1511.06434. [Google Scholar]
- Kossaifi, J.; Tran, L.; Panagakis, Y.; Pantic, M. GAGAN: Geometry-aware generative adversarial networks. arXiv 2017, arXiv:1712.00684. [Google Scholar]
- Samad, M.D.; Iftekharuddin, K.M. Frenet frame-based generalized space curve representation for pose-invariant classification and recognition of 3-d face. IEEE Trans. Hum.-Mach. Syst. 2016, 46, 522–533. [Google Scholar] [CrossRef]
- Nair, V.; Hinton, G.E. Rectified linear units improve restricted boltzmann machines. In Proceedings of the 27th ICML, Haifa, Israel, 21–24 June 2010; pp. 807–814. [Google Scholar]
- Maas, A.L.; Hannun, A.Y.; Ng, A.Y. Rectifier nonlinearities improve neural network acoustic models. In Proceedings of the ICML Workshop on Deep Learning for Audio, Speech and Language Processing, Atlanta, GA, USA, 16–21 June 2013; Volume 1, pp. 1–6. [Google Scholar]
- Xu, B.; Wang, N.; Chen, T.; Li, M. Empirical evaluation of rectified activations in convolutional network. CoRR 2015, arXiv:1505.00853. [Google Scholar]
- Ioffe, S.; Szegedy, C. Batch normalization: Accelerating deep network training by reducing internal covariate shift. arXiv 2015, arXiv:1502.03167. [Google Scholar]
- Pfister, H.; Zwicker, M.; Baar, J.V.; Gross, M. Surfels: Surface elements as rendering primitives. In Proceedings of the 27th Annual Conference on Computer Graphics and Interactive Techniques, New Orleans, LA, USA, 23–28 July 2000; pp. 335–342. [Google Scholar]
- Levoy, M.; Pulli, K.; Curless, B.; Rusinkiewicz, S.; Koller, D.; Pereira, L.; Ginzton, M.; Anderson, S.; Davis, J.; Ginsberg, J.; et al. The digital michelangelo project: 3D scanning of large statues. In Proceedings of the 27th Annual Conference on Computer Graphics and Interactive Techniques, New Orleans, LA, USA, 23–28 July 2000; pp. 131–144. [Google Scholar]
- Rusinkiewicz, S.; Levoy, M. Qsplat: A multiresolution point rendering system for large meshes. In Proceedings of the 27th Annual Conference on Computer Graphics and Interactive Techniques, New Orleans, LA, USA, 23–28 July 2000; pp. 343–352. [Google Scholar]
- Yin, L.; Wei, X.; Sun, Y.; Wang, J.; Rosato, M.J. A 3d facial expression database for facial behavior research. In Proceedings of the 7th International Conference on Automatic Face and Gesture Recognition (FGR06), Southampton, UK, 10–12 April 2006; pp. 211–216. [Google Scholar]
Facial Expressions | Geometry Sampling | 1 epoch | 5 epochs | 10 epochs |
---|---|---|---|---|
Angry | 73.2 | 35.4 | 164.2 | 348.2 |
Disgust | 69.9 | 35.2 | 163.0 | 345.8 |
Fear | 70.5 | 39.3 | 163.5 | 356.5 |
Happy | 74.7 | 40.4 | 163.5 | 355.0 |
Neutral | 77.0 | 38.1 | 164.0 | 353.1 |
Sad | 73.8 | 35.3 | 166.6 | 333.8 |
Surprise | 79.9 | 35.7 | 167.6 | 323.8 |
Facial Expressions | Geometry Sampling | 1 epoch | 5 epochs | 10 epochs |
---|---|---|---|---|
Angry | 0.447 | 1.762 | 8.774 | 16.558 |
Disgust | 0.537 | 1.519 | 8.250 | 15.936 |
Fear | 0.523 | 1.661 | 8.106 | 16.632 |
Happy | 0.465 | 1.628 | 7.891 | 16.022 |
Neutral | 0.448 | 1.764 | 8.544 | 16.988 |
Sad | 0.512 | 1.583 | 7.996 | 15.645 |
Surprise | 0.438 | 1.525 | 7.766 | 15.135 |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content. |
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Luo, G.; Xiong, G.; Huang, X.; Zhao, X.; Tong, Y.; Chen, Q.; Zhu, Z.; Lei, H.; Lin, J. Geometry Sampling-Based Adaption to DCGAN for 3D Face Generation. Sensors 2023, 23, 1937. https://doi.org/10.3390/s23041937
Luo G, Xiong G, Huang X, Zhao X, Tong Y, Chen Q, Zhu Z, Lei H, Lin J. Geometry Sampling-Based Adaption to DCGAN for 3D Face Generation. Sensors. 2023; 23(4):1937. https://doi.org/10.3390/s23041937
Chicago/Turabian StyleLuo, Guoliang, Guoming Xiong, Xiaojun Huang, Xin Zhao, Yang Tong, Qiang Chen, Zhiliang Zhu, Haopeng Lei, and Juncong Lin. 2023. "Geometry Sampling-Based Adaption to DCGAN for 3D Face Generation" Sensors 23, no. 4: 1937. https://doi.org/10.3390/s23041937
APA StyleLuo, G., Xiong, G., Huang, X., Zhao, X., Tong, Y., Chen, Q., Zhu, Z., Lei, H., & Lin, J. (2023). Geometry Sampling-Based Adaption to DCGAN for 3D Face Generation. Sensors, 23(4), 1937. https://doi.org/10.3390/s23041937