Blind face restoration: Benchmark datasets and a baseline model

Published: 17 April 2024

Abstract

Blind Face Restoration (BFR) aims to generate high-quality face images from low-quality inputs. However, existing BFR methods often train and evaluate on private datasets, making fair comparison among approaches difficult. To address this issue, we introduce two benchmark datasets, BFRBD128 and BFRBD512, for evaluating state-of-the-art methods under five degradation scenarios: blur, noise, low resolution, JPEG compression artifacts, and full degradation. We evaluate with seven standard quantitative metrics and two task-specific metrics, Average Face Landmark Distance (AFLD) and Average Face ID Cosine Similarity (AFICS). Additionally, we propose an efficient baseline model, the Swin Transformer U-Net (STUNet), which outperforms state-of-the-art methods on a range of BFR tasks. The code, datasets, and trained models are publicly available at https://github.com/bitzpy/Blind-Face-Restoration-Benchmark-Datasets-and-a-Baseline-Model.
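For context, the sketch below shows how a "full degradation" low-quality input of the kind described above is commonly synthesized in BFR work: Gaussian blur, bicubic downsampling, additive noise, and JPEG compression applied in sequence. This is an illustrative pipeline under assumed settings, not the paper's exact degradation model; the function degrade_face and all parameter defaults are hypothetical placeholders.

```python
# Minimal sketch of a common BFR "full degradation" synthesis pipeline.
# NOTE: function name and all parameter values are illustrative
# assumptions, not the settings used to build BFRBD128/BFRBD512.
import cv2
import numpy as np

def degrade_face(hq_img: np.ndarray,
                 blur_sigma: float = 3.0,
                 scale: int = 4,
                 noise_sigma: float = 10.0,
                 jpeg_quality: int = 30) -> np.ndarray:
    """Turn a high-quality 8-bit BGR face image into a low-quality input."""
    h, w = hq_img.shape[:2]

    # 1. Blur: Gaussian kernel, size derived automatically from sigma.
    img = cv2.GaussianBlur(hq_img, (0, 0), blur_sigma)

    # 2. Low resolution: bicubic downsampling by the given scale factor.
    img = cv2.resize(img, (w // scale, h // scale),
                     interpolation=cv2.INTER_CUBIC)

    # 3. Noise: additive Gaussian noise, clipped to the valid 8-bit range.
    noise = np.random.normal(0.0, noise_sigma, img.shape)
    img = np.clip(img.astype(np.float64) + noise, 0, 255).astype(np.uint8)

    # 4. JPEG compression artifacts via an encode/decode round trip.
    _, buf = cv2.imencode(".jpg", img,
                          [int(cv2.IMWRITE_JPEG_QUALITY), jpeg_quality])
    img = cv2.imdecode(buf, cv2.IMREAD_COLOR)

    # Resize back so the restoration model sees a fixed input resolution.
    return cv2.resize(img, (w, h), interpolation=cv2.INTER_CUBIC)
```

Under this reading, each of the four single-degradation scenarios (blur, noise, low resolution, JPEG) would correspond to applying only the matching step in isolation, while the full-degradation scenario chains all four.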



Published In

Neurocomputing, Volume 574, Issue C
Mar 2024
367 pages

Publisher

Elsevier Science Publishers B.V., Netherlands


Author Tags

  1. Blind face restoration
  2. Benchmark datasets
  3. Comprehensive evaluation
  4. Transformer network

Qualifiers

  • Research-article
