DOI: 10.1145/3595353.3595883
poster

GAN Discriminator based Audio Deepfake Detection

Published: 02 September 2023

Abstract

Deepfake technology has emerged in recent years and is becoming a major challenge for society and individuals. In particular, scammers could use deepfakes to launch a phishing attack by cloning a victim’s voice and calling the victim’s acquaintances with the fake audio. In this study, we propose a transfer learning model for detecting deepfake audio. The model uses the discriminator of a GAN-based vocoder as a front end to extract features from an unidentified audio sample, which makes it easier for the model to detect fake voices. With a detection efficiency of up to 94%, we demonstrate that this transfer learning method is feasible.
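
The general pattern described in the abstract, a frozen pretrained front end feeding a small trainable classifier, can be illustrated with a toy sketch. Everything below is an assumption made for illustration and is not the authors' model: the fixed kernels in `FROZEN_KERNELS` stand in for pretrained discriminator layers, the waveforms are synthetic sinusoids rather than speech, and the logistic-regression head is one simple choice of back end.

```python
import math
import random

# Hypothetical stand-in for pretrained discriminator layers: fixed 1-D
# convolution kernels that are never updated ("frozen").
FROZEN_KERNELS = [
    [1.0, -1.0],        # first difference
    [0.5, 0.5],         # two-tap moving average
    [1.0, -2.0, 1.0],   # second difference
]

def conv1d(signal, kernel):
    """Valid-mode 1-D correlation of a signal with a kernel."""
    k = len(kernel)
    return [sum(kernel[j] * signal[i + j] for j in range(k))
            for i in range(len(signal) - k + 1)]

def frontend_features(waveform, slope=0.1):
    """Frozen front end: leaky-ReLU activations, mean-pooled per kernel."""
    feats = []
    for kernel in FROZEN_KERNELS:
        acts = [v if v > 0 else slope * v for v in conv1d(waveform, kernel)]
        feats.append(sum(acts) / len(acts))
    return feats

def toy_waveform(rng, fake, n=200):
    """Synthetic toy data (not real speech): 'fake' clips carry more noise."""
    noise = 0.3 if fake else 0.01
    return [math.sin(2 * math.pi * 5 * t / n) + rng.gauss(0.0, noise)
            for t in range(n)]

def fit_scaler(rows):
    """Per-feature mean and standard deviation for z-score scaling."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    stds = [math.sqrt(sum((v - m) ** 2 for v in c) / len(c)) or 1.0
            for c, m in zip(cols, means)]
    return means, stds

def scale(row, means, stds):
    """Apply z-score scaling to one feature vector."""
    return [(v - m) / s for v, m, s in zip(row, means, stds)]

def train_head(xs, ys, epochs=200, lr=0.5):
    """Trainable back end: logistic regression via batch gradient descent."""
    w, b = [0.0] * len(xs[0]), 0.0
    for _ in range(epochs):
        gw, gb = [0.0] * len(w), 0.0
        for x, y in zip(xs, ys):
            z = sum(wi * xi for wi, xi in zip(w, x)) + b
            p = 1.0 / (1.0 + math.exp(-z))
            for i, xi in enumerate(x):
                gw[i] += (p - y) * xi
            gb += p - y
        w = [wi - lr * g / len(xs) for wi, g in zip(w, gw)]
        b -= lr * gb / len(xs)
    return w, b

def predict(x, w, b):
    """Return 1 (fake) if the logit is positive, else 0 (real)."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else 0

rng = random.Random(0)
labels = [i % 2 for i in range(80)]          # alternate real (0) and fake (1)
raw = [frontend_features(toy_waveform(rng, bool(y))) for y in labels]
means, stds = fit_scaler(raw[:60])           # fit scaling on the training split only
feats = [scale(r, means, stds) for r in raw]
w, b = train_head(feats[:60], labels[:60])   # only the head is trained
accuracy = sum(predict(x, w, b) == y
               for x, y in zip(feats[60:], labels[60:])) / 20
print(f"held-out accuracy on toy data: {accuracy:.2f}")
```

Only `train_head` updates parameters; `frontend_features` stays fixed throughout, which is the sense in which the front end is transferred rather than retrained. The accuracy printed here describes the toy data only and says nothing about the 94% figure reported in the abstract.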

Published In

WDC '23: Proceedings of the 2nd Workshop on Security Implications of Deepfakes and Cheapfakes
July 2023
37 pages
ISBN: 9798400702037
DOI: 10.1145/3595353
Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. Deepfake audio detection
  2. GAN-based model
  3. Transfer learning

Qualifiers

  • Poster
  • Research
  • Refereed limited

Funding Sources

  • No.2020-0-00952, Development of 5G Edge Security Technology for Ensuring 5G+ Service Stability and Availability
  • No.2022-0-00495, On-Device Voice Phishing Call Detection

Conference

ASIA CCS '23