AAT: Adapting Audio Transformer for Various Acoustics Recognition Tasks

Liang, Yun; Lin, Hai; Qiu, Shaojian; Zhang, Yihang

[Submitted on 19 Jan 2024]

Abstract: Recently, Transformers have been introduced into the field of acoustics recognition. Pre-trained on large-scale datasets with methods such as supervised and semi-supervised learning, they demonstrate robust generality: they fine-tune easily to downstream tasks and perform more robustly there. However, the predominant fine-tuning method is still full fine-tuning, which updates all parameters during training. This not only incurs significant memory and time costs but also compromises the model's generality. Other fine-tuning methods either struggle to address this issue or fail to match the performance of full fine-tuning. We therefore conducted a comprehensive analysis of existing fine-tuning methods and propose an efficient fine-tuning approach based on Adapter tuning, namely AAT. The core idea is to freeze the audio Transformer model and insert extra learnable Adapters, which efficiently acquire downstream task knowledge without compromising the model's original generality. Extensive experiments show that our method achieves performance comparable or even superior to full fine-tuning while optimizing only 7.118% of the parameters. It also outperforms other fine-tuning methods.
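The freeze-and-insert idea in the abstract can be illustrated with a small parameter-counting sketch. It is not the authors' implementation: the backbone shape (12 blocks, width 768, MLP ratio 4), the bottleneck size of 64, and the choice of one bottleneck Adapter per block are all assumptions picked for illustration, and the embedding layers and classification head are ignored. It therefore does not reproduce the 7.118% figure reported in the abstract; it only shows how freezing the backbone and training small inserted modules leaves a small trainable fraction.

```python
from dataclasses import dataclass


@dataclass
class ParamGroup:
    """A named group of parameters and whether it is updated during fine-tuning."""
    name: str
    count: int
    trainable: bool


def transformer_block_params(d_model: int, mlp_ratio: int = 4) -> int:
    """Parameter count of one standard Transformer encoder block (weights + biases)."""
    attn = 4 * (d_model * d_model + d_model)       # Q, K, V and output projections
    hidden = mlp_ratio * d_model
    mlp = (d_model * hidden + hidden) + (hidden * d_model + d_model)
    norms = 2 * (2 * d_model)                      # two LayerNorms, scale and shift each
    return attn + mlp + norms


def adapter_params(d_model: int, bottleneck: int) -> int:
    """Parameter count of a bottleneck Adapter: down-projection then up-projection."""
    down = d_model * bottleneck + bottleneck
    up = bottleneck * d_model + d_model
    return down + up


def build_groups(depth: int, d_model: int, bottleneck: int) -> list[ParamGroup]:
    """Frozen backbone blocks, each paired with one trainable Adapter (an assumption)."""
    groups = []
    for i in range(depth):
        groups.append(ParamGroup(f"block{i}", transformer_block_params(d_model), False))
        groups.append(ParamGroup(f"block{i}.adapter", adapter_params(d_model, bottleneck), True))
    return groups


def trainable_fraction(groups: list[ParamGroup]) -> float:
    """Share of all parameters that the optimizer would actually update."""
    total = sum(g.count for g in groups)
    trainable = sum(g.count for g in groups if g.trainable)
    return trainable / total


if __name__ == "__main__":
    # Hypothetical configuration, not taken from the paper.
    groups = build_groups(depth=12, d_model=768, bottleneck=64)
    print(f"trainable fraction: {trainable_fraction(groups):.3%}")
```

Under full fine-tuning every group would be marked trainable and the fraction would be 1.0; here only the Adapter groups are, so the backbone weights, and with them the pre-trained model's generality, stay untouched.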

Comments:	Preprint version for ICASSP 2024, Korea
Subjects:	Sound (cs.SD); Artificial Intelligence (cs.AI); Audio and Speech Processing (eess.AS)
Cite as:	arXiv:2401.10544 [cs.SD]
	(or arXiv:2401.10544v1 [cs.SD] for this version)
	https://doi.org/10.48550/arXiv.2401.10544

Submission history

From: Hai Lin
[v1] Fri, 19 Jan 2024 08:07:59 UTC (122 KB)
