Energy and Carbon Considerations of Fine-Tuning BERT

Xiaorong Wang, Clara Na, Emma Strubell, Sorelle Friedler, Sasha Luccioni


Abstract
Despite the popularity of the pre-train then fine-tune paradigm in the NLP community, existing work quantifying energy costs and associated carbon emissions has largely focused on language model pre-training. Although a single pre-training run draws substantially more energy than fine-tuning, fine-tuning is performed more frequently by many more individual actors, and thus must be accounted for when considering the energy and carbon footprint of NLP. In order to better characterize the role of fine-tuning in the landscape of energy and carbon emissions in NLP, we perform a careful empirical study of the computational costs of fine-tuning across tasks, datasets, hardware infrastructure and measurement modalities. Our experimental results allow us to place fine-tuning energy and carbon costs into perspective with respect to pre-training and inference, and outline recommendations to NLP researchers and practitioners who wish to improve their fine-tuning energy efficiency.
Anthology ID:
2023.findings-emnlp.607
Volume:
Findings of the Association for Computational Linguistics: EMNLP 2023
Month:
December
Year:
2023
Address:
Singapore
Editors:
Houda Bouamor, Juan Pino, Kalika Bali
Venue:
Findings
Publisher:
Association for Computational Linguistics
Pages:
9058–9069
URL:
https://aclanthology.org/2023.findings-emnlp.607
DOI:
10.18653/v1/2023.findings-emnlp.607
Cite (ACL):
Xiaorong Wang, Clara Na, Emma Strubell, Sorelle Friedler, and Sasha Luccioni. 2023. Energy and Carbon Considerations of Fine-Tuning BERT. In Findings of the Association for Computational Linguistics: EMNLP 2023, pages 9058–9069, Singapore. Association for Computational Linguistics.
Cite (Informal):
Energy and Carbon Considerations of Fine-Tuning BERT (Wang et al., Findings 2023)
PDF:
https://aclanthology.org/2023.findings-emnlp.607.pdf