Decoding Decoded: Understanding Hyperparameter Effects in Open-Ended Text Generation

Arias, Esteban Garces; Li, Meimingwei; Heumann, Christian; Aßenmacher, Matthias

arXiv:2410.06097 (cs)

[Submitted on 8 Oct 2024]

Abstract: Decoding strategies for large language models (LLMs) are a critical but often underexplored aspect of text generation tasks. Since LLMs produce probability distributions over the entire vocabulary, various decoding methods have been developed to transform these probabilities into coherent and fluent text, each with its own set of hyperparameters. In this study, we present a large-scale, comprehensive analysis of how hyperparameter selection affects text quality in open-ended text generation across multiple LLMs, datasets, and evaluation metrics. Through an extensive sensitivity analysis, we provide practical guidelines for hyperparameter tuning and demonstrate the substantial influence of these choices on text quality. Using three established datasets, spanning factual domains (e.g., news) and creative domains (e.g., fiction), we show that hyperparameter tuning significantly impacts generation quality, though its effects vary across models and tasks. We offer in-depth insights into these effects, supported by both human evaluations and a synthesis of widely used automatic evaluation metrics.
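To make concrete what a decoding hyperparameter does to a next-token distribution, here is a minimal sketch in plain Python. The abstract does not name the decoding methods studied, so temperature scaling, top-k filtering, and nucleus (top-p) filtering are assumptions chosen as common examples. The function names `next_token_distribution` and `sample_token` and the toy logits are hypothetical and do not come from the paper.

```python
import math
import random


def next_token_distribution(logits, temperature=1.0, top_k=0, top_p=1.0):
    """Turn raw logits into a filtered, renormalized sampling distribution.

    Returns a dict mapping token index -> probability.
    Illustrative only; not the paper's implementation.
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive")

    # Temperature scaling followed by a numerically stable softmax.
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    weights = [math.exp(x - peak) for x in scaled]
    total = sum(weights)
    probs = [w / total for w in weights]

    # Rank token indices from most to least probable.
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)

    # Top-k: keep only the k most probable tokens (0 disables the filter).
    if top_k > 0:
        ranked = ranked[:top_k]

    # Top-p: keep the smallest prefix whose cumulative mass reaches top_p.
    # Simplification: mass is measured on the distribution before top-k.
    kept, cumulative = [], 0.0
    for i in ranked:
        kept.append(i)
        cumulative += probs[i]
        if cumulative >= top_p:
            break

    # Renormalize over the surviving tokens.
    mass = sum(probs[i] for i in kept)
    return {i: probs[i] / mass for i in kept}


def sample_token(logits, rng=random, **hyperparameters):
    """Draw one token index from the filtered distribution."""
    dist = next_token_distribution(logits, **hyperparameters)
    tokens = list(dist)
    return rng.choices(tokens, weights=[dist[t] for t in tokens], k=1)[0]
```

For example, `next_token_distribution([2.0, 1.0, 0.5, -1.0], temperature=0.1)` concentrates almost all probability on the first token, while `top_k=2` restricts sampling to the two most likely tokens. Small changes to these values reshape the candidate set at every step, which is the kind of sensitivity the abstract describes.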

Subjects:	Computation and Language (cs.CL); Machine Learning (cs.LG)
Cite as:	arXiv:2410.06097 [cs.CL]
	(or arXiv:2410.06097v1 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2410.06097

Submission history

From: Matthias Aßenmacher
[v1] Tue, 8 Oct 2024 14:51:03 UTC (4,811 KB)
