Scaling Data Diversity for Fine-Tuning Language Models in Human Alignment

Song, Feifan; Yu, Bowen; Lang, Hao; Yu, Haiyang; Huang, Fei; Wang, Houfeng; Li, Yongbin

arXiv:2403.11124 (cs)

[Submitted on 17 Mar 2024 (v1), last revised 30 Mar 2024 (this version, v2)]

Abstract: Alignment with human preference prevents large language models (LLMs) from generating misleading or toxic content, but it requires high-cost human feedback. Assuming that human annotation resources are limited, there are two ways to allocate them: labeling more diverse PROMPTS or more diverse RESPONSES. However, a direct comparison of their impact has been absent. In this work, we first control the diversity of both sides through the number of samples used for fine-tuning, which directly reflects their influence. We find that more responses with fewer prompts trigger LLMs for human alignment better than numerous prompts do. Additionally, diversity is a more complex concept for prompts than for responses, where it is typically quantified by a single-digit number. Consequently, we propose a new formulation of prompt diversity, which further indicates a linear correlation with the final performance of LLMs after fine-tuning. We also leverage it for data augmentation and conduct experiments to show its effect on different algorithms.

Comments:	Accepted by LREC-COLING 2024
Subjects:	Computation and Language (cs.CL); Artificial Intelligence (cs.AI)
Cite as:	arXiv:2403.11124 [cs.CL]
	(or arXiv:2403.11124v2 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2403.11124

Submission history

From: Feifan Song
[v1] Sun, 17 Mar 2024 07:08:55 UTC (203 KB)
[v2] Sat, 30 Mar 2024 16:48:16 UTC (203 KB)
