Towards a Unified View of Preference Learning for Large Language Models: A Survey

Gao, Bofei; Song, Feifan; Miao, Yibo; Cai, Zefan; Yang, Zhe; Chen, Liang; Hu, Helan; Xu, Runxin; Dong, Qingxiu; Zheng, Ce; Quan, Shanghaoran; Xiao, Wen; Zhang, Ge; Zan, Daoguang; Lu, Keming; Yu, Bowen; Liu, Dayiheng; Cui, Zeyu; Yang, Jian; Sha, Lei; Wang, Houfeng; Sui, Zhifang; Wang, Peiyi; Liu, Tianyu; Chang, Baobao

arXiv:2409.02795 (cs)

[Submitted on 4 Sep 2024 (v1), last revised 31 Oct 2024 (this version, v5)]

Abstract: Large Language Models (LLMs) exhibit remarkably powerful capabilities. One of the crucial factors behind this success is aligning the LLM's output with human preferences. This alignment process often requires only a small amount of data to efficiently enhance the LLM's performance. While effective, research in this area spans multiple domains, and the methods involved are relatively complex to understand. The relationships between different methods have been under-explored, limiting the development of preference alignment. In light of this, we break down the existing popular alignment strategies into different components and provide a unified framework to study the current alignment strategies, thereby establishing connections among them. In this survey, we decompose all the strategies in preference learning into four components: model, data, feedback, and algorithm. This unified view offers an in-depth understanding of existing alignment algorithms and also opens up possibilities to synergize the strengths of different strategies. Furthermore, we present detailed working examples of prevalent existing algorithms to give readers a comprehensive understanding. Finally, based on our unified perspective, we explore the challenges and future research directions for aligning large language models with human preferences.

Comments:	23 pages, 6 figures
Subjects:	Computation and Language (cs.CL)
Cite as:	arXiv:2409.02795 [cs.CL]
	(or arXiv:2409.02795v5 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2409.02795

Submission history

From: Bofei Gao
[v1] Wed, 4 Sep 2024 15:11:55 UTC (2,792 KB)
[v2] Fri, 6 Sep 2024 10:30:36 UTC (2,794 KB)
[v3] Mon, 9 Sep 2024 09:31:30 UTC (2,798 KB)
[v4] Tue, 29 Oct 2024 09:27:15 UTC (2,802 KB)
[v5] Thu, 31 Oct 2024 05:39:06 UTC (2,804 KB)
