Agent-Pro: Learning to Evolve via Policy-Level Reflection and Optimization

Zhang, Wenqi; Tang, Ke; Wu, Hai; Wang, Mengna; Shen, Yongliang; Hou, Guiyang; Tan, Zeqi; Li, Peng; Zhuang, Yueting; Lu, Weiming

arXiv:2402.17574 (cs)

[Submitted on 27 Feb 2024 (v1), last revised 6 Jun 2024 (this version, v3)]

Abstract: Large Language Models (LLMs) exhibit robust problem-solving capabilities across diverse tasks. However, most LLM-based agents are designed as task-specific solvers with sophisticated prompt engineering, rather than as agents capable of learning and evolving through interaction. These task solvers require manually crafted prompts to convey task rules and regulate LLM behavior, which leaves them inherently unable to handle complex dynamic scenarios such as large interactive games. In light of this, we propose Agent-Pro: an LLM-based Agent with Policy-level Reflection and Optimization that can learn a wealth of expertise from interactive experiences and progressively improve its behavioral policy. Specifically, it involves a dynamic belief generation and reflection process for policy evolution. Rather than reflecting at the action level, Agent-Pro iteratively reflects on past trajectories and beliefs, fine-tuning its irrational beliefs to obtain a better policy. Moreover, a depth-first search is employed for policy optimization, ensuring continual improvement in policy payoffs. Agent-Pro is evaluated on two games, Blackjack and Texas Hold'em, where it outperforms vanilla LLMs and specialized models. Our results show that Agent-Pro can learn and evolve in complex and dynamic scenarios, a capability that also benefits numerous LLM-based applications.
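
The abstract's idea of a depth-first search that only keeps policy updates which raise the payoff can be illustrated with a toy sketch. This is not the authors' implementation: the names `dfs_policy_search`, `propose`, and `evaluate` are hypothetical, `propose` merely stands in for an LLM reflection step that suggests revised policies, and the payoff function is an invented placeholder rather than real game results.

```python
from typing import Callable, Iterable, Tuple


def dfs_policy_search(
    policy: int,
    propose: Callable[[int], Iterable[int]],
    evaluate: Callable[[int], float],
    max_depth: int = 10,
) -> Tuple[int, float]:
    """Depth-first search over candidate policies.

    A branch is explored only if its candidate improves on the payoff of
    the policy it was derived from, so payoffs never decrease along a path.
    """
    best_policy, best_payoff = policy, evaluate(policy)

    def visit(current: int, current_payoff: float, depth: int) -> None:
        nonlocal best_policy, best_payoff
        if depth == max_depth:
            return
        for candidate in propose(current):
            payoff = evaluate(candidate)
            if payoff <= current_payoff:
                continue  # prune: this revision does not improve the policy
            if payoff > best_payoff:
                best_policy, best_payoff = candidate, payoff
            visit(candidate, payoff, depth + 1)  # go deeper before trying siblings

    visit(policy, best_payoff, 0)
    return best_policy, best_payoff


# Toy stand-ins (assumptions): a "policy" is a Blackjack stand threshold,
# and the payoff is an artificial score that peaks at a threshold of 17.
def toy_propose(threshold: int) -> list:
    return [threshold - 1, threshold + 1]


def toy_evaluate(threshold: int) -> int:
    return -((threshold - 17) ** 2)


result = dfs_policy_search(12, toy_propose, toy_evaluate)
print(result)  # (17, 0)
```

In this sketch the search starts from threshold 12, discards every revision that lowers the toy payoff, and follows improving revisions until none remain. In an LLM agent, the evaluation step would presumably be replaced by playing held-out games with the candidate policy.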

Comments:	Accepted to ACL-2024 Main, camera-ready version
Subjects:	Artificial Intelligence (cs.AI); Computation and Language (cs.CL)
Cite as:	arXiv:2402.17574 [cs.AI]
	(or arXiv:2402.17574v3 [cs.AI] for this version)
	https://doi.org/10.48550/arXiv.2402.17574

Submission history

From: Wenqi Zhang
[v1] Tue, 27 Feb 2024 15:09:20 UTC (2,597 KB)
[v2] Wed, 27 Mar 2024 17:34:57 UTC (2,598 KB)
[v3] Thu, 6 Jun 2024 18:40:47 UTC (2,601 KB)
