ACL 2024

Plum: Prompt Learning using Metaheuristics

Rui Pan, Shuo Xing, Shizhe Diao, Wenhe Sun, Xiang Liu, Kashun Shum, Jipeng Zhang, Renjie Pi, Tong Zhang

Abstract

Since the emergence of large language models, prompt learning has become a popular method for optimizing and customizing these models. Special prompts, such as Chain-of-Thought, have even revealed previously unknown reasoning capabilities within these models. However, progress in discovering effective prompts has been slow, driving a desire for general prompt optimization methods. Unfortunately, few existing prompt learning methods satisfy the criteria of being truly "general", i.e., automatic, discrete, black-box, gradient-free, and interpretable all at once. In this paper, we introduce metaheuristics, a branch of discrete nonconvex optimization methods with over 100 options, as a promising approach to prompt learning. Within our paradigm, we test six typical methods: hill climbing, simulated annealing, genetic algorithms with/without crossover, tabu search, and harmony search, demonstrating their effectiveness in white-box and black-box prompt learning. Furthermore, we show that these methods can be used to discover more human-understandable prompts that were previously unknown in both reasoning and image generation tasks, opening the door to a cornucopia of possibilities in prompt optimization.

[...] 2022). Notably, prompt learning distinguishes itself from other methods by eliminating the need for gradient information from models, resulting in substantially reduced memory consumption and computational resource requirements. Furthermore, prompt learning often yields interpretable outcomes, which help researchers and engineers intuitively understand its effectiveness and can thereby inspire more generalizable prompts for various tasks (Prasad et al., 2022; Guo et al., 2023; Yu et al., 2023).

Since the introduction of prompt engineering and prompt learning, significant advancements have been made in the discovery of effective prompts. A noteworthy illustration is Chain-of-Thought (COT), whereby the simple inclusion of accurate deduction steps for few-shot examples within the original prompt empowers LLMs to achieve substantial performance improvements in reasoning tasks (Wei et al., 2022b). An even more impressive result is Zero-shot-COT (Kojima et al., 2022), where adding the magic phrase "Let's think step by step" produces a remarkable accuracy gain of over 10% across multiple models engaged in a diverse spectrum of reasoning tasks.

However, the quest for such highly effective prompts remains unfulfilled, underscoring the need for tools that accelerate the discovery process. Ideally, these prompt learning tools should possess a notable level of generality, while simultaneously meeting the following criteria:

• Automatic: since human involvement is normally expensive and time-consuming.