ACL2024

PACIT: Unlocking the Power of Examples for Better In-Context Instruction Tuning

Tianci Xue, Ziqi Wang, Yixia Li, Yun Chen, Guanhua Chen

Abstract

Instruction tuning enhances the instruction-following ability of large language models by finetuning with supervised instruction data. Previous work proposes in-context instruction tuning (ICIT), where specific positive or negative examples are incorporated into the prompt for better performance. In this work, we propose PACIT, a simple and effective in-context instruction tuning method inspired by the pedagogical concept of desirable difficulty. The PACIT method unlocks the power of examples by encouraging the model to actively learn to grasp the distinctions between the positive and negative examples instead of merely reading them. The model is expected to first verify the correctness of the provided example according to the task description, which is then set as the condition for generating a better response to the task instance. Our extensive experiments demonstrate the effectiveness of PACIT, which outperforms the ICIT baseline on both in-domain and out-of-domain tasks by up to 9.16 and 3.14 average ROUGE-L points, respectively. Moreover, PACIT can notably enhance the performance of instruction tuning even when all positive and negative examples are generated with a self-instruct method.

1 Introduction

Large language models (LLMs) have garnered significant interest from both academia and industry due to their superior performance on a variety of natural language processing tasks such as question answering and text generation. Instruction tuning (IT; Ouyang et al. 2022) optimizes pre-trained language models with supervised instruction data to enhance their instruction-following capabilities and zero-shot generalization to un-