ACL 2025
Bridging Relevance and Reasoning: Rationale Distillation in Retrieval-Augmented Generation
Pengyue Jia, Derong Xu, Xiaopeng Li, Zhaocheng Du, Xiangyang Li, Yichao Wang, Yuhao Wang, Qidong Liu, Maolin Wang, Huifeng Guo, Ruiming Tang, Xiangyu Zhao
Abstract
The reranker and generator are two critical components in the Retrieval-Augmented Generation (RAG) pipeline, responsible for ranking relevant documents and generating responses, respectively. However, due to differences in pretraining data and objectives, there is an inevitable misalignment between the documents ranked as relevant by the reranker and those required by the generator to support query-specific answers. To bridge this gap, we propose RADIO, a novel and practical preference alignment framework with RAtionale DIstillatiOn. Specifically, we first propose a rationale extraction method that leverages the reasoning capabilities of Large Language Models (LLMs) to extract the rationales necessary for answering a query. Subsequently, a rationale-based alignment process is designed to rerank documents based on the extracted rationales and fine-tune the reranker to better align its preferences with those of the generator. Extensive experiments conducted on three tasks across four datasets demonstrate the effectiveness and transferability of our approach. Our code is released online.
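As a rough illustration of the rationale-based alignment step described in the abstract, the sketch below ranks candidate documents by their similarity to an already extracted rationale and turns that ranking into preference pairs that could supervise a reranker. It is a minimal sketch under stated assumptions, not the released RADIO implementation: the bag-of-words cosine similarity stands in for whatever scoring model is actually used, the rationale is supplied as a plain string rather than produced by an LLM, and the function names `rank_by_rationale` and `preference_pairs` are hypothetical.

```python
import math
from collections import Counter


def bow(text):
    """Lowercased bag-of-words vector (assumed stand-in for a learned encoder)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank_by_rationale(rationale, docs):
    """Order documents by similarity to the rationale, most similar first."""
    r = bow(rationale)
    return sorted(docs, key=lambda d: cosine(r, bow(d)), reverse=True)


def preference_pairs(query, ranked_docs):
    """Build (query, preferred, rejected) triples from a ranking.

    Each higher-ranked document is treated as preferred over every
    lower-ranked one; such triples could feed a pairwise ranking loss.
    """
    return [
        (query, ranked_docs[i], ranked_docs[j])
        for i in range(len(ranked_docs))
        for j in range(i + 1, len(ranked_docs))
    ]


# Toy data, invented purely for illustration.
query = "What is the capital of France?"
rationale = "the capital of france is paris"  # assumed to come from an LLM
docs = [
    "paris is the capital of france",
    "bananas are yellow fruit",
    "france borders spain",
]

ranked = rank_by_rationale(rationale, docs)
pairs = preference_pairs(query, ranked)
```

In this toy setup the document that restates the rationale is ranked first and the unrelated one last; the resulting triples are the kind of signal a reranker could be fine-tuned on so that its ordering moves toward the documents the generator needs.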