WWW2026

Rethinking the Hidden Risk of Reranking: Achieving Risk-aware Reranking with Information Gain for RAG with LLMs

Zhizhao Liu, Zhihua Wen, Zhiliang Tian, Zhen Huang, Miaorong Zhu, Zimian Wei, Yifu Gao, Liang Ding, Dongsheng Li

Abstract

Retrieval-augmented generation (RAG) has become a cornerstone for enhancing large language models (LLMs) with real-time information from the Web, but its performance depends heavily on the quality of the retrieved documents. Given that RAG systems frequently draw from vast and often noisy Web corpora, ensuring the reliability of retrieved content is paramount. While rerankers improve the factual accuracy of a RAG system by raising the proportion of ground-truth documents (GD) among high-ranked results, how reranking shifts the distribution of document types remains unclear, which hinders understanding of reranker behavior. To bridge this gap, we conduct an empirical study that categorizes documents and compares their distribution before and after reranking. We reveal a counterintuitive finding: although rerankers increase the proportion of GD, they also significantly increase the proportion of harmful documents (HD) among the top-ranked retrieved documents. This shift not only narrows the context window available for ranking GD higher but also increases the risk of HD misleading the LLMs, potentially leading to the generation and propagation of misinformation across Web platforms. Motivated by this finding, we propose a risk-aware reranking method for RAG with LLMs, which balances risk and benefit during reranking. Given a query, the RAG framework first retrieves relevant documents. Then, our approach quantifies the potential beneficial and harmful impacts of each document on the LLMs' generation. To estimate these impacts, we conduct a dual-aspect document impact assessment via information gain, which employs risk clipping to avoid numerical fluctuations in the estimation. Finally, we rerank the documents according to the potential impact of each one, enabling the reranker to significantly reduce the HD proportion.
Experiments and analysis across multiple models and datasets, including Wikipedia, web news, and research papers, show the effectiveness of our method. Our code is available at https://github.com/lzz335/hidden_risk_of_reranking.
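
To make the idea of scoring documents by benefit and risk more concrete, the following is a minimal illustrative sketch and not the authors' formulation. It assumes that, for each retrieved document, we already have an estimate of the probability the LLM assigns to a reference answer with the document in context (`p_with_doc`) and without it (`p_without_doc`). The log-ratio form of the gain, the names `information_gain`, `impact_score`, and `rerank`, and the parameters `risk_cap` and `risk_weight` are all hypothetical choices made for illustration.

```python
import math


def information_gain(p_with_doc, p_without_doc, eps=1e-9):
    """Log-ratio of the answer probability with vs. without the document.

    Assumption: this log-ratio stands in for "information gain"; the
    paper's exact definition may differ.
    """
    return math.log(max(p_with_doc, eps)) - math.log(max(p_without_doc, eps))


def impact_score(p_with_doc, p_without_doc, risk_cap=2.0, risk_weight=1.0):
    """Combine a benefit term and a clipped risk term into one score.

    risk_cap and risk_weight are hypothetical hyperparameters.
    """
    gain = information_gain(p_with_doc, p_without_doc)
    benefit = max(gain, 0.0)                # document helps the answer
    risk = min(max(-gain, 0.0), risk_cap)   # document hurts; clip extreme values
    return benefit - risk_weight * risk


def rerank(docs, p_without_doc, **kwargs):
    """Sort (doc_id, p_with_doc) pairs by descending impact score."""
    return sorted(
        docs,
        key=lambda d: impact_score(d[1], p_without_doc, **kwargs),
        reverse=True,
    )


if __name__ == "__main__":
    # Toy probabilities, invented purely for illustration.
    candidates = [("a", 0.2), ("b", 0.8), ("c", 1e-12)]
    for doc_id, p in rerank(candidates, p_without_doc=0.4):
        print(doc_id, round(impact_score(p, 0.4), 3))
```

In this sketch the clipping step bounds the penalty of a document whose estimated probability is near zero, so that a single extreme log value does not dominate the ranking; this is one plausible reading of "risk clipping", stated here as an assumption.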