EMNLP 2024

Large Language Models as Foundations for Next-Gen Dense Retrieval: A Comprehensive Empirical Assessment

Kun Luo, Minghao Qin, Zheng Liu, Shitao Xiao, Jun Zhao, Kang Liu


Abstract

Pre-trained language models like BERT and T5 serve as crucial backbone encoders for dense retrieval. However, these models often exhibit limited generalization capabilities and face challenges in improving in-domain accuracy. Recent research has explored using large language models (LLMs) as retrievers, achieving state-of-the-art performance across various tasks. Despite these advancements, the specific benefits of LLMs over traditional retrievers and the impact of different LLM configurations (such as parameter sizes, pre-training duration, and alignment processes) on retrieval tasks remain unclear. In this work, we conduct a comprehensive empirical study on six key dimensions of dense retrieval capabilities: in-domain accuracy, data efficiency, zero-shot generalization, lengthy retrieval, instruction-based retrieval, and multi-task learning. We evaluate over 15 different backbone LLMs and non-LLMs. Our findings reveal that larger models and extensive pre-training consistently enhance in-domain accuracy and data efficiency. Additionally, larger models demonstrate significant potential in zero-shot generalization, lengthy retrieval, instruction-based retrieval, and multi-task learning. These results underscore the advantages of LLMs as versatile and effective backbone encoders in dense retrieval, providing valuable insights for future research and development in this field.