NeurIPS 2022
Recommender Forest for Efficient Retrieval
Chao Feng, Wuchao Li, Defu Lian, Zheng Liu, Enhong Chen
Cited by 22
Abstract
Recommender systems (RS) have to select the top-n items from a massive item set. For the sake of efficient recommendation, RS usually represent users and items as latent embeddings and rely on approximate nearest neighbor search (ANNs) to retrieve the recommendation results. Despite the reduction in running time, representation learning is independent of ANNs index construction; thus, the two operations can be incompatible, which results in a potential loss of recommendation accuracy. To overcome this problem, we propose the Recommender Forest (a.k.a. RecForest), which jointly learns the latent embeddings and the index for efficient and high-fidelity recommendation. RecForest consists of multiple K-ary trees, each of which is a partition of the item set via hierarchical balanced clustering such that each item is uniquely represented by a path from the root to a leaf. Given such a data structure, an encoder-decoder-based routing network is developed: it first encodes user information into a user representation; then, leveraging a transformer-based decoder, it identifies the top-n items via beam search. Compared with existing methods, RecForest brings the following advantages: 1) the false partition of near-boundary items can be effectively alleviated by the use of multiple trees; 2) the routing operation becomes much more accurate thanks to the powerful transformer decoder; 3) the branch parameters are shared across different tree levels, making the index extremely memory-efficient. Experimental studies are performed on six popular recommendation datasets: with a significantly simplified training cost, RecForest outperforms competitive baseline approaches in terms of both recommendation accuracy and efficiency. The code is available at https://github.com/wuchao-li/RecForest.
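To make the data structure concrete, the following is a minimal Python sketch of a single K-ary tree and of beam search over root-to-leaf paths. It is not the authors' implementation. The names `assign_paths`, `beam_search`, and `score_branch` are hypothetical; the balanced split is a simple stand-in (sort along one coordinate and cut into equal chunks) rather than the hierarchical balanced clustering described above; and `score_branch` is a placeholder for the transformer-based decoder, which would score the next branch given the user representation and the path prefix.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Path = Tuple[int, ...]


def assign_paths(item_ids: List[int],
                 vectors: Dict[int, Sequence[float]],
                 k: int) -> Dict[int, Path]:
    """Recursively split items into k near-equal groups (k >= 2).

    Returns a mapping item -> unique root-to-leaf path of branch indices.
    """
    paths: Dict[int, Path] = {}

    def split(ids: List[int], prefix: Path, axis: int) -> None:
        if len(ids) == 1:
            paths[ids[0]] = prefix  # a leaf holds exactly one item
            return
        dim = len(vectors[ids[0]])
        # Assumption: stand-in for balanced clustering. Sort along one
        # coordinate and cut into k chunks of (nearly) equal size.
        ordered = sorted(ids, key=lambda i: (vectors[i][axis % dim], i))
        size = -(-len(ordered) // k)  # ceiling division
        for branch in range(k):
            chunk = ordered[branch * size:(branch + 1) * size]
            if chunk:
                split(chunk, prefix + (branch,), axis + 1)

    split(list(item_ids), (), 0)
    return paths


def beam_search(paths: Dict[int, Path],
                score_branch: Callable[[Path, int], float],
                beam_width: int,
                top_n: int) -> List[int]:
    """Expand path prefixes level by level, keeping the best beam_width."""
    leaf_of = {p: item for item, p in paths.items()}
    prefixes = {p[:d] for p in paths.values() for d in range(len(p) + 1)}
    beam: List[Tuple[float, Path]] = [(0.0, ())]
    finished: List[Tuple[float, int]] = []
    while beam:
        candidates: List[Tuple[float, Path]] = []
        for score, prefix in beam:
            if prefix in leaf_of:  # complete path: emit the item
                finished.append((score, leaf_of[prefix]))
                continue
            branch = 0
            # Child branches are numbered contiguously from 0.
            while prefix + (branch,) in prefixes:
                child = prefix + (branch,)
                candidates.append((score + score_branch(prefix, branch), child))
                branch += 1
        candidates.sort(key=lambda c: c[0], reverse=True)
        beam = candidates[:beam_width]
    finished.sort(key=lambda f: f[0], reverse=True)
    return [item for _, item in finished[:top_n]]
```

In a forest, this procedure would be run once per tree and the candidate sets merged, so that an item placed on the wrong side of a partition boundary in one tree can still be reached through another. How the results are merged and re-ranked is not specified in the abstract and is left out of this sketch.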