Architecture is All You Need: Improving LLM Recommenders by Dropping the Text
Kevin Foley, Shaghayegh Agah, Kavya Priyanka Kakinada

TL;DR
This paper shows that a scaled-down LLM architecture operating on discrete item tokens instead of text significantly outperforms both traditional sequential recommenders and PLM-based recommenders, suggesting that the architecture, rather than extensive pre-training, is the key factor.
Contribution
The authors introduce a streamlined LLM-based recommender model that replaces text tokenization with discrete item tokens and reduces model size, achieving superior performance.
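To make the idea of discrete item tokens concrete, the following is a minimal sketch of how interaction histories could be turned into item-token sequences for next-item prediction, with one integer id per item and no text tokenizer involved. It is an illustration under stated assumptions, not the authors' implementation: the function names, the padding id, the left-padding scheme, and the sample item names are all hypothetical, and the summary above does not specify how the paper builds its vocabulary or training examples.

```python
from typing import Dict, List, Tuple

PAD_ID = 0  # hypothetical reserved id for padding; not taken from the paper


def build_item_vocab(histories: List[List[str]]) -> Dict[str, int]:
    """Assign each distinct item one integer token id, starting at 1."""
    vocab: Dict[str, int] = {}
    for history in histories:
        for item in history:
            if item not in vocab:
                vocab[item] = len(vocab) + 1
    return vocab


def encode_history(history: List[str], vocab: Dict[str, int], max_len: int) -> List[int]:
    """Map items to ids, keep the most recent max_len items, and left-pad."""
    ids = [vocab[item] for item in history][-max_len:]
    return [PAD_ID] * (max_len - len(ids)) + ids


def next_item_example(
    history: List[str], vocab: Dict[str, int], max_len: int
) -> Tuple[List[int], int]:
    """Use all but the last item as input and the last item as the target."""
    return encode_history(history[:-1], vocab, max_len), vocab[history[-1]]


# Illustrative data only; these item names are made up.
histories = [
    ["item_a", "item_b", "item_c"],
    ["item_b", "item_d"],
]
vocab = build_item_vocab(histories)
examples = [next_item_example(h, vocab, max_len=4) for h in histories]
```

In this sketch each item costs exactly one token regardless of how long its title or description is, so the model's vocabulary is the item catalogue plus any reserved ids. A sequence model trained on such pairs would predict the target id from the input ids.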
Findings
Outperforms traditional sequential recommenders
Outperforms PLM-based recommenders at smaller size
Architecture is the main benefit of LLMs in recommendations
Abstract
In recent years, there has been an explosion of interest in the applications of large pre-trained language models (PLMs) to recommender systems, with many studies showing strong performance of PLMs on common benchmark datasets. PLM-based recommender models benefit from flexible and customizable prompting, an unlimited vocabulary of recommendable items, and general "world knowledge" acquired through pre-training on massive text corpora. While PLM-based recommenders show promise in settings where data is limited, they are hard to implement in practice due to their large size and computational cost. Additionally, fine-tuning PLMs to improve performance on collaborative signals may degrade the model's capacity for world knowledge and generalizability. We propose a recommender model that uses the architecture of large language models (LLMs) while reducing layer count and dimensions and…
Taxonomy
Topics: Recommender Systems and Techniques · Topic Modeling · Explainable Artificial Intelligence (XAI)
