TL;DR
BLADE is a Bayesian framework for list-wise alignment in LLM-based recommendation. It updates its training targets dynamically, improving ranking and fairness metrics over methods that rely on a static reference.
Contribution
The paper proposes BLADE, a Bayesian list-wise alignment method that adaptively refines its training targets to overcome the limitations of static supervision in LLM4Rec.
Findings
BLADE outperforms state-of-the-art baselines on real-world datasets.
It achieves sustained improvements in ranking accuracy and list-wise metrics.
The code is publicly available at https://github.com/RegionCh/BLADE.
Abstract
Large Language Models have revolutionized recommender systems (LLM4Rec) by leveraging their generative capabilities to model complex user preferences. However, existing LLM4Rec methods primarily rely on token-level objectives, making it difficult to optimize list-level and non-differentiable metrics (e.g., NDCG, fairness) that define actual recommendation quality. While Best-of-N (BoN) directly optimizes these metrics during inference, its high computational cost hinders real-world deployment. To address this, BoN Alignment aims to distill the search capability into the model itself, yet current approaches suffer from two critical limitations: (1) Indiscriminate Supervision, where the static reference fails to distinguish the relative quality of candidates exceeding its empirical range, leading to a loss of ranking guidance; and (2) Gradient Decay, where the effective supervision signal…
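To make the Best-of-N idea in the abstract concrete, here is a minimal sketch of inference-time BoN selection against a list-level metric. It is not BLADE's method or code: the helper names (`dcg_at_k`, `ndcg_at_k`, `best_of_n`), the toy relevance labels, and the use of NDCG as the scoring function are all assumptions made for illustration. In a deployed system the score would more likely come from a reward model or another proxy, since ground-truth relevance is not available at inference time.

```python
import math
from typing import Callable, Dict, List, Sequence


def dcg_at_k(gains: Sequence[float], k: int) -> float:
    """Discounted cumulative gain over the first k positions."""
    return sum(g / math.log2(rank + 2) for rank, g in enumerate(gains[:k]))


def ndcg_at_k(ranked_items: Sequence[str], relevance: Dict[str, float], k: int) -> float:
    """NDCG@k of a ranked list, given per-item relevance labels (hypothetical data)."""
    gains = [relevance.get(item, 0.0) for item in ranked_items]
    ideal_gains = sorted(relevance.values(), reverse=True)
    ideal = dcg_at_k(ideal_gains, k)
    return dcg_at_k(gains, k) / ideal if ideal > 0 else 0.0


def best_of_n(candidates: List[List[str]], score: Callable[[List[str]], float]) -> List[str]:
    """Return the candidate list with the highest score.

    Every candidate is a full generated recommendation list, so the cost of
    this search grows linearly with the number of candidates sampled.
    """
    return max(candidates, key=score)


# Toy example: three sampled lists for one user, scored by NDCG@3.
relevance = {"a": 1.0, "b": 1.0}  # assumed labels, for illustration only
candidates = [["c", "a", "b"], ["a", "b", "c"], ["a", "c", "b"]]
best = best_of_n(candidates, lambda items: ndcg_at_k(items, relevance, k=3))
```

Because the metric is only used to pick among finished lists, it does not need to be differentiable. That is why BoN can target metrics such as NDCG directly, and also why it pays for N generations per request.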
