TL;DR
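This paper introduces a behavior-guided candidate calibration model for multimodal recommendation. Motivated by a spectral analysis of cross-view agreement, it converts training-only co-user overlap into signed candidate evidence and applies that evidence only to the shortlist produced by the multimodal backbone, yielding consistent gains over strong multimodal baselines.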
Contribution
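It uses spectral analysis to show where shared structure and discriminative signal sit in multimodal representations, then proposes a calibration step that applies behavior evidence only to shortlisted candidates, so ranking improves while the backbone's representation space stays stable.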
Findings
Consistent gains over strong multimodal baselines on Amazon Baby, Sports, and Electronics.
Spectral analysis reveals a split: low-frequency components capture shared structure, while higher-frequency components preserve more discriminative signal.
Applying behavior evidence only to the backbone's shortlist, where ranking is decided, improves recommendation quality.
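The low-frequency versus high-frequency split in the findings can be illustrated with a small graph Fourier sketch. This summary does not say which operator the paper's spectral analysis uses, so the setup here is an assumption: a combinatorial graph Laplacian over a toy item graph, with the function name `spectral_split`, the cutoff `k`, and the data all hypothetical.

```python
import numpy as np

def spectral_split(adj, signal, k):
    """Split a node signal into low- and high-frequency parts on a graph.

    Assumption: frequencies are eigenvalues of the combinatorial Laplacian
    L = D - A; the k smallest eigenvalues define the low-frequency band.
    """
    adj = np.asarray(adj, dtype=float)
    signal = np.asarray(signal, dtype=float)
    lap = np.diag(adj.sum(axis=1)) - adj
    # eigh returns eigenvalues in ascending order, so the first k
    # eigenvectors are the smoothest modes on the graph.
    _, vecs = np.linalg.eigh(lap)
    low_basis = vecs[:, :k]
    low = low_basis @ (low_basis.T @ signal)  # projection onto smooth modes
    high = signal - low                       # residual: item-specific variation
    return low, high

# Toy 4-node path graph with one scalar feature per item (hypothetical data).
adj = [[0, 1, 0, 0],
       [1, 0, 1, 0],
       [0, 1, 0, 1],
       [0, 0, 1, 0]]
signal = np.array([1.0, 1.2, 0.9, 3.0])
low, high = spectral_split(adj, signal, k=1)
```

With `k=1` on a connected graph, the low-frequency part is the same value at every node (the signal's mean), which is the sense in which it captures shared structure; everything that distinguishes one item from another lands in `high`.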
Abstract
Multimodal recommendation benefits from content signals, but the gain depends on how those signals interact with the ranking pipeline. We find that moderate cross-view agreement helps, while stronger agreement suppresses recommendation-specific variation. Spectral analysis shows a clear split: low-frequency components capture shared structure, and higher-frequency components preserve more discriminative signal. Based on this finding, we introduce a behavior-guided candidate calibration model that converts training-only co-user overlap into signed candidate evidence and applies it only to the shortlist produced by the multimodal backbone. The backbone keeps the representation space stable; behavior evidence acts only where ranking is decided. Results on Amazon Baby, Sports, and Electronics show consistent gains over strong multimodal baselines. Code is available at…
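The calibration step described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's model: the abstract does not specify how co-user overlap is turned into signed evidence, so this sketch uses mean Jaccard overlap with the user's training history, centered by a fixed `baseline` so the value can be negative. The function names and the parameters `top_k`, `alpha`, and `baseline` are hypothetical.

```python
from collections import defaultdict

def co_user_sets(train_interactions):
    """Map each item to the set of users who interacted with it in training."""
    users_by_item = defaultdict(set)
    for user, item in train_interactions:
        users_by_item[item].add(user)
    return users_by_item

def signed_evidence(candidate, history, users_by_item, baseline):
    """Mean Jaccard co-user overlap with the history, minus a baseline.

    Positive values support the candidate; negative values count against it.
    (Assumed form; the source only says the evidence is signed.)
    """
    if not history:
        return 0.0
    cand_users = users_by_item.get(candidate, set())
    total = 0.0
    for item in history:
        other = users_by_item.get(item, set())
        union = cand_users | other
        total += len(cand_users & other) / len(union) if union else 0.0
    return total / len(history) - baseline

def calibrate_shortlist(backbone_scores, history, users_by_item,
                        top_k=2, alpha=1.0, baseline=0.25):
    """Re-rank only the backbone's top_k candidates; backbone scores are untouched."""
    shortlist = sorted(backbone_scores, key=backbone_scores.get, reverse=True)[:top_k]
    adjusted = {
        item: backbone_scores[item]
        + alpha * signed_evidence(item, history, users_by_item, baseline)
        for item in shortlist
    }
    return sorted(shortlist, key=adjusted.get, reverse=True)

# Toy training interactions as (user, item) pairs (hypothetical data).
train = [("u1", "a"), ("u2", "a"), ("u3", "a"),
         ("u2", "b"), ("u3", "b"),
         ("u4", "c"),
         ("u4", "d"), ("u5", "d")]
users_by_item = co_user_sets(train)
backbone_scores = {"c": 0.60, "b": 0.50, "d": 0.45}  # scores for user u1
reranked = calibrate_shortlist(backbone_scores, history=["a"],
                               users_by_item=users_by_item)
```

In this toy run the backbone ranks `c` above `b`, but `b` shares two of its users with the history item `a` while `c` shares none, so the calibrated order within the shortlist becomes `b`, then `c`. Item `d` never enters the shortlist and is not rescored, which mirrors the idea that behavior evidence acts only where ranking is decided.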
