Multi-Modal Multi-Behavior Sequential Recommendation with Conditional Diffusion-Based Feature Denoising

Xiaoxi Cui; Weihai Lu; Yu Tong; Yiheng Li; Zhejun Zhao

arXiv:2508.05352 · cs.IR · August 8, 2025

TL;DR

This paper introduces M$^3$BSR, a multi-modal multi-behavior sequential recommendation model that uses conditional diffusion to denoise multi-modal item features and multi-behavior user data, and that models shared and specific interests to improve recommendation accuracy.

Contribution

The paper proposes a multi-modal multi-behavior sequential recommendation model that combines diffusion-based denoising with interest extraction. It targets two problems: noise in behavior and modality data, and the weak characterization of modal preferences across behaviors.

Findings

01. M$^3$BSR outperforms state-of-the-art methods on benchmark datasets.

02. Effective noise mitigation improves recommendation accuracy.

03. Explicit modeling of shared and specific interests enhances user preference understanding.

Abstract

Sequential recommendation systems use historical user interactions to predict preferences. Integrating diverse user behavior patterns with the rich multimodal information of items to improve the accuracy of sequential recommendation is an emerging and challenging research direction. This paper focuses on multi-modal multi-behavior sequential recommendation and aims to address three challenges: (1) the lack of effective characterization of modal preferences across behaviors, since user attention to different item modalities varies with the behavior; (2) the difficulty of mitigating implicit noise in user behavior, such as unintended actions like accidental clicks; (3) the inability to handle modality noise in multi-modal representations, which further degrades the modeling of user preferences. To tackle these…
