FITMM: Adaptive Frequency-Aware Multimodal Recommendation via Information-Theoretic Representation Learning
Wei Yang, Rui Zhong, Yiqun Chen, Shixuan Li, Heng Ping, Chi Lu, Peng Jiang

TL;DR
This paper introduces FITMM, a spectral information-theoretic framework for multimodal recommendation that leverages frequency decomposition to improve user preference modeling and recommendation accuracy.
Contribution
The paper proposes a novel frequency-aware spectral decomposition approach using an information bottleneck, enabling better fusion and regularization of multimodal data in recommendation systems.
Findings
FITMM outperforms state-of-the-art baselines on three real-world datasets.
The frequency-domain IB regularization improves model generalization.
Spectral decomposition enhances multimodal feature alignment and representation.
Abstract
Multimodal recommendation aims to enhance user preference modeling by leveraging rich item content such as images and text. Yet dominant systems fuse modalities in the spatial domain, obscuring the frequency structure of signals and amplifying misalignment and redundancy. We adopt a spectral information-theoretic view and show that, under an orthogonal transform that approximately block-diagonalizes bandwise covariances, the Gaussian Information Bottleneck objective decouples across frequency bands, providing a principled basis for a separate-then-fuse paradigm. Building on this foundation, we propose FITMM, a Frequency-aware Information-Theoretic framework for multimodal recommendation. FITMM constructs graph-enhanced item representations, performs modality-wise spectral decomposition to obtain orthogonal bands, and forms lightweight within-band multimodal components. A residual,…
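The modality-wise spectral decomposition into orthogonal bands mentioned in the abstract can be illustrated with a minimal sketch. This summary does not say which orthogonal transform FITMM uses, so the orthonormal DCT-II below is an assumption chosen for illustration, and the names `dct_basis`, `split_into_bands`, and the toy embedding values are hypothetical. The sketch only shows the general property the abstract relies on: components built from disjoint sets of coefficients of an orthogonal transform are mutually orthogonal and sum back to the original vector.

```python
import math


def dct_basis(n):
    """Return the n x n orthonormal DCT-II basis (rows are basis vectors).

    Assumption: DCT-II stands in for whatever orthogonal transform the
    paper actually uses; this summary does not specify it.
    """
    basis = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        basis.append(
            [scale * math.cos(math.pi * (i + 0.5) * k / n) for i in range(n)]
        )
    return basis


def dot(u, v):
    """Plain inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def split_into_bands(x, num_bands):
    """Split x into num_bands components, one per contiguous frequency range.

    Each component is the inverse transform of a disjoint slice of the
    coefficients, so the components are orthogonal and sum to x.
    """
    n = len(x)
    basis = dct_basis(n)
    coeffs = [dot(row, x) for row in basis]  # forward transform
    edges = [j * n // num_bands for j in range(num_bands + 1)]
    bands = []
    for lo, hi in zip(edges[:-1], edges[1:]):
        # Inverse transform restricted to coefficients lo .. hi-1.
        band = [
            sum(coeffs[k] * basis[k][i] for k in range(lo, hi))
            for i in range(n)
        ]
        bands.append(band)
    return bands


# Toy item embedding for one modality (hypothetical values).
item_embedding = [0.9, 0.1, -0.4, 0.7, 0.3, -0.2, 0.5, -0.6]
bands = split_into_bands(item_embedding, num_bands=3)

# The bands add back up to the original embedding (up to rounding).
reconstructed = [sum(band[i] for band in bands) for i in range(len(item_embedding))]
max_error = max(abs(a - b) for a, b in zip(reconstructed, item_embedding))
print("max reconstruction error:", max_error)

# Distinct bands are orthogonal (inner products are numerically zero).
print("low . mid :", dot(bands[0], bands[1]))
print("low . high:", dot(bands[0], bands[2]))
```

In a full pipeline each modality (for example image and text features) would presumably be split this way before any within-band fusion, but how FITMM forms and combines the bands is not recoverable from the truncated abstract.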
Taxonomy
Topics: Recommender Systems and Techniques · Advanced Graph Neural Networks · Emotion and Mood Recognition
