M3TR: Temporal Retrieval Enhanced Multi-Modal Micro-video Popularity Prediction
Jiacheng Lu, Weijian Wang, Mingyuan Xiao, Yang Hua, Tao Song, Jiaru Zhang, Bo Peng, Cheng Hua, Haibing Guan

TL;DR
M3TR is a novel framework that combines temporal modeling of user interactions with a retrieval mechanism based on content and popularity patterns to improve micro-video popularity prediction accuracy.
Contribution
M3TR introduces a Mamba-Hawkes Process for fine-grained temporal modeling of user interactions and a retrieval system that matches videos on both content and temporal popularity trajectories, improving long-term prediction.
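To make the "mutually exciting and decaying" behaviour of likes, comments, and shares concrete, here is a minimal sketch of a classical multivariate Hawkes intensity with an exponential kernel. It is not the paper's Mamba-Hawkes Process, which is a learned model; the function name `hawkes_intensity`, the event types, and all parameter values (`mu`, `alpha`, `beta`) are illustrative assumptions.

```python
import math

# Hypothetical event types, indexed 0..2 (assumption for illustration).
EVENT_TYPES = ("like", "comment", "share")


def hawkes_intensity(t, history, mu, alpha, beta):
    """Classical exponential-kernel Hawkes intensity at time t.

    history: list of (event_time, type_index) pairs.
    mu:      base rate per event type.
    alpha:   alpha[k][j] is how strongly a past event of type j
             excites future events of type k.
    beta:    decay rate of the excitation (shared across types here).
    Returns one intensity value per event type.
    """
    rates = list(mu)
    for t_i, k_i in history:
        if t_i >= t:
            continue  # only strictly earlier events contribute
        decay = math.exp(-beta * (t - t_i))
        for k in range(len(rates)):
            rates[k] += alpha[k][k_i] * decay
    return rates


# Illustrative parameters, not taken from the paper.
mu = [0.1, 0.05, 0.02]
alpha = [
    [0.5, 0.2, 0.3],
    [0.1, 0.4, 0.2],
    [0.05, 0.1, 0.3],
]
beta = 1.0
history = [(0.0, 0), (0.5, 2)]  # a like at t=0.0, a share at t=0.5

just_after = hawkes_intensity(0.6, history, mu, alpha, beta)
much_later = hawkes_intensity(10.0, history, mu, alpha, beta)
for name, a, b in zip(EVENT_TYPES, just_after, much_later):
    print(f"{name}: {a:.4f} shortly after events, {b:.4f} much later")
```

Shortly after a burst of events, every intensity sits above its base rate; as time passes, the excitation decays back toward `mu`. The off-diagonal entries of `alpha` are what let one interaction type raise the rate of another.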
Findings
Outperforms previous methods by up to 19.3% in nMSE.
Effectively captures long-range dependencies in user interactions.
Achieves state-of-the-art results on real-world datasets.
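For readers unfamiliar with the metric, the sketch below shows one common definition of nMSE: squared error normalized by the spread of the ground-truth values, so that always predicting the mean scores 1.0. The paper's exact definition is not given on this page, so the formula is an assumption, as are the helper names `nmse` and `relative_improvement`. The sketch also assumes the 19.3% figure is a relative reduction in nMSE.

```python
def nmse(y_true, y_pred):
    """Normalized MSE: sum of squared errors over the total sum of squares.

    Assumed definition; lower is better, 1.0 matches a mean predictor.
    """
    n = len(y_true)
    if n == 0 or n != len(y_pred):
        raise ValueError("inputs must be non-empty and of equal length")
    mean = sum(y_true) / n
    sse = sum((a - b) ** 2 for a, b in zip(y_true, y_pred))
    sst = sum((a - mean) ** 2 for a in y_true)
    if sst == 0:
        raise ValueError("ground truth has zero variance")
    return sse / sst


def relative_improvement(baseline_nmse, new_nmse):
    """Fractional reduction in nMSE relative to a baseline."""
    return (baseline_nmse - new_nmse) / baseline_nmse


# Toy values for illustration only, not results from the paper.
y_true = [1.0, 2.0, 3.0]
print(nmse(y_true, [2.0, 2.0, 2.0]))  # mean predictor
print(nmse(y_true, y_true))           # perfect predictor
```

Under this reading, a method that lowers a baseline nMSE of 1.000 to 0.807 would be reported as a 19.3% improvement.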
Abstract
Accurately predicting the popularity of micro-videos is a critical but challenging task, characterized by volatile, 'rollercoaster-like' engagement dynamics. Existing methods often fail to capture these complex temporal patterns, leading to inaccurate long-term forecasts. This failure stems from two fundamental limitations: (1) a superficial understanding of user feedback dynamics, which overlooks the mutually exciting and decaying nature of interactions such as likes, comments, and shares; and (2) retrieval mechanisms that rely solely on static content similarity, ignoring the crucial patterns of how a video's popularity evolves over time. To address these limitations, we propose M3TR, a Temporal Retrieval enhanced Multi-Modal framework that uniquely synergizes fine-grained temporal modeling with a novel temporal-aware…
Taxonomy
Topics: Sentiment Analysis and Opinion Mining · Media Influence and Health · Human Mobility and Location-Based Analysis
Methods: Mamba: Linear-Time Sequence Modeling with Selective State Spaces
