TL;DR
This paper introduces a model that predicts how memory for videos decays over time by combining visual and semantic information, and demonstrates its effectiveness on a new dataset and benchmarks.
Contribution
The paper presents Memento10k, a dynamic video memorability dataset, and a novel decay model that estimates memory retention over arbitrary delays using multimodal information.
Findings
The model outperforms previous approaches by 12% on average.
It produces the first quantitative estimate of how a video decays in memory over time.
It combines visual and semantic information (textual captions) for improved predictions.
Abstract
A key capability of an intelligent system is deciding when events from past experience must be remembered and when they can be forgotten. Towards this goal, we develop a predictive model of human visual event memory and how those memories decay over time. We introduce Memento10k, a new, dynamic video memorability dataset containing human annotations at different viewing delays. Based on our findings we propose a new mathematical formulation of memorability decay, resulting in a model that is able to produce the first quantitative estimation of how a video decays in memory over time. In contrast with previous work, our model can predict the probability that a video will be remembered at an arbitrary delay. Importantly, our approach combines visual and semantic information (in the form of textual captions) to fully represent the meaning of events. Our experiments on two video memorability…
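The abstract does not state the decay formula itself, so the following is only a minimal sketch of what "predicting the probability that a video will be remembered at an arbitrary delay" could look like. It assumes, purely for illustration, that memorability changes linearly with the delay. The function name, the parameters `base_score`, `decay_rate`, and `base_delay`, and the numbers used are all hypothetical and are not taken from the paper.

```python
def recall_probability(base_score, decay_rate, delay, base_delay=80.0):
    """Estimate the probability of remembering a video at `delay`.

    Assumption (not from the paper): memorability changes linearly
    with the delay, starting from a score measured at `base_delay`.
    All parameter names and default values here are hypothetical.
    """
    # Shift the reference score by the decay accumulated since base_delay.
    p = base_score + decay_rate * (delay - base_delay)
    # Clamp so the result is always a valid probability.
    return min(1.0, max(0.0, p))


# Example: a video with score 0.9 at the reference delay, decaying slowly.
for delay in (80, 280, 5000):
    print(delay, recall_probability(0.9, -0.0005, delay))
```

Under this assumption, a video is described by two numbers (a reference score and a decay rate) rather than one, which is what allows a prediction at any delay instead of only at the delay used during annotation.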
