WeaveTime: Stream from Earlier Frames into Emergent Memory in VideoLLMs

Yulin Zhang; Cheng Shi; Sibei Yang

arXiv:2602.22142·cs.CV·February 26, 2026

TL;DR

WeaveTime enhances Video-LLMs for streaming video analysis by teaching models to understand temporal order and focus dynamically on relevant past information, improving accuracy and efficiency in online scenarios.

Contribution

The paper introduces a model-agnostic framework for streaming Video-LLMs that combines a lightweight training objective with a dynamic focus mechanism to address Time-Agnosticism, the core limitation it diagnoses in current models.

Findings

01 Improves accuracy on streaming video benchmarks

02 Reduces latency in video processing

03 Enhances temporal reasoning in Video-LLMs

Abstract

Recent advances in Multimodal Large Language Models have greatly improved visual understanding and reasoning, yet their quadratic attention and offline training protocols make them ill-suited for streaming settings, where frames arrive sequentially and future observations are inaccessible. We diagnose a core limitation of current Video-LLMs, namely Time-Agnosticism, in which videos are treated as an unordered bag of evidence rather than a causally ordered sequence, yielding two failures in streams: temporal order ambiguity, in which the model cannot follow or reason over the correct chronological order, and past-current focus blindness, in which it fails to distinguish present observations from accumulated history. We present WeaveTime, a simple, efficient, and model-agnostic framework that first teaches order and then uses order. We introduce a lightweight Temporal Reconstruction…
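To make the "unordered bag of evidence" idea concrete, here is a toy sketch in Python. It is not taken from the paper and does not reproduce WeaveTime or its training objective. It only shows that an order-agnostic summary (a plain mean over frame features) cannot tell a sequence from its reverse, while a position-weighted summary can. The function names, the two-dimensional "frame features", and the position-weighting scheme are all hypothetical and chosen purely for illustration.

```python
from typing import List, Sequence

Frame = Sequence[float]


def bag_summary(frames: Sequence[Frame]) -> List[float]:
    """Order-agnostic summary: element-wise mean over all frames."""
    n = len(frames)
    dim = len(frames[0])
    return [sum(f[d] for f in frames) / n for d in range(dim)]


def ordered_summary(frames: Sequence[Frame]) -> List[float]:
    """Order-aware summary (hypothetical): weight each frame by its
    1-based position before averaging, so reordering changes the result."""
    n = len(frames)
    dim = len(frames[0])
    return [sum((t + 1) * f[d] for t, f in enumerate(frames)) / n
            for d in range(dim)]


# Toy "videos": the same two frames in opposite chronological order.
open_then_close = [[1.0, 0.0], [0.0, 1.0]]
close_then_open = [[0.0, 1.0], [1.0, 0.0]]

# The bag view collapses both orders to the same vector.
same_bag = bag_summary(open_then_close) == bag_summary(close_then_open)

# The position-weighted view keeps them apart.
same_ordered = (ordered_summary(open_then_close)
                == ordered_summary(close_then_open))

print(same_bag, same_ordered)
```

In this toy setting the two clips are indistinguishable under the bag summary and distinguishable under the position-weighted one, which is the kind of ambiguity the abstract calls temporal order ambiguity.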

Taxonomy

Topics: Multimodal Machine Learning Applications · Domain Adaptation and Few-Shot Learning · Generative Adversarial Networks and Image Synthesis