StructInbet: Integrating Explicit Structural Guidance into Inbetween Frame Generation

Zhenglin Pan; Haoran Xie

arXiv:2507.13377 · cs.GR · July 21, 2025

Open Access

TL;DR

StructInbet is a novel inbetweening system that uses explicit structural guidance and temporal attention to produce controllable, consistent frame transitions in video or animation, reducing ambiguity and improving visual coherence.

Contribution

It introduces explicit structural guidance and a temporal attention mechanism to enhance controllability and consistency in inbetween frame generation.

Findings

01. Improved control over frame transitions.
02. Enhanced visual consistency across frames.
03. Reduced ambiguity in pixel trajectories.

Abstract

In this paper, we propose StructInbet, an inbetweening system designed to generate controllable transitions under explicit structural guidance. StructInbet makes two key contributions. First, we introduce explicit structural guidance into the inbetweening problem to reduce the ambiguity inherent in pixel trajectories. Second, we adopt a temporal attention mechanism that incorporates visual identity from both the preceding and succeeding keyframes, ensuring consistency in character appearance.
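
The abstract does not spell out how the temporal attention is built, so the following is only a minimal sketch of the general idea: features of an inbetween frame act as queries, and the features of the preceding and succeeding keyframes are concatenated to serve as keys and values. The function name `keyframe_attention`, the feature shapes, and the omission of learned projection matrices are all assumptions made for illustration, not the authors' architecture.

```python
import numpy as np

def softmax(x, axis=-1):
    # Subtract the row maximum for numerical stability.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def keyframe_attention(query, prev_feats, next_feats):
    """Hypothetical cross-attention from an inbetween frame to both keyframes.

    query:      (n_q, d) features of the frame being generated
    prev_feats: (n_p, d) features of the preceding keyframe
    next_feats: (n_n, d) features of the succeeding keyframe
    Learned query/key/value projections are omitted (assumption, for brevity).
    """
    d = query.shape[-1]
    # Both keyframes form one shared context, so each query token can draw
    # appearance information from either side of the transition.
    context = np.concatenate([prev_feats, next_feats], axis=0)
    scores = query @ context.T / np.sqrt(d)   # scaled dot-product scores
    weights = softmax(scores, axis=-1)        # each row sums to 1
    return weights @ context, weights

# Toy inputs: 4 query tokens, 6 tokens per keyframe, feature dimension 8.
rng = np.random.default_rng(0)
query = rng.normal(size=(4, 8))
prev_feats = rng.normal(size=(6, 8))
next_feats = rng.normal(size=(6, 8))
out, weights = keyframe_attention(query, prev_feats, next_feats)
```

In this sketch each output token is a weighted mix of tokens from both keyframes, which is one plausible way for a generated frame to inherit character appearance from the frames before and after it.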

Taxonomy

Topics: Generative Adversarial Networks and Image Synthesis · Multimodal Machine Learning Applications · Computer Graphics and Visualization Techniques