Framer: Interactive Frame Interpolation
Wen Wang, Qiuyu Wang, Kecheng Zheng, Hao Ouyang, Zhekai Chen, Biao Gong, Hao Chen, Yujun Shen, Chunhua Shen

TL;DR
Framer is an interactive frame interpolation system that allows users to customize transitions between images through keypoints, combining human control with automatic estimation for versatile applications like video and cartoon interpolation.
Contribution
We introduce Framer, a novel interactive framework for frame interpolation that incorporates user-defined keypoints and automatic estimation to enhance control and handle challenging cases.
Findings
Effective in image morphing and video generation
Handles objects with different shapes and styles
Offers both manual and automatic keypoint modes
Abstract
We propose Framer for interactive frame interpolation, which targets producing smoothly transitioning frames between two images as per user creativity. Concretely, besides taking the start and end frames as inputs, our approach supports customizing the transition process by tailoring the trajectory of some selected keypoints. Such a design enjoys two clear benefits. First, incorporating human interaction mitigates the issue arising from numerous possibilities of transforming one image to another, and in turn enables finer control of local motions. Second, as the most basic form of interaction, keypoints help establish the correspondence across frames, enhancing the model to handle challenging cases (e.g., objects on the start and end frames are of different shapes and styles). It is noteworthy that our system also offers an "autopilot" mode, where we introduce a module to estimate the…
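As a rough illustration of what "tailoring the trajectory of some selected keypoints" can look like as input data, the sketch below builds one track per keypoint by moving it in a straight line from its position in the start frame to its matched position in the end frame. This is not Framer's implementation: the function name `interpolate_trajectories`, the `Point` alias, and the straight-line assumption are all hypothetical choices made for illustration. A user in interactive mode would presumably edit such tracks by hand.

```python
from typing import List, Tuple

# Hypothetical alias: an (x, y) pixel coordinate.
Point = Tuple[float, float]


def interpolate_trajectories(
    start_pts: List[Point],
    end_pts: List[Point],
    num_frames: int,
) -> List[List[Point]]:
    """Return one straight-line track per keypoint.

    start_pts[i] and end_pts[i] are assumed to be the same keypoint
    located in the start and end frames. Each track has num_frames
    positions, including both endpoints.
    """
    if len(start_pts) != len(end_pts):
        raise ValueError("start_pts and end_pts must be matched one-to-one")
    if num_frames < 2:
        raise ValueError("num_frames must be at least 2")

    trajectories: List[List[Point]] = []
    for (x0, y0), (x1, y1) in zip(start_pts, end_pts):
        track: List[Point] = []
        for t in range(num_frames):
            alpha = t / (num_frames - 1)  # 0.0 at start frame, 1.0 at end frame
            track.append((x0 + alpha * (x1 - x0), y0 + alpha * (y1 - y0)))
        trajectories.append(track)
    return trajectories


if __name__ == "__main__":
    tracks = interpolate_trajectories([(0.0, 0.0)], [(4.0, 8.0)], num_frames=5)
    print(tracks[0])
```

Straight-line motion is only a default starting point. The abstract's argument is that many different transitions are plausible between two images, so the value of user-drawn trajectories lies in replacing this kind of default with the intended motion.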
Peer Reviews
Decision·ICLR 2025 Poster
The paper is mostly well written and easy to follow. The novelty here lies in the control branch for guiding the interpolation process, which allows fine-grained, user-friendly control. The authors evaluate several existing methods on a variety of input types (real-world, sketch, cartoon, etc.), both qualitatively and quantitatively, and produce high-quality results. Also included is a technique to produce the control signals automatically using bidirectional co
- There is no mention of inference speed, training speed or overall training methodology (end-to-end, multi stage, pretraining, etc)
- There appears to be no underlying camera model for handling the "novel view synthesis" examples. Having such a model could potentially improve quality while providing another user-control.
- The paper isn't very easy to follow specifically around the trajectory update for autopilot mode section.
1) The method is straightforward but effective, and can be used for many applications. 2) Most results shown in this paper are impressive. 3) Ablation experiments are sufficient to evaluate the method.
1) The components of Framer include trajectory preprocessing, trajectory control, and interpolation. Although the method is effective, the novelty is limited. Trajectory control is designed by following DragNUWA, and the interpolation is mainly based on SVD, which determines the performance of this method. 2) The examples shown in this paper are a bit simple. The motion and the difference between two key frames (first and last frames) are not complex. More complex scenarios, as cases shown in t
The method’s introduction of keypoint trajectories addresses motion ambiguity, leveraging the stable video diffusion model to improve interpolation fidelity. The qualitative outcomes are notably strong.
1. **Related Work**: The paper does not sufficiently discuss previous work tackling motion ambiguity and interactive frame interpolation, such as [1] and [2], which address motion/velocity ambiguities. [2] is also an early work in interactive VFI. A more comprehensive discussion would clarify the novelty and contributions of *Framer*. 2. **Real-World Application**: The practical applicability of this method is unclear, especially given that real-world videos typically run at frame rates like 24–
Taxonomy
Topics: Human Motion and Animation
