Video Reconstruction using Diffusion-based Image-to-Video Generation with Trajectory Guidance
Stelio Bompai, Ioannis Kontopoulos, Giannis Spiliopoulos, Dimitris Zissis, Konstantinos Tserpes

TL;DR
This paper introduces a diffusion-based method for reconstructing missing drone video frames using GPS trajectory guidance, achieving realistic and trajectory-adherent results without domain-specific fine-tuning.
Contribution
It presents a novel pipeline that converts GPS telemetry and a reference frame into a trajectory-guided video sequence using a pre-trained diffusion model, eliminating the need for fine-tuning.
Findings
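SG-I2V produces the most natural-looking frames of all methods compared (BRISQUE 25.52, closest to the ground-truth score of 23.64).
It achieves better motion realism and trajectory adherence than the baselines.
It outperforms the optical flow extrapolation and RIFE interpolation baselines on key metrics.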
Abstract
This paper addresses the problem of reconstructing missing or dropped frames in top-down drone video of autonomous surface vehicles performing structured maritime manoeuvres. We propose a pipeline that converts raw GPS telemetry and a single reference frame into a trajectory-guided video sequence using a pre-trained image-to-video diffusion model, requiring no domain-specific fine-tuning. GPS coordinates from onboard telemetry logs are projected into image space via an equirectangular mapping, producing per-vessel motion cues that condition the SG-I2V diffusion model. The generated frames are evaluated against ground-truth video using perceptual, temporal and trajectory-based metrics, and benchmarked against optical flow extrapolation and RIFE interpolation baselines. SG-I2V produces the most naturally appearing frames among all methods (BRISQUE 25.52, closest to ground-truth 23.64),…
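The GPS-to-image projection mentioned in the abstract can be illustrated with a minimal sketch of an equirectangular (local flat-Earth) mapping. This is not the authors' code: the function names, the reference point at the image centre, the north-up camera orientation, and the `metres_per_pixel` ground resolution are all assumptions made for illustration.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius in metres


def gps_to_pixel(lat, lon, ref_lat, ref_lon, metres_per_pixel, width, height):
    """Map a GPS fix to pixel coordinates in a top-down image.

    Assumptions (hypothetical, not taken from the paper): the image centre
    corresponds to (ref_lat, ref_lon), the image is north-up, and the ground
    resolution is uniform across the frame.
    """
    # Equirectangular approximation: longitude differences are scaled by the
    # cosine of the reference latitude; latitude differences are not.
    east_m = math.radians(lon - ref_lon) * EARTH_RADIUS_M * math.cos(math.radians(ref_lat))
    north_m = math.radians(lat - ref_lat) * EARTH_RADIUS_M

    # Image x grows eastward, image y grows downward (southward).
    x = width / 2 + east_m / metres_per_pixel
    y = height / 2 - north_m / metres_per_pixel
    return x, y


def track_to_pixels(track, ref_lat, ref_lon, metres_per_pixel, width, height):
    """Convert a list of (lat, lon) fixes into a list of (x, y) pixel points."""
    return [
        gps_to_pixel(lat, lon, ref_lat, ref_lon, metres_per_pixel, width, height)
        for lat, lon in track
    ]
```

A per-vessel pixel track produced this way is the kind of motion cue that could condition a trajectory-guided image-to-video model; how the paper calibrates the mapping to the drone camera is not stated in the text above.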
