Video Reconstruction using Diffusion-based Image-to-Video Generation with Trajectory Guidance

Stelio Bompai; Ioannis Kontopoulos; Giannis Spiliopoulos; Dimitris Zissis; Konstantinos Tserpes

arXiv:2605.16420 · cs.CV · May 19, 2026

TL;DR

This paper introduces a diffusion-based method for reconstructing missing drone video frames using GPS trajectory guidance, achieving realistic and trajectory-adherent results without domain-specific fine-tuning.

Contribution

It presents a novel pipeline that converts GPS telemetry and a reference frame into a trajectory-guided video sequence using a pre-trained diffusion model, eliminating the need for fine-tuning.

Findings

01. SG-I2V produces the most natural-looking frames of all methods compared (BRISQUE 25.52, against 23.64 for the ground-truth video).

02. The method achieves better motion realism and trajectory adherence.

03. It outperforms optical flow and RIFE interpolation in key metrics.

Abstract

This paper addresses the problem of reconstructing missing or dropped frames in top-down drone video of autonomous surface vehicles performing structured maritime manoeuvres. We propose a pipeline that converts raw GPS telemetry and a single reference frame into a trajectory-guided video sequence using a pre-trained image-to-video diffusion model, requiring no domain-specific fine-tuning. GPS coordinates from onboard telemetry logs are projected into image space via an equirectangular mapping, producing per-vessel motion cues that condition the SG-I2V diffusion model. The generated frames are evaluated against ground-truth video using perceptual, temporal and trajectory-based metrics, and benchmarked against optical flow extrapolation and RIFE interpolation baselines. SG-I2V produces the most naturally appearing frames among all methods (BRISQUE 25.52, closest to ground-truth 23.64),…
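
The abstract says that GPS coordinates are projected into image space via an equirectangular mapping. The snippet below is a minimal sketch of what such a projection could look like, not the authors' implementation. It assumes a nadir-pointing (top-down) camera with north at the top of the frame, a known GPS position for the image centre, and a constant ground sampling distance; the function name `gps_to_pixel` and all of its parameters are hypothetical.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius in metres


def gps_to_pixel(lat, lon, ref_lat, ref_lon, metres_per_pixel, image_size):
    """Map a GPS fix to pixel coordinates with an equirectangular approximation.

    Assumptions (not taken from the paper): the camera looks straight down,
    north is at the top of the image, (ref_lat, ref_lon) is the GPS position
    of the image centre, and metres_per_pixel is constant across the frame.
    """
    width, height = image_size

    # Equirectangular approximation: treat the local patch of the Earth as flat.
    # Longitude differences shrink by cos(latitude); latitude differences do not.
    east_m = (math.radians(lon - ref_lon)
              * EARTH_RADIUS_M * math.cos(math.radians(ref_lat)))
    north_m = math.radians(lat - ref_lat) * EARTH_RADIUS_M

    # Image x grows to the east; image y grows downward, so north is negative y.
    u = width / 2 + east_m / metres_per_pixel
    v = height / 2 - north_m / metres_per_pixel
    return u, v


def track_to_pixels(track, ref_lat, ref_lon, metres_per_pixel, image_size):
    """Project a list of (lat, lon) fixes into a per-vessel pixel trajectory."""
    return [
        gps_to_pixel(lat, lon, ref_lat, ref_lon, metres_per_pixel, image_size)
        for lat, lon in track
    ]
```

Under these assumptions, a fix at the reference position lands on the image centre, and a vessel moving north traces a path toward the top of the frame. A real pipeline would also need to account for drone heading, altitude changes, and lens distortion, none of which are modelled here.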
