DMAligner: Enhancing Image Alignment via Diffusion Model Based View Synthesis

Xinglong Luo; Ao Luo; Zhengning Wang; Yueqi Yang; Chaoyu Feng; Lei Lei; Bing Zeng; Shuaicheng Liu

arXiv:2602.23022 · cs.CV · March 27, 2026

TL;DR

DMAligner is a diffusion-based view synthesis framework for image alignment. It addresses the occlusion and illumination problems that degrade traditional optical flow-based warping, outperforms those methods on the DSIA benchmark, and produces superior qualitative results on video datasets.

Contribution

The paper proposes a diffusion-based image alignment method, a Dynamics-aware Diffusion Training approach, and a new dataset, improving robustness and accuracy over classical flow-based techniques.

Findings

1. Outperforms traditional optical flow methods on DSIA benchmark
2. Effectively handles occlusions and illumination variations
3. Demonstrates superior qualitative results on video datasets

Abstract

Image alignment is a fundamental task in computer vision with broad applications. Existing methods predominantly employ optical flow-based image warping. However, this technique is susceptible to common challenges such as occlusions and illumination variations, leading to degraded alignment visual quality and compromised accuracy in downstream tasks. In this paper, we present DMAligner, a diffusion-based framework for image alignment through alignment-oriented view synthesis. DMAligner is crafted to tackle the challenges in image alignment from a new perspective, employing a generation-based solution that showcases strong capabilities and avoids the problems associated with flow-based image warping. Specifically, we propose a Dynamics-aware Diffusion Training approach for learning conditional image generation, synthesizing a novel view for image alignment. This incorporates a…

Taxonomy

Topics: Generative Adversarial Networks and Image Synthesis · Advanced Vision and Imaging · Multimodal Machine Learning Applications