Efficient Multi-Purpose Cross-Attention Based Image Alignment Block for Edge Devices
Bahri Batuhan Bilecen, Alparslan Fisne, Mustafa Ayazoglu

TL;DR
This paper introduces XABA, an efficient cross-attention-based image alignment block designed for edge devices, enabling real-time performance and improved multi-image super-resolution accuracy.
Contribution
The paper presents a novel pyramidal cross-attention scheme for image alignment that is both efficient and suitable for real-time edge device applications.
Findings
Achieves over 20 FPS on NVIDIA Jetson Xavier
Reduces memory and computational requirements
Improves super-resolution network performance
Abstract
Image alignment, also known as image registration, is a critical block used in many computer vision problems. One of the key factors in alignment is efficiency, as inefficient aligners can cause significant overhead to the overall problem. In the literature, there are some blocks that appear to do the alignment operation, although most do not focus on efficiency. Therefore, an image alignment block which can both work in time and/or space and can work on edge devices would be beneficial for almost all networks dealing with multiple images. Given its wide usage and importance, we propose an efficient, cross-attention-based, multi-purpose image alignment block (XABA) suitable to work within edge devices. Using cross-attention, we exploit the relationships between features extracted from images. To make cross-attention feasible for real-time image alignment problems and handle large…
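To make the cross-attention idea in the abstract concrete, here is a minimal single-scale sketch in NumPy. It is not the authors' XABA block: the pyramidal scheme, any learned projections, and the efficiency measures are omitted, and the function name `cross_attention_align` and the toy feature maps are assumptions made for illustration. Queries come from the reference image's features, while keys and values come from the image to be aligned.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def cross_attention_align(ref_feat, src_feat):
    """Generic scaled dot-product cross-attention (illustrative, not XABA).

    ref_feat: (H, W, C) features of the reference image (queries).
    src_feat: (H, W, C) features of the image to align (keys and values).
    Returns the re-sampled source features (H, W, C) and the
    attention matrix of shape (H*W, H*W).
    """
    h, w, c = ref_feat.shape
    q = ref_feat.reshape(-1, c)          # one query per reference position
    k = src_feat.reshape(-1, c)          # one key per source position
    v = k                                # values are the source features themselves
    scores = q @ k.T / np.sqrt(c)        # similarity of every reference/source pair
    attn = softmax(scores, axis=-1)      # each row sums to 1
    aligned = (attn @ v).reshape(h, w, c)
    return aligned, attn


# Toy example (assumed data): each position carries a distinct one-hot feature,
# and the source is the reference shifted by one pixel along the width.
ref = 20.0 * np.eye(16).reshape(4, 4, 16)
src = np.roll(ref, 1, axis=1)
aligned, attn = cross_attention_align(ref, src)
print(np.allclose(aligned, ref, atol=1e-6))
```

In this toy case every reference position attends almost entirely to the source position that holds the same feature, so the shift is undone. The attention matrix grows quadratically with the number of positions, which is why a full-resolution version like this one is costly on large images.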
Taxonomy
Topics: Robotics and Sensor-Based Localization · Advanced Vision and Imaging · Advanced Image and Video Retrieval Techniques
