ERVD: An Efficient and Robust ViT-Based Distillation Framework for Remote Sensing Image Retrieval
Le Dong, Qixuan Cao, Lei Pu, Fangfang Wu, Weisheng Dong, Xin Li, Guangming Shi

TL;DR
This paper introduces ERVD, a Vision Transformer (ViT)-based distillation framework for remote sensing image retrieval that improves both retrieval accuracy and computational efficiency over existing methods.
Contribution
The paper proposes a new ViT-based distillation framework specifically designed for remote sensing image retrieval, enhancing robustness and efficiency.
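This summary does not describe how ERVD's distillation objective is defined, so the following is only a minimal sketch of generic temperature-scaled knowledge distillation, in which a small student model is trained to match a larger teacher's softened outputs. It is not the paper's loss. The function names `softmax` and `distillation_loss`, the default temperature, and the example logits are all assumptions made for illustration.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of logits, softened by a temperature."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max so exp() cannot overflow
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL(teacher || student) between temperature-softened distributions.

    Generic soft-target distillation, NOT the ERVD objective (assumption).
    The T^2 factor keeps gradient magnitudes comparable across temperatures.
    """
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    kl = sum(
        pt * (math.log(pt) - math.log(ps))
        for pt, ps in zip(p_teacher, p_student)
        if pt > 0.0
    )
    return kl * temperature ** 2


# Hypothetical logits for a three-class toy problem.
teacher = [1.0, 2.0, 3.0]
matching_student = [1.0, 2.0, 3.0]
disagreeing_student = [3.0, 2.0, 1.0]

print(distillation_loss(matching_student, teacher))
print(distillation_loss(disagreeing_student, teacher))
```

The loss is zero when the student reproduces the teacher's distribution and grows as the two diverge. In a retrieval setting the same idea is often applied to embeddings or similarity scores rather than class logits, but which variant ERVD uses cannot be determined from this page.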
Findings
Achieves higher retrieval accuracy compared to baseline methods
Reduces computational complexity in image retrieval tasks
Demonstrates robustness across diverse remote sensing datasets
Taxonomy
Topics: Image Retrieval and Classification Techniques · Advanced Image and Video Retrieval Techniques · Data Management and Algorithms
Methods: Linear Layer · Softmax · Layer Normalization · Residual Connection · Attention Is All You Need · Dense Connections · Multi-Head Attention · Vision Transformer
