Transferable-guided Attention Is All You Need for Video Domain Adaptation
André Sacilotti, Samuel Felipe dos Santos, Nicu Sebe, Jurandy Almeida

TL;DR
This paper introduces TransferAttn, a transformer-based framework with a novel attention mechanism and a domain transferability module, significantly improving unsupervised video domain adaptation across multiple datasets and backbones.
Contribution
It proposes the TransferAttn framework and the DTAB module, which enhance transformer-based video UDA by focusing attention on spatio-temporal transferability, and reports that they outperform existing methods.
Findings
TransferAttn outperforms state-of-the-art methods on multiple datasets.
DTAB improves transferability when integrated with other transformer UDA models.
The approach is effective across various backbones and video datasets.
Abstract
Unsupervised domain adaptation (UDA) in videos is a challenging task that remains not well explored compared to image-based UDA techniques. Although vision transformers (ViT) achieve state-of-the-art performance in many computer vision tasks, their use in video UDA has been little explored. Our key idea is to use transformer layers as a feature encoder and incorporate spatial and temporal transferability relationships into the attention mechanism. A Transferable-guided Attention (TransferAttn) framework is then developed to exploit the capacity of the transformer to adapt cross-domain knowledge across different backbones. To improve the transferability of ViT, we introduce a novel and effective module, named Domain Transferable-guided Attention Block (DTAB). DTAB compels ViT to focus on the spatio-temporal transferability relationship among video frames by changing the self-attention…
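The abstract is cut off before it says how DTAB changes self-attention, so the following is only a generic sketch of the broader idea of letting a per-frame transferability score guide attention over video frames. It is not the paper's DTAB. Everything in it is an assumption made for illustration: the function names, the use of a hypothetical domain discriminator's output probability per frame, the choice of binary entropy as the transferability score, and the rule of multiplying attention weights by that score and renormalising.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def binary_entropy(p, eps=1e-12):
    """Entropy (in bits) of a Bernoulli(p) variable, clamped away from 0 and 1."""
    p = min(max(p, eps), 1.0 - eps)
    return -(p * math.log2(p) + (1.0 - p) * math.log2(1.0 - p))


def transferability_guided_attention(queries, keys, values, domain_probs):
    """Illustrative attention over T frames, reweighted by frame transferability.

    queries, keys, values: lists of T frame vectors (lists of floats).
    domain_probs: T probabilities from a HYPOTHETICAL domain discriminator
        that each frame comes from the source domain (assumption).
    A frame the discriminator cannot place (p near 0.5) has entropy near 1
    and is treated as highly transferable; a frame it places confidently
    has entropy near 0 and is down-weighted.
    """
    dim = len(keys[0])
    transfer = [binary_entropy(p) for p in domain_probs]
    outputs = []
    for q in queries:
        # Standard scaled dot-product scores against every frame.
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(dim) for k in keys]
        weights = softmax(scores)
        # Assumed guidance rule: scale by transferability, then renormalise.
        guided = [w * t for w, t in zip(weights, transfer)]
        norm = sum(guided)
        guided = [g / norm for g in guided]
        outputs.append(
            [sum(g * v[j] for g, v in zip(guided, values)) for j in range(len(values[0]))]
        )
    return outputs, transfer
```

With three frames where the second has a domain probability of 0.99, that frame receives a transferability score close to zero and contributes little to any output, while the other two frames dominate. A real model would learn the discriminator adversarially and operate on batched tensors; the list-based version here only shows the reweighting step.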
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning
Methods: Softmax · Attention Is All You Need · Focus
