FAN: Focused Attention Networks
Chu Wang, Babak Samari, Vladimir Kim, Siddhartha Chaudhuri, Kaleem Siddiqi

TL;DR
This paper introduces a focused attention mechanism with a novel loss function that improves the learning of attention weights, leading to better relation recovery and enhanced performance in vision and language tasks.
Contribution
The paper proposes a new focused attention module with a center-mass cross entropy loss to improve attention weight learning across tasks.
Findings
Improved attention distribution across meaningful entities.
Enhanced feature aggregation from attended entities.
State-of-the-art relation recovery in a proposal task.
Abstract
Attention networks show promise for both vision and language tasks, by emphasizing relationships between constituent elements through weighting functions. Such elements could be regions in an image output by a region proposal network, or words in a sentence, represented by word embedding. Thus far the learning of attention weights has been driven solely by the minimization of task specific loss functions. We introduce a method for learning attention weights to better emphasize informative pair-wise relations between entities. The key component is a novel center-mass cross entropy loss, which can be applied in conjunction with the task specific ones. We further introduce a focused attention backbone to learn these attention weights for general tasks. We demonstrate that the focused supervision leads to improved attention distribution across meaningful entities, and that it enhances the…
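Illustrative sketch
The abstract names a center-mass cross entropy loss but does not give its formula here. The Python sketch below shows one plausible reading, not the authors' exact formulation: it assumes the "center mass" is the share of a query entity's attention that lands on its ground-truth related entities, and penalizes the negative log of that share. The names `softmax`, `center_mass_loss`, `scores`, `related`, and `eps` are hypothetical and chosen only for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def center_mass_loss(scores, related, eps=1e-12):
    """Hypothetical auxiliary loss for a single query entity.

    scores:  raw attention logits from the query to every entity
    related: 0/1 flags marking which entities are ground-truth related
    """
    weights = softmax(scores)
    # Assumed definition: center mass = attention mass on related entities.
    mass = sum(w for w, r in zip(weights, related) if r)
    # Cross-entropy-style penalty: small when attention is focused on
    # related entities, large when it leaks onto unrelated ones.
    return -math.log(mass + eps)


# Attention concentrated on the two related entities vs. spread uniformly.
focused = center_mass_loss([4.0, 4.0, 0.0, 0.0], [1, 1, 0, 0])
diffuse = center_mass_loss([0.0, 0.0, 0.0, 0.0], [1, 1, 0, 0])
```

Under this assumed definition, `focused` is smaller than `diffuse`, which matches the abstract's stated goal of concentrating attention on informative pair-wise relations. In training, a term like this would be added to the task-specific loss rather than replace it.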
Taxonomy
Topics: Multimodal Machine Learning Applications · Domain Adaptation and Few-Shot Learning · Advanced Neural Network Applications
