Efficient Representation Learning via Adaptive Context Pooling

Chen Huang; Walter Talbott; Navdeep Jaitly; Josh Susskind

arXiv:2207.01844 · cs.LG · July 6, 2022 · 1 citation

Open Access

TL;DR

This paper introduces ContextPool, an adaptive pooling method for attention models that learns to adjust context granularity, improving efficiency and performance in language and image tasks.

Contribution

We propose ContextPool, a novel adaptive pooling technique that enhances attention models by learning to adjust context size dynamically, reducing computational cost and improving expressiveness.

Findings

01. Achieves state-of-the-art performance with less compute.

02. Outperforms recent methods that use learned context sizes.

03. Applicable to both transformers and ConvNets.

Abstract

Self-attention mechanisms model long-range context by using pairwise attention between all input tokens. In doing so, they assume a fixed attention granularity defined by the individual tokens (e.g., text characters or image pixels), which may not be optimal for modeling complex dependencies at higher levels. In this paper, we propose ContextPool to address this problem by adapting the attention granularity for each token. Inspired by the success of ConvNets that are combined with pooling to capture long-range dependencies, we learn to pool neighboring features for each token before computing attention in a given attention layer. The pooling weights and support size are adaptively determined, allowing the pooled features to encode meaningful context with varying scale. We show that ContextPool makes attention models more expressive, achieving strong performance often with fewer layers…
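The pooling step described in the abstract can be illustrated with a small NumPy sketch. This is not the authors' implementation: the abstract does not say how the pooling weights or support sizes are parameterized, so the sketch assumes a sigmoid-gated weight per token and a Gaussian-shaped soft window whose width is a softplus of a learned score. All names here (context_pool, pooled_attention, w_pool, w_size) are hypothetical.

```python
import numpy as np

def softmax(z, axis=-1):
    # Numerically stable softmax.
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def context_pool(x, w_logit, s_raw):
    """Pool neighboring features for each token.

    x:       (n, d) token features
    w_logit: (n,)   raw per-token pooling weights (assumed parameterization)
    s_raw:   (n,)   raw per-token support sizes   (assumed parameterization)
    """
    n = x.shape[0]
    w = 1.0 / (1.0 + np.exp(-w_logit))      # pooling weight in (0, 1)
    s = np.log1p(np.exp(s_raw)) + 1e-3      # softplus keeps the support size positive
    pos = np.arange(n)
    dist2 = (pos[None, :] - pos[:, None]) ** 2
    # Row i is a soft window centered on token i with width s[i]
    # (Gaussian shape is an assumption of this sketch).
    mask = np.exp(-dist2 / (2.0 * s[:, None] ** 2))
    coeff = mask * w[None, :]
    coeff = coeff / coeff.sum(axis=1, keepdims=True)  # each row sums to 1
    return coeff @ x

def self_attention(x, Wq, Wk, Wv):
    # Standard single-head scaled dot-product attention.
    q, k, v = x @ Wq, x @ Wk, x @ Wv
    scores = q @ k.T / np.sqrt(q.shape[-1])
    return softmax(scores, axis=-1) @ v

def pooled_attention(x, params):
    # Predict pooling weights and support sizes from the features,
    # pool the context for each token, then attend over the pooled features.
    w_logit = x @ params["w_pool"]
    s_raw = x @ params["w_size"]
    pooled = context_pool(x, w_logit, s_raw)
    return self_attention(pooled, params["Wq"], params["Wk"], params["Wv"])
```

In this sketch a very small support size leaves each token's feature unchanged, while a larger one blends in more neighbors, which is one way to read "varying scale" in the abstract. Attention itself is untouched; only its input changes.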

Taxonomy

Topics: Topic Modeling · Machine Learning in Healthcare · Domain Adaptation and Few-Shot Learning