# Teaching Compositionality to CNNs

**Authors:** Austin Stone, Huayan Wang, Michael Stark, Yi Liu, D. Scott Phoenix, Dileep George

arXiv: 1706.04313 · 2017-06-15

## TL;DR

This paper introduces a training method for CNNs that promotes the learning of compositional, disentangled features, leading to improved object recognition performance and better generalization.

## Contribution

The paper proposes a novel training approach, agnostic to the underlying CNN architecture, that encourages the network to form localized, compositional features for visual recognition.

## Key findings

- Learned features are more localized and disentangled.
- Improved accuracy over non-compositional baselines.
- Enhanced generalization in object recognition tasks.

## Abstract

Convolutional neural networks (CNNs) have shown great success in computer vision, approaching human-level performance when trained for specific tasks via application-specific loss functions. In this paper, we propose a method for augmenting and training CNNs so that their learned features are compositional. It encourages networks to form representations that disentangle objects from their surroundings and from each other, thereby promoting better generalization. Our method is agnostic to the specific details of the underlying CNN to which it is applied and can in principle be used with any CNN. As we show in our experiments, the learned representations lead to feature activations that are more localized and improve performance over non-compositional baselines in object recognition tasks.
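
The abstract does not spell out the training objective, so the following is only a toy illustration of what "compositional" features could mean, not the authors' implementation. It assumes one plausible reading: the features of an object shown in isolation (a masked input) should match the features of the full scene restricted to the same mask. The function names `conv1d_same` and `compositionality_gap`, the 1D signal, and the fixed kernels are all hypothetical stand-ins for a real CNN and real images.

```python
def conv1d_same(x, kernel):
    """Zero-padded 1D cross-correlation; assumes an odd-length kernel."""
    half = len(kernel) // 2
    padded = [0.0] * half + list(x) + [0.0] * half
    return [
        sum(k * padded[i + j] for j, k in enumerate(kernel))
        for i in range(len(x))
    ]


def compositionality_gap(x, mask, kernel):
    """Squared distance between features of the masked input and
    masked features of the full input (an illustrative measure only)."""
    features_of_masked_input = conv1d_same(
        [xi * mi for xi, mi in zip(x, mask)], kernel
    )
    masked_features = [
        fi * mi for fi, mi in zip(conv1d_same(x, kernel), mask)
    ]
    return sum(
        (a - b) ** 2
        for a, b in zip(features_of_masked_input, masked_features)
    )


signal = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
object_mask = [0.0, 0.0, 1.0, 1.0, 0.0, 0.0]  # the "object" occupies positions 2-3

pointwise_gap = compositionality_gap(signal, object_mask, [1.0])
wide_gap = compositionality_gap(signal, object_mask, [1.0, 1.0, 1.0])
```

In this toy setting a pointwise kernel gives a gap of zero, because each feature depends only on its own input position, while the width-3 kernel gives a positive gap, because features near the mask boundary mix the object with its surroundings. A training penalty built on such a gap would push features toward the localized behavior the abstract describes, under the assumptions stated above.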

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1706.04313/full.md

## Figures

20 figures with captions in the complete paper: https://tomesphere.com/paper/1706.04313/full.md

## References

49 references — full list in the complete paper: https://tomesphere.com/paper/1706.04313/full.md

---
Source: https://tomesphere.com/paper/1706.04313