Capsule Networks as Generative Models
Alex B. Kiefer, Beren Millidge, Alexander Tschantz, Christopher L. Buckley

TL;DR
This paper derives a capsule routing algorithm from iterative inference under sparsity constraints and introduces an explicit probabilistic generative model for capsule networks based on transformer self-attention, with the aim of improving scene understanding and interpretability in visual recognition tasks.
Contribution
It develops an explicit probabilistic framework for capsule networks, integrating self-attention mechanisms and iterative inference under sparsity constraints.
Findings
Proposes a new capsule routing algorithm based on iterative inference.
Introduces a probabilistic model linking capsule networks with transformer self-attention.
Relates the model to a variant of predictive coding networks that uses von Mises-Fisher distributions.
Abstract
Capsule networks are a neural network architecture specialized for visual scene recognition. Features and pose information are extracted from a scene and then dynamically routed through a hierarchy of vector-valued nodes called 'capsules' to create an implicit scene graph, with the ultimate aim of learning vision directly as inverse graphics. Despite these intuitions, however, capsule networks are not formulated as explicit probabilistic generative models; moreover, the routing algorithms typically used are ad-hoc and primarily motivated by algorithmic intuition. In this paper, we derive an alternative capsule routing algorithm utilizing iterative inference under sparsity constraints. We then introduce an explicit probabilistic generative model for capsule networks based on the self-attention operation in transformer networks and show how it is related to a variant of predictive coding…
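To make the self-attention operation mentioned in the abstract concrete, the following is a minimal sketch of softmax attention over unit-norm vectors, written in plain Python. It is not the paper's routing algorithm or generative model: it only illustrates the standard transformer operation. The names `child`, `parents`, and `kappa` and the toy data are assumptions introduced for illustration. One general fact motivates the framing: for an equal-weight mixture of von Mises-Fisher components that share a single concentration `kappa`, the normalizing constants cancel, so the posterior over components for a unit vector is exactly a softmax over `kappa` times the dot products.

```python
import math


def normalize(v):
    """Scale a vector to unit Euclidean norm."""
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v]


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_weights(query, keys, kappa):
    """Softmax over kappa * <query, key>.

    With unit-norm keys, equal mixture weights, and a shared concentration
    kappa, these are also the posterior responsibilities of a von
    Mises-Fisher mixture (an illustrative reading, not the paper's model).
    """
    scores = [kappa * sum(q * k for q, k in zip(query, key)) for key in keys]
    return softmax(scores)


def attend(query, keys, values, kappa=1.0):
    """Return the attention weights and the weighted sum of the values."""
    w = attention_weights(query, keys, kappa)
    dim = len(values[0])
    out = [sum(w[i] * values[i][d] for i in range(len(values))) for d in range(dim)]
    return w, out


# Hypothetical toy data: three candidate "parent" directions and one "child".
parents = [normalize(v) for v in ([1.0, 0.0], [0.0, 1.0], [1.0, 1.0])]
child = normalize([0.9, 0.1])

weights, mixed = attend(child, parents, parents, kappa=5.0)
print(weights)  # one weight per parent, summing to 1
print(mixed)    # weighted combination of the parent vectors
```

Raising `kappa` in this sketch concentrates the weights on the best-aligned parent, which gives a rough intuition for how a sparsity pressure can sharpen an assignment of a lower-level vector to higher-level ones.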
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · Advanced Image and Video Retrieval Techniques · Image Retrieval and Classification Techniques
