DPR-CAE: Capsule Autoencoder with Dynamic Part Representation for Image Parsing
Canqun Xiang, Zhennan Wang, Wenbin Zou, Chen Xu

TL;DR
DPR-CAE introduces a capsule autoencoder with dynamic part representation and an invariant module, improving hierarchical image parsing and object classification performance on MNIST datasets.
Contribution
The paper presents a novel capsule autoencoder with dynamic part representation and an invariant module, enhancing interpretability and performance in image parsing tasks.
Findings
Significant performance improvement on $rm$-MNIST and $rm$-Fashion-MNIST.
Effective hierarchical image parsing with interpretable part representations.
Enhanced object classification accuracy in unsupervised settings.
Abstract
Parsing an image into a hierarchy of objects, parts, and relations is important but challenging in many computer vision tasks. This paper proposes a simple and effective capsule autoencoder, called DPR-CAE, to address this issue. In our approach, the encoder parses the input into a set of part capsules, each comprising a pose, an intensity, and a dynamic vector. The decoder introduces a novel dynamic part representation (DPR) by combining the dynamic vector with a shared template bank. These part representations are then regulated by their corresponding capsules and composited into the final output in an interpretable way. In addition, an extra translation-invariant module is proposed to avoid directly learning the uncertain scene-part relationship in DPR-CAE, which lets the resulting method achieve a promising performance gain on $rm$-MNIST and $rm$-Fashion-MNIST.
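The decoder idea in the abstract (a per-part dynamic vector combined with a shared template bank, then regulated by pose and intensity) can be illustrated with a small NumPy sketch. This is not the authors' implementation: the softmax mixing, the integer-translation pose, the additive compositing, and the names `dynamic_part_representations` and `composite` are all assumptions made for illustration, since the abstract does not specify these details.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    z = x - x.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def dynamic_part_representations(dynamic_vectors, template_bank):
    """Mix a shared template bank into one template per part.

    dynamic_vectors: (num_parts, num_templates) raw scores (assumed shape).
    template_bank:   (num_templates, h, w) templates shared by all parts.
    Returns an array of shape (num_parts, h, w).
    """
    # Assumption: the dynamic vector acts as softmax mixing weights.
    weights = softmax(np.asarray(dynamic_vectors, dtype=float), axis=-1)
    return np.tensordot(weights, template_bank, axes=([1], [0]))

def composite(parts, intensities, shifts, canvas_size):
    """Paste each part at an integer (row, col) offset, scale it, and sum.

    Assumption: pose is reduced to a pure translation that keeps the part
    inside the canvas, and parts are combined by simple addition.
    """
    canvas = np.zeros((canvas_size, canvas_size))
    h, w = parts.shape[1:]
    for part, intensity, (r, c) in zip(parts, intensities, shifts):
        canvas[r:r + h, c:c + w] += intensity * part
    return canvas

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    bank = rng.random((4, 5, 5))       # 4 shared 5x5 templates
    dyn = rng.normal(size=(3, 4))      # 3 parts, one dynamic vector each
    parts = dynamic_part_representations(dyn, bank)
    image = composite(parts, [1.0, 0.5, 0.2], [(0, 0), (3, 3), (10, 10)], 16)
    print(parts.shape, image.shape)
```

Because every part draws on the same bank, the templates stay inspectable while each part can still vary per input through its dynamic vector; that is the interpretability argument the abstract makes, shown here only in toy form.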
Taxonomy
Topics: Advanced Neural Network Applications · Generative Adversarial Networks and Image Synthesis · Multimodal Machine Learning Applications
