MSPCaps: A Multi-Scale Patchify Capsule Network with Cross-Agreement Routing for Visual Recognition

Yudong Hu; Yueju Han; Rui Sun; Jinke Ren

arXiv:2508.16922 · cs.CV · January 29, 2026

TL;DR

MSPCaps introduces a multi-scale capsule network with cross-agreement routing that effectively captures diverse features and improves visual recognition accuracy over existing capsule methods.

Contribution

The paper proposes MSPCaps, integrating multi-scale feature extraction, patchify capsules, and cross-agreement routing for enhanced capsule network performance.

Findings

01. Outperforms baseline methods in classification accuracy.

02. Scalable from Tiny to Large models with superior robustness.

03. Effectively captures multi-scale features for better visual recognition.

Abstract

The Capsule Network (CapsNet) has demonstrated significant potential in visual recognition by capturing spatial relationships and part-whole hierarchies for learning equivariant feature representations. However, existing CapsNets and their variants often rely on a single high-level feature map, overlooking the rich complementary information in multi-scale features. Furthermore, conventional feature fusion strategies (e.g., addition and concatenation) struggle to reconcile multi-scale feature discrepancies, leading to suboptimal classification performance. To address these limitations, we propose the Multi-Scale Patchify Capsule Network (MSPCaps), a novel architecture that integrates multi-scale feature learning and efficient capsule routing. Specifically, MSPCaps consists of three key components: a Multi-Scale ResNet Backbone (MSRB), a Patchify Capsule Layer (PatchifyCaps), and Cross-Agreement…
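For readers unfamiliar with the two conventional fusion strategies the abstract names, the following is a minimal sketch of addition and concatenation applied to feature maps at two scales. It is not the MSPCaps method or any part of its code: the helper names (`upsample_nearest`, `fuse_add`, `fuse_concat`, `shape`), the nearest-neighbour upsampling step, and the toy inputs are all assumptions made for illustration, and feature maps are represented as plain nested lists shaped [channels][height][width].

```python
# Illustrative sketch only: conventional fusion of two feature maps.
# All names and values here are hypothetical, not taken from the paper.

def shape(fmap):
    """Return (channels, height, width) of a nested-list feature map."""
    return (len(fmap), len(fmap[0]), len(fmap[0][0]))

def upsample_nearest(fmap, factor):
    """Repeat every value `factor` times along height and width."""
    out = []
    for channel in fmap:
        rows = []
        for row in channel:
            wide = [v for v in row for _ in range(factor)]
            rows.extend(list(wide) for _ in range(factor))
        out.append(rows)
    return out

def fuse_add(a, b):
    """Element-wise addition; both maps must have identical shapes."""
    assert shape(a) == shape(b), "addition needs identical shapes"
    return [
        [[x + y for x, y in zip(row_a, row_b)] for row_a, row_b in zip(ch_a, ch_b)]
        for ch_a, ch_b in zip(a, b)
    ]

def fuse_concat(a, b):
    """Channel-wise concatenation; height and width must match."""
    assert shape(a)[1:] == shape(b)[1:], "concatenation needs equal H and W"
    return a + b

# Toy inputs: a fine 1x4x4 map and a coarse 1x2x2 map.
fine = [[[1.0] * 4 for _ in range(4)]]
coarse = [[[10.0, 20.0], [30.0, 40.0]]]

# The coarse map must be resized before either fusion is possible.
coarse_up = upsample_nearest(coarse, 2)

added = fuse_add(fine, coarse_up)
concatenated = fuse_concat(fine, coarse_up)

print(shape(added))         # (1, 4, 4)
print(added[0][0])          # [11.0, 11.0, 21.0, 21.0]
print(shape(concatenated))  # (2, 4, 4)
```

In this sketch both operations combine the scales with fixed, unlearned rules: addition mixes values position by position, and concatenation only stacks channels. Neither weighs one scale against another, which is one way to read the abstract's remark that such strategies struggle to reconcile multi-scale discrepancies.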
