From Layers to States: A State Space Model Perspective to Deep Neural Network Layer Dynamics

Qinshuo Liu; Weiqin Zhao; Wei Huang; Yanwen Fang; Lequan Yu; Guodong Li

arXiv:2502.10463·cs.LG·February 18, 2025

Open Access

TL;DR

This paper treats layer outputs as states of a continuous process and uses state space models (SSMs), in particular the selective variant S6, to aggregate layers in deep neural networks, with the aim of improving feature extraction and performance on vision tasks.

Contribution

The paper proposes a new layer aggregation method for very deep neural networks based on state space models: Selective State Space Model Layer Aggregation (S6LA).

Findings

01. S6LA improves image classification accuracy.

02. S6LA enhances object detection performance.

03. The approach effectively models long-range dependencies.

Abstract

The depth of neural networks is a critical factor for their capability, with deeper models often demonstrating superior performance. Motivated by this, significant efforts have been made to enhance layer aggregation, that is, reusing information from previous layers to better extract features at the current layer, in order to improve the representational power of deep neural networks. However, previous works have primarily addressed this problem from a discrete-state perspective, which is not suitable as the number of network layers grows. This paper treats the outputs from layers as states of a continuous process and leverages the state space model (SSM) to design the aggregation of layers in very deep neural networks. Moreover, inspired by its advances in modeling long sequences, the Selective State Space Model (S6) is employed to design a new module called Selective State Space…
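To make the "layers as states" idea concrete, the sketch below runs a generic diagonal linear SSM over the layer index, so that each layer's output is folded into a running state. It is an illustration only and not the paper's S6LA module: the function names `zoh_discretize` and `aggregate_layers`, the weight `w_delta`, and all shapes are assumptions. It uses the zero-order-hold discretization commonly paired with SSMs, and a step size that depends on the input as a simple stand-in for the "selective" mechanism.

```python
import numpy as np


def zoh_discretize(a_diag, b, delta):
    """Zero-order-hold discretization of h'(t) = A h(t) + B x(t), A diagonal.

    a_diag: (n,) diagonal of A (assumed nonzero, typically negative)
    b:      (n,) input vector B
    delta:  scalar step size
    """
    a_bar = np.exp(delta * a_diag)
    b_bar = (a_bar - 1.0) / a_diag * b
    return a_bar, b_bar


def aggregate_layers(layer_outputs, a_diag, b, w_delta):
    """Fold a sequence of layer outputs into one SSM state.

    layer_outputs: (num_layers, d) array, one feature vector per layer
    w_delta:       (d,) hypothetical weights that make the step size
                   input-dependent (a simplified "selective" step)
    Returns the final state of shape (d, n).
    """
    d = layer_outputs.shape[1]
    h = np.zeros((d, a_diag.shape[0]))
    for x_t in layer_outputs:
        # softplus keeps the step size positive
        delta_t = np.log1p(np.exp(x_t @ w_delta))
        a_bar, b_bar = zoh_discretize(a_diag, b, delta_t)
        # h_t = A_bar * h_{t-1} + B_bar * x_t, applied per feature channel
        h = a_bar * h + x_t[:, None] * b_bar
    return h


rng = np.random.default_rng(0)
outputs = rng.normal(size=(5, 4))        # 5 layers, 4 features each
state = aggregate_layers(outputs, -np.arange(1.0, 4.0), np.ones(3),
                         rng.normal(size=4))
print(state.shape)
```

Because the diagonal of `A` is negative and the step size is positive, each `a_bar` entry lies in (0, 1), so contributions from earlier layers decay smoothly rather than being dropped or stacked without bound.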

Taxonomy

Topics: Neural Networks and Applications