Information-Theoretic Greedy Layer-wise Training for Traffic Sign Recognition

Shuyan Lyu; Zhanzimo Wu; Junliang Du

arXiv:2510.27651·cs.LG·November 3, 2025

TL;DR

This paper introduces an information-theoretic layer-wise training method for deep CNNs that works without backpropagation, improving training efficiency and performance on traffic sign recognition.

Contribution

It proposes a new layer-wise training approach based on the deterministic information bottleneck and Rényi entropy, validated on CIFAR datasets and traffic sign recognition.
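To make the two named ingredients concrete, here is a minimal sketch, not the authors' implementation. It assumes the commonly used matrix-based estimator of Rényi's α-order entropy (eigenvalues of a trace-normalized Gaussian Gram matrix) and the deterministic information bottleneck form H(T) − β·I(T;Y). The function names, the kernel width `sigma`, and the defaults `alpha=2.0` and `beta=1.0` are all illustrative assumptions; the summary above does not state how the paper combines these quantities.

```python
import numpy as np

def gram_matrix(x, sigma=1.0):
    """Gaussian-kernel Gram matrix of the rows of x, normalized to unit trace."""
    sq = np.sum(x**2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * x @ x.T, 0.0)
    k = np.exp(-d2 / (2.0 * sigma**2))
    return k / np.trace(k)

def renyi_entropy(a, alpha=2.0):
    """Matrix-based entropy in bits: log2(sum_i lambda_i^alpha) / (1 - alpha)."""
    if alpha == 1.0:
        raise ValueError("alpha must differ from 1 in this sketch")
    eig = np.clip(np.linalg.eigvalsh(a), 0.0, None)  # drop tiny negative noise
    return np.log2(np.sum(eig**alpha)) / (1.0 - alpha)

def joint_entropy(a, b, alpha=2.0):
    """Joint entropy from the trace-normalized Hadamard product of a and b."""
    h = a * b
    return renyi_entropy(h / np.trace(h), alpha)

def mutual_information(a, b, alpha=2.0):
    """I(A;B) = S(A) + S(B) - S(A,B) under the matrix-based estimator."""
    return (renyi_entropy(a, alpha) + renyi_entropy(b, alpha)
            - joint_entropy(a, b, alpha))

def dib_objective(t, y_onehot, beta=1.0, alpha=2.0, sigma=1.0):
    """Assumed layer objective: H(T) - beta * I(T;Y), to be minimized."""
    a_t = gram_matrix(t, sigma)
    a_y = gram_matrix(y_onehot, sigma)
    return renyi_entropy(a_t, alpha) - beta * mutual_information(a_t, a_y, alpha)
```

Under this estimator, a batch of identical activations has zero entropy, while a batch of n well-separated activations approaches log2(n) bits, which gives a quick sanity check on the kernel width.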

Findings

01

Layer-wise training converges from bottom to top following an information bottleneck.

02

The proposed method outperforms existing layer-wise approaches.

03

The proposed method achieves performance comparable to standard SGD training.

Abstract

Modern deep neural networks (DNNs) are typically trained with a global cross-entropy loss in a supervised end-to-end manner: neurons need to store their outgoing weights, and training alternates between a forward pass (computation) and a top-down backward pass (learning), which is biologically implausible. Alternatively, greedy layer-wise training eliminates the need for cross-entropy loss and backpropagation. By avoiding the computation of intermediate gradients and the storage of intermediate outputs, it reduces memory usage and helps mitigate issues such as vanishing or exploding gradients. However, most existing layer-wise training approaches have been evaluated only on relatively small datasets with simple deep architectures. In this paper, we first systematically analyze the training dynamics of popular convolutional neural networks (CNNs) trained by stochastic gradient descent (SGD)…
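The greedy layer-wise idea described in the abstract can be illustrated with a small sketch. It is not the paper's method: the local objective here is a plain squared-error readout to one-hot labels, swapped in as a stand-in for the information-theoretic objective, and the names `train_layer` and `greedy_train`, the tanh layers, and all hyperparameters are hypothetical. The point it shows is structural: each layer is fitted using only its own input and a local loss, then frozen, so no gradient crosses layer boundaries.

```python
import numpy as np

def train_layer(x, y_onehot, width, steps=200, lr=0.1, seed=0):
    """Fit one tanh layer with a local least-squares readout (stand-in loss)."""
    rng = np.random.default_rng(seed)
    w = rng.normal(scale=1.0 / np.sqrt(x.shape[1]), size=(x.shape[1], width))
    r = np.zeros((width, y_onehot.shape[1]))  # local auxiliary readout
    n = x.shape[0]
    for _ in range(steps):
        h = np.tanh(x @ w)
        err = h @ r - y_onehot                 # local prediction error
        grad_r = h.T @ err / n
        grad_h = err @ r.T
        grad_w = x.T @ (grad_h * (1.0 - h**2)) / n  # tanh'(z) = 1 - tanh(z)^2
        r -= lr * grad_r
        w -= lr * grad_w
    return w  # the readout r is discarded once the layer is trained

def greedy_train(x, y_onehot, widths):
    """Train layers bottom-up; each frozen layer's output feeds the next."""
    weights, h = [], x
    for i, width in enumerate(widths):
        w = train_layer(h, y_onehot, width, seed=i)
        weights.append(w)
        h = np.tanh(h @ w)  # frozen forward pass, no gradient flows back
    return weights, h
```

Because only one layer is optimized at a time, the activations of earlier layers never need to be kept for a backward pass, which is the memory saving the abstract refers to.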

Taxonomy

Topics: Advanced Neural Network Applications · Adversarial Robustness in Machine Learning · Generative Adversarial Networks and Image Synthesis