On the emergence of simplex symmetry in the final and penultimate layers of neural network classifiers

Weinan E; Stephan Wojtowytsch

arXiv:2012.05420 · cs.LG · June 7, 2021 · 6 cites

TL;DR

This paper studies the geometric symmetry of neural network classifiers. It explains analytically why the penultimate-layer features of each class collapse to a single point, with these points forming the vertices of a regular simplex, and it identifies conditions under which this symmetry emerges or fails.

Contribution

It explains analytically, in toy models for highly expressive deep networks, why simplex symmetry emerges, and it proves in complementary examples that the symmetry breaks down for shallow networks.

Findings

01

Penultimate-layer features collapse to one point per class, and these points sit at the vertices of a regular simplex in a high-dimensional Euclidean space.

02

Simplex symmetry is analytically explained in toy models of deep networks.

03

If the classifier is shallow, or if the deeper layers do not bring the data into a convenient geometric configuration, even the final output is provably not uniform over the samples of a class.

Abstract

A recent numerical study observed that neural network classifiers enjoy a large degree of symmetry in the penultimate layer. Namely, if $h(x) = A f(x) + b$ where $A$ is a linear map and $f$ is the output of the penultimate layer of the network (after activation), then all data points $x_{i,1}, \dots, x_{i,N_i}$ in a class $C_i$ are mapped to a single point $y_i$ by $f$ and the points $y_i$ are located at the vertices of a regular $(k-1)$-dimensional standard simplex in a high-dimensional Euclidean space. We explain this observation analytically in toy models for highly expressive deep neural networks. In complementary examples, we demonstrate rigorously that even the final output of the classifier $h$ is not uniform over data samples from a class $C_i$ if $h$ is a shallow network (or if the deeper layers do not bring the data samples into a convenient geometric configuration).
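
To make the geometry in the abstract concrete, the following is a minimal sketch (not taken from the paper) of what it means for the class points $y_i$ to sit at the vertices of a regular $(k-1)$-dimensional simplex. The function names, the choice of $k = 4$ classes, and the embedding dimension of 10 are all illustrative assumptions. The sketch places the points at standard basis vectors, then checks two signatures of a regular simplex: all pairwise distances are equal, and after centering, all pairwise cosines equal $-1/(k-1)$.

```python
import numpy as np


def simplex_vertices(k, dim):
    """Hypothetical helper: k points in R^dim at the vertices of a
    regular (k-1)-simplex, using the first k standard basis vectors."""
    if dim < k:
        raise ValueError("this construction needs dim >= k")
    vertices = np.zeros((k, dim))
    vertices[:, :k] = np.eye(k)
    return vertices


def pairwise_distances(points):
    """Euclidean distance between every pair of rows."""
    diffs = points[:, None, :] - points[None, :, :]
    return np.linalg.norm(diffs, axis=-1)


def centered_cosines(points):
    """Cosine of the angle between every pair of rows after
    subtracting the mean of all rows."""
    centered = points - points.mean(axis=0)
    unit = centered / np.linalg.norm(centered, axis=1, keepdims=True)
    return unit @ unit.T


# Illustrative values only: 4 classes embedded in a 10-dimensional space.
k, dim = 4, 10
y = simplex_vertices(k, dim)

dist = pairwise_distances(y)
cos = centered_cosines(y)
off_diag = ~np.eye(k, dtype=bool)  # mask selecting pairs i != j

print("pairwise distances:", np.unique(np.round(dist[off_diag], 6)))
print("centered cosines:  ", np.unique(np.round(cos[off_diag], 6)))
```

In a trained classifier one would replace `y` with the per-class means of the penultimate-layer features; the same two checks then measure how close those means are to a regular simplex.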

Taxonomy

Topics: Neural Networks and Applications