Neuron Block Dynamics for XOR Classification with Zero-Margin
Guillaume Braun, Masaaki Imaizumi

TL;DR
This paper analyzes the training dynamics of neural networks on zero-margin XOR classification with Gaussian inputs. It shows that neurons cluster into four directions and that block-level signals evolve coherently, two properties that are central to understanding generalization without margin assumptions.
Contribution
The paper introduces a block-level framework for analyzing neuron dynamics in Gaussian XOR classification. It extends Glasgow's (2024) analysis from discrete to Gaussian inputs and highlights the role of neuron clustering.
Findings
Neurons cluster into four directions during training.
Block-level signals evolve coherently, a phenomenon the analysis treats as essential in the Gaussian setting.
Numerical experiments confirm two-phase block dynamics and robustness beyond Gaussian inputs.
Abstract
The ability of neural networks to learn useful features through stochastic gradient descent (SGD) is a cornerstone of their success. Most theoretical analyses focus on regression or on classification tasks with a positive margin, where worst-case gradient bounds suffice. In contrast, we study zero-margin nonlinear classification by analyzing the Gaussian XOR problem, where inputs are Gaussian and the XOR decision boundary determines labels. In this setting, a non-negligible fraction of data lies arbitrarily close to the boundary, breaking standard margin-based arguments. Building on Glasgow's (2024) analysis, we extend the study of training dynamics from discrete to Gaussian inputs and develop a framework for the dynamics of neuron blocks. We show that neurons cluster into four directions and that block-level signals evolve coherently, a phenomenon essential in the Gaussian setting…
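To make the zero-margin setting concrete, the following is a minimal sketch of a Gaussian XOR data generator, not the authors' code. It assumes standard Gaussian inputs in d dimensions and the label rule y = sign(x1 * x2) on the first two coordinates; the abstract only states that the XOR decision boundary determines labels, so the exact rule, the function names, and the parameters here are illustrative assumptions. The second function estimates how much of the data lies within a distance gamma of the boundary.

```python
import numpy as np


def sample_gaussian_xor(n, d=2, seed=0):
    """Draw n standard Gaussian inputs in R^d with XOR-style labels.

    Assumption: the label is the sign of x1 * x2 (first two coordinates).
    The paper's exact convention may differ.
    """
    rng = np.random.default_rng(seed)
    x = rng.standard_normal((n, d))
    # Labels in {-1, +1}; a product of exactly zero has probability zero.
    y = np.where(x[:, 0] * x[:, 1] > 0, 1, -1)
    return x, y


def fraction_within_margin(x, gamma):
    """Fraction of points closer than gamma to the XOR boundary.

    Under the assumed label rule the boundary is {x1 = 0} or {x2 = 0},
    so the distance to it is min(|x1|, |x2|).
    """
    dist = np.minimum(np.abs(x[:, 0]), np.abs(x[:, 1]))
    return float(np.mean(dist < gamma))


if __name__ == "__main__":
    x, y = sample_gaussian_xor(10_000, d=5)
    for gamma in (0.01, 0.1, 0.5):
        print(gamma, fraction_within_margin(x, gamma))
```

Under these assumptions the fraction stays positive for every gamma > 0 and shrinks only in proportion to gamma, which illustrates why no fixed positive margin separates the two classes.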
Taxonomy
Topics: Stochastic Gradient Optimization Techniques · Machine Learning and ELM · Model Reduction and Neural Networks
