Understanding the Distributions of Aggregation Layers in Deep Neural Networks
Eng-Jon Ong, Sameed Husain, Miroslaw Bober

TL;DR
This paper introduces a novel mathematical approach to model the probability distributions of aggregation layer outputs in deep neural networks, enabling analytical predictions of KL-divergence and enhancing understanding of feature aggregation effects.
Contribution
It proposes a new analytical model for the distributions of aggregation layer outputs in DNNs, validated through experiments across various datasets and tasks.
Findings
Successfully predicts KL-divergence of output nodes.
Validates theoretical models with empirical data.
Provides insights into the role of aggregation layers in DNN performance.
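To make the KL-divergence finding concrete, the following is a minimal illustrative sketch and not the paper's analytical model. It assumes hypothetical pre-aggregation activations (ReLU of Gaussian noise for two classes, `acts_a` and `acts_b`), aggregates them by global average pooling, and compares a histogram-based KL estimate with the closed-form KL between two Gaussians fitted to the aggregated values. The Gaussian fit is an assumption justified only loosely by the central limit theorem; all names, sizes, and parameters are invented for illustration.

```python
import numpy as np


def gaussian_kl(mu_p, var_p, mu_q, var_q):
    """Closed-form KL(P || Q) for two univariate Gaussians."""
    return 0.5 * (np.log(var_q / var_p) + (var_p + (mu_p - mu_q) ** 2) / var_q - 1.0)


def histogram_kl(p_samples, q_samples, bins=60, pseudo_count=0.5):
    """Histogram estimate of KL(P || Q) on a shared grid.

    A small pseudo-count per bin (an assumption of this sketch) avoids
    division by zero where one histogram is empty.
    """
    lo = min(p_samples.min(), q_samples.min())
    hi = max(p_samples.max(), q_samples.max())
    p_counts, _ = np.histogram(p_samples, bins=bins, range=(lo, hi))
    q_counts, _ = np.histogram(q_samples, bins=bins, range=(lo, hi))
    p = (p_counts + pseudo_count) / (p_counts + pseudo_count).sum()
    q = (q_counts + pseudo_count) / (q_counts + pseudo_count).sum()
    return float(np.sum(p * np.log(p / q)))


rng = np.random.default_rng(0)
n_samples, n_locations = 20000, 64  # hypothetical sizes

# Hypothetical pre-aggregation activations: ReLU of Gaussian noise,
# with the two classes differing only in the pre-activation mean.
acts_a = np.maximum(rng.normal(0.0, 1.0, (n_samples, n_locations)), 0.0)
acts_b = np.maximum(rng.normal(0.3, 1.0, (n_samples, n_locations)), 0.0)

# Global average pooling over the spatial locations.
agg_a = acts_a.mean(axis=1)
agg_b = acts_b.mean(axis=1)

kl_empirical = histogram_kl(agg_a, agg_b)
kl_gaussian = gaussian_kl(agg_a.mean(), agg_a.var(), agg_b.mean(), agg_b.var())
print(f"histogram KL: {kl_empirical:.3f}, Gaussian-fit KL: {kl_gaussian:.3f}")
```

If the two estimates roughly agree, the Gaussian approximation is a reasonable description of the pooled values in this toy setting; a large gap would suggest the aggregated distribution has a shape the Gaussian fit does not capture.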
Abstract
The process of aggregation is ubiquitous in almost all deep net models. It functions as an important mechanism for consolidating deep features into a more compact representation, whilst increasing robustness to overfitting and providing spatial invariance in deep nets. In particular, the proximity of global aggregation layers to the output layers of DNNs means that aggregated features have a direct influence on the performance of a deep net. A better understanding of this relationship can be obtained using information theoretic methods. However, this requires knowledge of the distributions of the activations of aggregation layers. To achieve this, we propose a novel mathematical formulation for analytically modelling the probability distributions of output values of layers involved in deep feature aggregation. An important outcome is our ability to analytically predict the…
Taxonomy
Topics: Advanced Neural Network Applications · Anomaly Detection Techniques and Applications · Machine Learning and Data Classification
