PEP: Parameter Ensembling by Perturbation
Alireza Mehrtash, Purang Abolmaesumi, Polina Golland, Tina Kapur, Demian Wassermann, William M. Wells III

TL;DR
PEP introduces a simple, effective method for ensembling deep network parameters through Gaussian perturbations, improving calibration and likelihood without extra training.
Contribution
The paper proposes Parameter Ensembling by Perturbation (PEP), a novel approach that enhances model calibration and likelihood by perturbing trained parameters, applicable to pre-trained networks without additional training.
Findings
PEP improves calibration and likelihood on ImageNet networks.
PEP yields mild accuracy improvements on pre-trained models.
PEP helps probe overfitting levels during training.
Abstract
Ensembling is now recognized as an effective approach for increasing the predictive performance and calibration of deep networks. We introduce a new approach, Parameter Ensembling by Perturbation (PEP), that constructs an ensemble of parameter values as random perturbations of the optimal parameter set from training by a Gaussian with a single variance parameter. The variance is chosen to maximize the log-likelihood of the ensemble average on the validation data set. Empirically, and perhaps surprisingly, this log-likelihood has a well-defined maximum as the variance grows from zero (which corresponds to the baseline model). Conveniently, calibration level of predictions also tends to grow favorably until the peak of the log-likelihood is reached. In most experiments, PEP provides a small improvement in performance, and, in some cases, a substantial improvement in empirical…
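The procedure described in the abstract can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the linear softmax classifier in `predict_proba`, the function names, the ensemble size, and the plain grid search over the standard deviation `sigma` are all hypothetical choices made for the example. The only parts taken from the text are the isotropic Gaussian perturbation of the trained parameters, the averaging of the ensemble's predictions, and the selection of the variance by validation log-likelihood.

```python
import numpy as np


def softmax(logits):
    # Subtract the row-wise max for numerical stability.
    z = logits - logits.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)


def predict_proba(params, x):
    # Hypothetical stand-in for a trained network: linear softmax classifier.
    w, b = params
    return softmax(x @ w + b)


def pep_predict(params, x, sigma, n_members=10, rng=None):
    # Average the predicted probabilities of n_members copies of the model,
    # each with parameters perturbed by zero-mean Gaussian noise of std sigma.
    rng = np.random.default_rng(0) if rng is None else rng
    if sigma == 0.0:
        # Zero variance corresponds to the unperturbed baseline model.
        return predict_proba(params, x)
    total = None
    for _ in range(n_members):
        perturbed = [p + sigma * rng.standard_normal(p.shape) for p in params]
        probs = predict_proba(perturbed, x)
        total = probs if total is None else total + probs
    return total / n_members


def mean_log_likelihood(probs, y):
    # Mean log-probability assigned to the true labels (epsilon avoids log(0)).
    return float(np.mean(np.log(probs[np.arange(len(y)), y] + 1e-12)))


def select_sigma(params, x_val, y_val, sigmas, n_members=10, seed=0):
    # Assumed search strategy: evaluate each candidate sigma on validation
    # data and keep the one with the highest ensemble log-likelihood.
    scores = []
    for sigma in sigmas:
        rng = np.random.default_rng(seed)  # same noise draws for every sigma
        probs = pep_predict(params, x_val, sigma, n_members, rng)
        scores.append(mean_log_likelihood(probs, y_val))
    best = int(np.argmax(scores))
    return sigmas[best], scores
```

Including `0.0` in the candidate list keeps the baseline model in the comparison, so the selected value can never score worse on the validation set than the unperturbed parameters. No retraining is involved: only forward passes with perturbed copies of the already-trained parameters.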
Taxonomy
Topics: Machine Learning and Data Classification · AI in cancer detection · Cell Image Analysis Techniques
Methods: Average Pooling · Residual Connection · Concatenated Skip Connection · Softmax · Dense Block · Convolution · Dense Connections · Kaiming Initialization · Global Average Pooling · 1x1 Convolution
