Stress and Adaptation: Applying Anna Karenina Principle in Deep Learning for Image Classification

Nesma Mahmoud; Hanna Antson; Jaesik Choi; Osamu Shimmi; Kallol Roy

arXiv:2302.11380 · cs.LG · February 23, 2023

Open Access

TL;DR

This paper introduces an Anna Karenina Principle for deep learning, showing that more generalizable models have similar internal representations, while less generalizable ones vary more, supported by theoretical and experimental evidence.

Contribution

The paper proposes a novel Anna Karenina Principle (AKP) framework for deep neural networks that links model generalizability to the similarity of internal representations, supported by a theoretical proof and experimental validation.

Findings

01

Generalizable models have similar internal representations.

02

Under perturbation during training, less generalizable models vary more in their internal representations.

03

Theoretical proof supports the similarity of happy models.
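
The findings above can be made concrete by comparing representations across a family of models with a similarity index. What follows is a minimal sketch, not the authors' method: it assumes linear CKA (centered kernel alignment) as the similarity measure, which this summary does not confirm the paper uses. The function names and the synthetic "happy" and "unhappy" families are hypothetical stand-ins for real model activations.

```python
import numpy as np


def linear_cka(x, y):
    """Linear CKA between two activation matrices of shape (n_samples, n_features).

    Assumption: linear CKA is used purely for illustration; the paper's own
    similarity measure is not stated in this summary.
    """
    # Center each feature so the index ignores constant offsets.
    x = x - x.mean(axis=0, keepdims=True)
    y = y - y.mean(axis=0, keepdims=True)
    cross = np.linalg.norm(y.T @ x, ord="fro") ** 2
    norm_x = np.linalg.norm(x.T @ x, ord="fro")
    norm_y = np.linalg.norm(y.T @ y, ord="fro")
    return float(cross / (norm_x * norm_y))


def mean_pairwise_similarity(reps):
    """Average linear CKA over all distinct pairs of representations."""
    scores = [
        linear_cka(reps[i], reps[j])
        for i in range(len(reps))
        for j in range(i + 1, len(reps))
    ]
    return float(np.mean(scores))


def random_rotation(rng, dim):
    """Random orthogonal matrix obtained from a QR decomposition."""
    q, _ = np.linalg.qr(rng.normal(size=(dim, dim)))
    return q


rng = np.random.default_rng(0)
n_samples, dim = 200, 16
base = rng.normal(size=(n_samples, dim))

# Hypothetical "happy family": one shared representation, rotated and
# slightly perturbed for each model.
happy = [
    base @ random_rotation(rng, dim) + 0.05 * rng.normal(size=(n_samples, dim))
    for _ in range(4)
]

# Hypothetical "unhappy family": each model has an unrelated representation.
unhappy = [rng.normal(size=(n_samples, dim)) for _ in range(4)]

happy_score = mean_pairwise_similarity(happy)
unhappy_score = mean_pairwise_similarity(unhappy)
print(f"happy family mean CKA:   {happy_score:.3f}")
print(f"unhappy family mean CKA: {unhappy_score:.3f}")
```

Linear CKA is invariant to orthogonal transformations of the feature space, so the rotated copies in the synthetic happy family score as near-identical, while the unrelated representations in the unhappy family score low. With real networks, the rows of each matrix would be activations of a chosen layer on the same batch of images.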

Abstract

Image classification with deep neural networks has reached state-of-the-art accuracy. This success is attributed to good internal representation features that bypass the difficulties of non-convex optimization problems. We have little understanding of these internal representations, let alone a way to quantify them. Recent research efforts have focused on alternative theories and explanations of the generalizability of these deep networks. We propose, as an alternative, that perturbing deep models during training induces changes that lead to transitions to different families. The result is an Anna Karenina Principle (AKP) for deep learning, in which less generalizable models (unhappy families) vary more in their representations than more generalizable models (happy families), paralleling Leo Tolstoy's dictum that all happy families look alike, each unhappy family is unhappy in its own…

Taxonomy

Topics: Stochastic Gradient Optimization Techniques · Explainable Artificial Intelligence (XAI) · Neural Networks and Applications