Towards Exact Computation of Inductive Bias

Akhilan Boopathy; William Yue; Jaedong Hwang; Abhiram Iyer; and Ila Fiete

arXiv:2406.15941 · cs.LG · June 25, 2024

TL;DR

This paper introduces a novel, efficient method to quantify the inductive bias of machine learning models, providing insights into how model complexity and architecture influence generalization on specific tasks.

Contribution

The authors propose a direct, bounds-free approach to measure inductive bias across diverse hypothesis spaces, with theoretical error bounds and empirical validation.

Findings

01. Higher-dimensional tasks require more inductive bias.

02. Neural networks encode significant inductive bias compared to other models.

03. The measure quantifies differences in inductive bias between neural architectures.

Abstract

Much research in machine learning involves finding appropriate inductive biases (e.g. convolutional neural networks, momentum-based optimizers, transformers) to promote generalization on tasks. However, quantification of the amount of inductive bias associated with these architectures and hyperparameters has been limited. We propose a novel method for efficiently computing the inductive bias required for generalization on a task with a fixed training data budget; formally, this corresponds to the amount of information required to specify well-generalizing models within a specific hypothesis space of models. Our approach involves modeling the loss distribution of random hypotheses drawn from a hypothesis space to estimate the required inductive bias for a task relative to these hypotheses. Unlike prior work, our method provides a direct estimate of inductive bias without using bounds and…
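To make the abstract's central quantity concrete, here is a minimal sketch that reads "information required to specify well-generalizing models" as the negative log of the fraction of randomly drawn hypotheses whose test loss falls at or below a threshold. Everything specific in it is an assumption for illustration: the hypothesis space (linear regressors with Gaussian weights), the synthetic task, the threshold, and the function names `sample_losses` and `estimate_inductive_bias_bits`. It uses a naive Monte Carlo count in place of the paper's approach of modeling the loss distribution, so it is not the authors' estimator.

```python
import math
import random


def estimate_inductive_bias_bits(losses, threshold):
    """Return -log2 of the fraction of sampled hypotheses with loss <= threshold."""
    hits = sum(1 for loss in losses if loss <= threshold)
    if hits == 0:
        # No sampled hypothesis met the threshold, so the counting
        # estimate is unbounded at this sample size.
        return math.inf
    return -math.log2(hits / len(losses))


def sample_losses(num_hypotheses, dim, num_test, seed=0):
    """Test MSE of random linear hypotheses on a synthetic linear task (illustrative)."""
    rng = random.Random(seed)

    def dot(u, v):
        return sum(a * b for a, b in zip(u, v))

    # Hypothetical ground-truth task: y = <target, x> with Gaussian inputs.
    target = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    inputs = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_test)]
    labels = [dot(target, x) for x in inputs]

    losses = []
    for _ in range(num_hypotheses):
        # One random hypothesis: weights drawn from a standard Gaussian.
        weights = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        mse = sum((dot(weights, x) - y) ** 2 for x, y in zip(inputs, labels)) / num_test
        losses.append(mse)
    return losses


if __name__ == "__main__":
    for dim in (1, 2, 4):
        losses = sample_losses(num_hypotheses=5000, dim=dim, num_test=50, seed=dim)
        bits = estimate_inductive_bias_bits(losses, threshold=0.5)
        print(f"dim={dim}: estimated bits = {bits:.2f}")
```

The counting estimate breaks down once well-generalizing hypotheses become too rare to appear in the sample (it returns infinity), which is one reason to fit a model of the loss distribution rather than count hits directly.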

Peer Reviews

Decision · Submitted to ICLR 2024

Reviewer 01 · Rating 5 (marginally below the acceptance threshold) · Confidence 3

Strengths

- The problem addressed by the paper is of central importance in the community and I believe that the work could be of some interest because of its novelty
- The paper is mostly well written and easy enough to follow, except for some passages related to the estimation of the test loss distribution

Weaknesses

- From an objective standpoint, I find the "experimental" claims about 1) high-dimensional tasks -> more inductive bias and especially 2) "neural networks encode massive amount of inductive bias" rather weak. These are based on an extremely sparse set of experiments and are not backed by theoretical justifications. Especially for 2) the observation even seems to me quite indirect: I do not think that the method proposed can directly quantify "the inductive bias encoded in (some subclass

Reviewer 02 · Rating 3 (reject, not good enough) · Confidence 2

Strengths

- defining and quantifying inductive bias are both important problems to tackle
- Figure 1 nicely illustrates the idea of the definition
- Definition of inductive bias clear (Definition 1) but novelty unclear

Weaknesses

Overall, I feel unable to properly review the paper because it seems to be in an early draft stage where the experiments are not entirely finished, the method only partially developed, and the related work not yet completely clear. I can see that there might be interesting ideas in the paper, but in its current form it does not seem ready for publication. More detailed comments:
- introduction is long and imprecise and it is unclear what the work builds on, all contributions are just men

Reviewer 03 · Rating 3 (reject, not good enough) · Confidence 3

Strengths

1. The paper is clear and well-written (**Clarity**)
2. The considered problem of estimating inductive bias is relevant and worth studying (**Significance**)

Weaknesses

1. Several theoretical results are overstated and it is not clear what is their novelty compared to existing ones. For instance, the result about the test error distribution (Section 3.3) follows directly from a known one, i.e. it is well known that the sum of squared errors for a linear regressor follows a chi-squared distribution. Additionally, the statistical result about the finite sample approximation (Section 3.4) is already known for a chi-squared distribution. Why not simply casting the

Code & Models

Repositories

FieteLab/Exact-Inductive-Bias
PyTorch · Official

Taxonomy

Topics: Rough Sets and Fuzzy Logic