Quantifying the effect of representations on task complexity
Julian Zilly, Lorenz Hetzel, Andrea Censi, Emilio Frazzoli

TL;DR
This paper investigates how input data representations influence learning difficulty. Changing the representation changes how well a model's implicit noise distribution aligns with the true data distribution, and better-aligned representations make the input-output relationship easier to learn. The claim is supported by an empirical analysis of neural network performance.
Contribution
It quantifies the effect of data representations on task complexity by building on an existing task complexity score, and shows how the choice of representation affects model alignment and learning efficiency.
Findings
Better representations improve the model's alignment with the true data distribution.
Representation-dependent information coding length predicts learning performance.
Empirical results validate the importance of tailored representations for different models.
Abstract
We examine the influence of input data representations on learning complexity. For learning, we posit that each model implicitly uses a candidate model distribution for unexplained variations in the data: its noise model. If the model distribution is not well aligned with the true distribution, then even relevant variations will be treated as noise. Crucially, however, the alignment of model and true distribution can be changed, albeit implicitly, by changing data representations. "Better" representations can better align the model to the true distribution, making it easier to approximate the input-output relationship in the data without discarding useful data variations. To quantify this alignment effect of data representations on the difficulty of a learning task, we make use of an existing task complexity score and show its connection to the representation-dependent information coding…
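The alignment idea in the abstract can be illustrated with a toy example. The sketch below is not the paper's complexity score or experimental setup. It assumes a synthetic dataset, a linear regression model with a Gaussian noise model, and two hypothetical representations of the same input (raw `x` and `log(x)`). The helper names `fit_linear` and `gaussian_code_length` are made up for illustration. The code length is the negative log-likelihood of the residuals under the fitted Gaussian, used here only as a rough stand-in for an information coding length.

```python
import numpy as np

def fit_linear(features, targets):
    """Ordinary least squares with an intercept; returns the residuals."""
    design = np.column_stack([features, np.ones_like(features)])
    coef, *_ = np.linalg.lstsq(design, targets, rcond=None)
    return targets - design @ coef

def gaussian_code_length(residuals):
    """Negative log-likelihood (in nats) of the residuals under a zero-mean
    Gaussian whose variance is fitted to them. This is a differential
    quantity, so it can be negative; only comparisons are meaningful."""
    n = residuals.size
    var = np.mean(residuals ** 2)
    return 0.5 * n * (np.log(2 * np.pi * var) + 1.0)

# Synthetic data (assumption): the target is linear in log(x), not in x.
rng = np.random.default_rng(0)
x = rng.uniform(1.0, 100.0, size=500)
y = 3.0 * np.log(x) + rng.normal(0.0, 0.1, size=500)

# Same model class, two representations of the same input.
res_raw = fit_linear(x, y)          # representation A: raw x
res_log = fit_linear(np.log(x), y)  # representation B: log(x)

len_raw = gaussian_code_length(res_raw)
len_log = gaussian_code_length(res_log)

print("residual variance (raw, log):", np.var(res_raw), np.var(res_log))
print("code length in nats (raw, log):", len_raw, len_log)
```

Under the raw representation, the linear model cannot express the logarithmic trend, so that structure ends up in the residuals and is treated as noise. Under the log representation the same model class captures it, and the residuals are much cheaper to encode.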
Taxonomy
Topics: Machine Learning and Data Classification · Machine Learning and Algorithms · Domain Adaptation and Few-Shot Learning
Methods: Linear Regression
