Learning More by Seeing Less: Structure First Learning for Efficient, Transferable, and Human-Aligned Vision

Tianqin Li; George Liu; Tai Sing Lee

arXiv:2508.06696 · cs.CV · November 13, 2025

Open Access

TL;DR

This paper introduces a structure-first learning paradigm using line drawings to develop more efficient, generalizable, and human-aligned visual models that require less data and are more robust.

Contribution

The paper proposes a training approach that starts from line drawings to emphasize structural representations, yielding models with stronger shape bias, lower intrinsic dimensionality, and better transferability than conventionally trained models.

Findings

01. Models trained with line drawings develop a stronger shape bias.

02. These models exhibit lower intrinsic dimensionality, requiring fewer principal components to capture representational variance.

03. Student models distilled from line-drawing-trained teachers outperform those distilled from color-supervised teachers.
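
The intrinsic-dimensionality finding is stated in terms of how many principal components are needed to capture representational variance. As a rough illustration of that kind of measurement, and not the authors' evaluation code, the sketch below counts the components needed to reach a cumulative explained-variance threshold. The function name `components_for_variance`, the 0.95 threshold, and the synthetic feature matrices are all assumptions made for this example.

```python
import numpy as np


def components_for_variance(features: np.ndarray, threshold: float = 0.95) -> int:
    """Smallest number of principal components whose cumulative explained
    variance reaches `threshold` (0.95 is an assumed value, not from the paper).

    `features` has shape (n_samples, n_dims), e.g. one embedding per image.
    """
    # Center each dimension so the SVD corresponds to PCA.
    centered = features - features.mean(axis=0, keepdims=True)
    # Squared singular values are proportional to per-component variance.
    singular_values = np.linalg.svd(centered, compute_uv=False)
    variances = singular_values ** 2
    cumulative = np.cumsum(variances) / variances.sum()
    # First index where the cumulative ratio reaches the threshold, as a count.
    count = int(np.searchsorted(cumulative, threshold)) + 1
    # Guard against floating-point round-off when threshold is close to 1.
    return min(count, len(cumulative))


# Hypothetical stand-ins for model embeddings: 500 samples, 64 dimensions.
rng = np.random.default_rng(0)
compact = rng.normal(size=(500, 5)) @ rng.normal(size=(5, 64))  # rank 5
diffuse = rng.normal(size=(500, 64))                            # full rank

k_compact = components_for_variance(compact)
k_diffuse = components_for_variance(diffuse)
print(k_compact, k_diffuse)
```

Under this measure, a more compact representation is one where the count is smaller for the same threshold. Applying it to a real model would mean replacing the synthetic matrices with embeddings extracted from a chosen layer, which is left out here.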

Abstract

Despite remarkable progress in computer vision, modern recognition systems remain fundamentally limited by their dependence on rich, redundant visual inputs. In contrast, humans can effortlessly understand sparse, minimal representations like line drawings, suggesting that structure, rather than appearance, underlies efficient visual understanding. In this work, we propose a novel structure-first learning paradigm that uses line drawings as an initial training modality to induce more compact and generalizable visual representations. We demonstrate that models trained with this approach develop a stronger shape bias, more focused attention, and greater data efficiency across classification, detection, and segmentation tasks. Notably, these models also exhibit lower intrinsic dimensionality, requiring significantly fewer principal components to capture representational variance, which…

Taxonomy

Topics: Advanced Neural Network Applications · Face Recognition and Perception · Domain Adaptation and Few-Shot Learning