A Concept-Centric Approach to Multi-Modality Learning

Yuchong Geng; Ao Tang

arXiv:2412.13847 · cs.AI · January 26, 2026

Open Access

TL;DR

This paper proposes a concept-centric multi-modality learning framework that uses a shared, modality-agnostic concept space to improve efficiency, adaptability, and interpretability in multi-modal learning tasks.

Contribution

It introduces a novel shared concept space and modality-specific projection models, enabling more efficient, modular, and interpretable multi-modality learning inspired by human cognition.

Findings

01 Faster convergence compared to baseline models

02 Supports seamless integration of new modalities

03 Achieves competitive results with less training and no task-specific fine-tuning

Abstract

Humans possess a remarkable ability to acquire knowledge efficiently and apply it across diverse modalities through a coherent and shared understanding of the world. Inspired by this cognitive capability, we introduce a concept-centric multi-modality learning framework built around a modality-agnostic concept space that captures structured, abstract knowledge, alongside a set of modality-specific projection models that map raw inputs onto this shared space. The concept space is decoupled from any specific modality and serves as a repository of universally applicable knowledge. Once learned, the knowledge embedded in the concept space enables more efficient adaptation to new modalities, as projection models can align with existing conceptual representations rather than learning from scratch. This efficiency is empirically validated in our experiments, where the proposed framework…
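
To make the architecture described in the abstract more concrete, here is a minimal sketch of a shared concept space with two modality-specific projections. Everything in it is an illustrative assumption rather than the authors' implementation: the class names `ConceptSpace` and `LinearProjection`, the use of plain vectors and cosine similarity, the linear projections, and the toy weights are all hypothetical. The excerpt above does not specify how the paper represents concepts or trains its projection models.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


class ConceptSpace:
    """Hypothetical modality-agnostic store: concept name -> vector."""

    def __init__(self, concepts):
        self.concepts = dict(concepts)

    def nearest(self, point):
        # Return the name of the concept most similar to the projected point.
        return max(self.concepts, key=lambda name: cosine(self.concepts[name], point))


class LinearProjection:
    """Hypothetical modality-specific map from raw features into the concept space."""

    def __init__(self, weights):
        # One row per concept-space dimension, one column per input feature.
        self.weights = weights

    def __call__(self, x):
        return [sum(w * xi for w, xi in zip(row, x)) for row in self.weights]


# A toy 2-d concept space shared by every modality (assumed values).
space = ConceptSpace({"red": [1.0, 0.0], "blue": [0.0, 1.0]})

# Each modality has its own projection; only these would need to be
# learned when a new modality is added, since the concept space is reused.
image_proj = LinearProjection([[1.0, 0.0, 0.0], [0.0, 0.0, 1.0]])  # 3-d input
text_proj = LinearProjection([[0.0, 1.0], [1.0, 0.0]])             # 2-d input

image_concept = space.nearest(image_proj([0.9, 0.1, 0.2]))
text_concept = space.nearest(text_proj([0.1, 0.8]))
print(image_concept, text_concept)
```

In this sketch, inputs from different modalities land on the same concept because each projection targets the same space. Adding a third modality would mean writing one more projection while leaving `space` untouched, which is the modularity the abstract describes.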

Taxonomy

Topics: Educational Technology and Assessment · Natural Language Processing Techniques

Methods: Sparse Evolutionary Training