Label2Label: A Language Modeling Framework for Multi-Attribute Learning
Wanhua Li, Zhexuan Cao, Jianjiang Feng, Jie Zhou, Jiwen Lu

TL;DR
Label2Label is a language modeling framework for multi-attribute learning. It exploits correlations between attributes by treating each sample's attribute labels as words in a sentence, an approach inspired by pre-training techniques in NLP.
Contribution
Label2Label is the first framework to apply language modeling to multi-attribute prediction, and it achieves state-of-the-art results without task-specific modifications.
Findings
Achieves state-of-the-art results on three multi-attribute tasks.
Effectively models attribute correlations through masked language modeling.
Does not rely on task-specific priors or complex network designs.
Abstract
Objects are usually associated with multiple attributes, and these attributes often exhibit high correlations. Modeling complex relationships between attributes poses a great challenge for multi-attribute learning. This paper proposes a simple yet generic framework named Label2Label to exploit the complex attribute correlations. Label2Label is the first attempt at multi-attribute prediction from the perspective of language modeling. Specifically, it treats each attribute label as a "word" describing the sample. As each sample is annotated with multiple attribute labels, these "words" naturally form an unordered but meaningful "sentence", which depicts the semantic information of the corresponding sample. Inspired by the remarkable success of pre-training language models in NLP, Label2Label introduces an image-conditioned masked language model, which randomly masks some of the…
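The masking step described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the attribute names, the `[MASK]` token, the `mask_labels` helper, and the masking probability are all hypothetical choices made for illustration. The image-conditioned model that would predict the masked labels is not shown.

```python
import random

MASK = "[MASK]"  # hypothetical placeholder token, not taken from the paper


def mask_labels(labels, mask_prob=0.15, rng=None):
    """Randomly replace some attribute labels ("words") with MASK.

    Returns the masked "sentence" and a dict mapping each masked
    position to its original label (the prediction targets).
    mask_prob is an arbitrary illustrative value.
    """
    rng = rng or random.Random()
    masked, targets = [], {}
    for i, label in enumerate(labels):
        if rng.random() < mask_prob:
            masked.append(MASK)
            targets[i] = label  # a model would be trained to recover this
        else:
            masked.append(label)
    return masked, targets


# Hypothetical attribute labels for one sample; order carries no meaning.
sentence = ["male", "smiling", "eyeglasses", "no_hat"]
masked, targets = mask_labels(sentence, mask_prob=0.5, rng=random.Random(0))
```

In a full pipeline, a model would receive the masked sentence together with image features and be trained to recover the labels stored in `targets`; that part is outside the scope of this sketch.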
Taxonomy
Topics: Text and Document Classification Technologies · Topic Modeling · Multimodal Machine Learning Applications
