Label-semantics Aware Generative Approach for Domain-Agnostic Multilabel Classification
Subhendu Khatuya, Shashwat Naidu, Saptarshi Ghosh, Pawan Goyal, Niloy Ganguly

TL;DR
This paper presents LAGAMC, a domain-agnostic generative model for multi-label text classification that leverages label descriptions and semantic matching, achieving state-of-the-art results efficiently across diverse datasets.
Contribution
Introduces a novel generative framework using label descriptions and semantic matching for multi-label classification, improving accuracy and efficiency.
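The matching step, in which generated descriptions are mapped back to predefined labels, can be illustrated with a minimal sketch. Everything here is an assumption for illustration: the paper uses a finetuned sentence transformer, whereas this example swaps in a toy bag-of-words embedding so it runs with the standard library alone, and the label set, `threshold`, and function names are hypothetical.

```python
import math
from collections import Counter


def embed(text):
    """Toy bag-of-words embedding; a stand-in (assumption) for a sentence transformer."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def match_labels(generated, label_descriptions, threshold=0.5):
    """Map each generated description to its most similar predefined label.

    `threshold` is a hypothetical cutoff that discards weak matches.
    """
    predicted = []
    for sentence in generated:
        g = embed(sentence)
        best_label, best_score = None, 0.0
        for label, desc in label_descriptions.items():
            score = cosine(g, embed(desc))
            if score > best_score:
                best_label, best_score = label, score
        if best_label is not None and best_score >= threshold and best_label not in predicted:
            predicted.append(best_label)
    return predicted


# Hypothetical label descriptions and model outputs.
LABELS = {
    "sports": "news about sports teams and matches",
    "finance": "news about stock markets and banking",
    "health": "news about medicine and public health",
}
generated = ["news about stock markets and banks", "stories about sports matches"]
predicted = match_labels(generated, LABELS)
print(predicted)
```

Because the generated text only needs to land near a label description in embedding space, small wording differences in the output (for example "banks" versus "banking") can still resolve to the intended label.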
Findings
Achieves new state-of-the-art performance on multiple datasets.
Improves Micro-F1 by 13.94% over strong baselines.
Improves Macro-F1 by 24.85% over previous methods.
Abstract
The explosion of textual data has made manual document classification increasingly challenging. To address this, we introduce a robust, efficient, domain-agnostic generative framework for multi-label text classification. Instead of treating labels as mere atomic symbols, our approach uses predefined label descriptions and is trained to generate these descriptions from the input text. During inference, the generated descriptions are matched to the predefined labels using a finetuned sentence transformer. We integrate this with a dual-objective loss function that combines cross-entropy loss with the cosine similarity between the generated sentences and the predefined target descriptions, ensuring both semantic alignment and accuracy. Our proposed model LAGAMC stands out for its parameter efficiency and versatility across diverse datasets, making it well-suited for practical…
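As a rough illustration of the dual-objective loss described in the abstract, the sketch below adds a cosine-based semantic term to a token-level cross-entropy. The additive form, the `weight` parameter, and all function names are assumptions; the abstract states only that cross-entropy and cosine similarity are combined, not how.

```python
import math


def cross_entropy(token_probs):
    """Mean negative log-likelihood of the target tokens."""
    return -sum(math.log(p) for p in token_probs) / len(token_probs)


def cosine_similarity(u, v):
    """Cosine similarity between two dense vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def dual_objective_loss(token_probs, generated_emb, target_emb, weight=0.5):
    """Cross-entropy plus a weighted (1 - cosine similarity) term.

    The additive form and `weight` are assumptions, not taken from the paper.
    """
    ce = cross_entropy(token_probs)
    semantic = 1.0 - cosine_similarity(generated_emb, target_emb)
    return ce + weight * semantic


# Same token probabilities, but embeddings that agree vs. are orthogonal.
aligned = dual_objective_loss([0.9, 0.8, 0.95], [1.0, 0.0], [1.0, 0.0])
misaligned = dual_objective_loss([0.9, 0.8, 0.95], [1.0, 0.0], [0.0, 1.0])
print(aligned, misaligned)
```

With identical token probabilities, the loss is lower when the generated embedding aligns with the target description, which is the behavior a semantic-alignment term is meant to encourage.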
Taxonomy
Topics: Text and Document Classification Technologies · Topic Modeling · Sentiment Analysis and Opinion Mining
