Large Language Models Meet Extreme Multi-label Classification: Scaling and Multi-modal Framework
Diego Ortego, Marlon Rodríguez, Mario Almagro, Kunal Dahiya, David Jiménez, Juan C. SanMiguel

TL;DR
This paper explores leveraging larger decoder-only models and visual information in extreme multi-label classification, introducing a multi-modal framework that improves performance while maintaining efficiency, and extending datasets for future benchmarking.
Contribution
It demonstrates the effectiveness of large decoder models and visual data integration in XMC, and provides new multi-modal datasets for benchmarking.
Findings
Decoder-only models with billions of parameters improve XMC performance.
Visual information enhances label prediction accuracy.
The proposed ViXML framework outperforms previous state-of-the-art methods.
Abstract
Foundation models have revolutionized artificial intelligence across numerous domains, yet their transformative potential remains largely untapped in Extreme Multi-label Classification (XMC). Queries in XMC are associated with relevant labels from extremely large label spaces, where it is critical to strike a balance between efficiency and performance. Therefore, many recent approaches efficiently pose XMC as a maximum inner product search between embeddings learned from small encoder-only transformer architectures. In this paper, we address two important aspects in XMC: how to effectively harness larger decoder-only models, and how to exploit visual information while maintaining computational efficiency. We demonstrate that both play a critical role in XMC separately and can be combined for improved performance. We show that a few billion-size decoder can deliver substantial…
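To make the "maximum inner product search between embeddings" formulation concrete, here is a minimal sketch, not the paper's implementation. It assumes the query and every label have already been mapped to fixed-size vectors; the random vectors below are stand-ins for learned embeddings, and the names `inner_product` and `top_k_labels` are hypothetical. The search is exhaustive for clarity, whereas systems with very large label spaces typically rely on approximate nearest-neighbor indexes.

```python
import heapq
import random


def inner_product(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def top_k_labels(query_emb, label_embs, k=5):
    """Return the k (label_index, score) pairs with the largest inner product.

    Exhaustive scan over all labels; scores are returned in descending order.
    """
    scored = ((inner_product(query_emb, emb), i) for i, emb in enumerate(label_embs))
    best = heapq.nlargest(k, scored)
    return [(i, s) for s, i in best]


# Toy data: random vectors standing in for learned embeddings (an assumption).
random.seed(0)
dim, num_labels = 16, 200
label_embs = [[random.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_labels)]
query_emb = [random.gauss(0.0, 1.0) for _ in range(dim)]

result = top_k_labels(query_emb, label_embs, k=5)
for label_index, score in result:
    print(label_index, round(score, 3))
```

Because label embeddings can be computed once and indexed ahead of time, only the query has to be encoded at prediction time, which is one reason this formulation is attractive when efficiency matters.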
Taxonomy
Topics: Text and Document Classification Technologies · Domain Adaptation and Few-Shot Learning · Sentiment Analysis and Opinion Mining
