Large Language Model as a Teacher for Zero-shot Tagging at Extreme Scales

Jinbin Zhang; Nasib Ullah; Rohit Babbar

arXiv:2406.09288 · cs.LG · February 25, 2025

TL;DR

This paper presents LMTX, a framework that uses large language models to generate high-quality pseudo labels for zero-shot extreme multi-label classification, combining the accuracy of LLMs with the efficiency of lightweight bi-encoders.

Contribution

LMTX introduces a novel training approach where LLMs serve as teachers to improve pseudo label quality, enabling efficient inference without LLMs at test time.
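
The teacher idea described above can be illustrated with a minimal sketch. It is not the authors' implementation: the summary here does not spell out the training procedure, so the shortlist-then-judge flow is an assumption, `embed` is a toy bag-of-words stand-in for a bi-encoder, and `teacher_is_relevant` is a hypothetical placeholder for an LLM relevance judgment (a simple token-overlap rule, so the example runs without any model).

```python
import math
from collections import Counter


def embed(text):
    """Toy stand-in for a bi-encoder: L2-normalised bag-of-words vector."""
    counts = Counter(text.lower().split())
    norm = math.sqrt(sum(c * c for c in counts.values()))
    if norm == 0:
        return {}
    return {tok: c / norm for tok, c in counts.items()}


def cosine(u, v):
    """Dot product of two sparse, already-normalised vectors."""
    return sum(w * v.get(tok, 0.0) for tok, w in u.items())


def shortlist(doc, labels, k):
    """Cheap retrieval step: keep the k labels closest to the document."""
    doc_vec = embed(doc)
    ranked = sorted(labels, key=lambda lab: cosine(doc_vec, embed(lab)), reverse=True)
    return ranked[:k]


def teacher_is_relevant(doc, label):
    """Hypothetical placeholder for an LLM call (assumption, not a real API).

    Here a label counts as relevant only if all of its tokens occur in the document.
    """
    doc_tokens = set(doc.lower().split())
    return all(tok in doc_tokens for tok in label.lower().split())


def pseudo_labels(doc, labels, k=3):
    """Let the teacher filter the shortlist; survivors become pseudo labels."""
    return [lab for lab in shortlist(doc, labels, k) if teacher_is_relevant(doc, lab)]


doc = "neural machine translation with attention"
labels = ["machine translation", "neural cooking", "image segmentation", "attention"]
result = pseudo_labels(doc, labels, k=3)
print(result)
```

In this toy run the shortlist contains "neural cooking" because of partial word overlap, and the teacher step discards it. The surviving (document, label) pairs are the kind of pseudo-labelled data a lightweight encoder could then be trained on.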

Findings

1. LMTX outperforms existing methods in accuracy and efficiency.

2. LMTX achieves state-of-the-art results on EZ-XMC tasks.

3. LMTX eliminates the need for LLMs during inference, reducing computational cost.
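
To make the inference-cost point concrete, the sketch below shows what LLM-free prediction can look like once an encoder is available: label vectors are computed once, and each query needs only one encoding plus a top-k similarity search. The `encode` function is a toy bag-of-words stand-in (an assumption, not the trained bi-encoder from the paper), and the function names are illustrative.

```python
import heapq
import math
from collections import Counter


def encode(text):
    """Toy stand-in for a trained encoder: L2-normalised bag-of-words vector."""
    counts = Counter(text.lower().split())
    norm = math.sqrt(sum(c * c for c in counts.values()))
    return {tok: c / norm for tok, c in counts.items()} if norm else {}


def build_label_index(labels):
    """Encode every label once, ahead of time."""
    return [(label, encode(label)) for label in labels]


def predict(doc, label_index, k=2):
    """Rank labels by similarity to the document; no LLM is called here."""
    doc_vec = encode(doc)

    def score(entry):
        _, label_vec = entry
        return sum(w * doc_vec.get(tok, 0.0) for tok, w in label_vec.items())

    return [label for label, _ in heapq.nlargest(k, label_index, key=score)]


index = build_label_index(["graph neural networks", "speech recognition", "recipe"])
top1 = predict("a survey of graph neural networks", index, k=1)
print(top1)
```

A real system would replace the linear scan with an approximate nearest-neighbour index when the label set is very large. The scan is kept here only so the example stays self-contained.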

Abstract

Extreme Multi-label Text Classification (XMC) entails selecting the most relevant labels for an instance from a vast label set. Extreme Zero-shot XMC (EZ-XMC) extends this challenge by operating without annotated data, relying only on raw text instances and a predefined label set, making it particularly critical for addressing cold-start problems in large-scale recommendation and categorization systems. State-of-the-art methods, such as MACLR and RTS, leverage lightweight bi-encoders but rely on suboptimal pseudo labels for training, such as document titles (MACLR) or document segments (RTS), which may not align well with the intended tagging or categorization tasks. On the other hand, LLM-based approaches, like ICXML, achieve better label-instance alignment but are computationally expensive and impractical for real-world EZ-XMC applications due to their heavy inference costs. In this…

Taxonomy

Topics: COVID-19 diagnosis using AI · Domain Adaptation and Few-Shot Learning · Speech Recognition and Synthesis

Methods: Sparse Evolutionary Training