An Active Learning Framework for Inclusive Generation by Large Language Models
Sabit Hassan, Anthony Sicilia, Malihe Alikhani

TL;DR
This paper introduces a clustering-based active learning framework with knowledge distillation to improve the inclusivity and diversity of large language models' generated text, especially for under-represented groups.
Contribution
The paper presents a novel active learning method for generative models that combines clustering with knowledge distillation, improving diversity without requiring prior knowledge of the data distribution.
Findings
2%-10% performance improvement over baselines
More consistent results across data subgroups
Increased lexical diversity and model resilience
Abstract
Ensuring that Large Language Models (LLMs) generate text representative of diverse sub-populations is essential, particularly when key concepts related to under-represented groups are scarce in the training data. We address this challenge with a novel clustering-based active learning framework, enhanced with knowledge distillation. The proposed framework transforms the intermediate outputs of the learner model, enabling effective active learning for generative tasks for the first time. Integration of clustering and knowledge distillation yields more representative models without requiring prior knowledge of the underlying data distribution or excessive human effort. We validate our approach in practice through case studies in counter-narration and style transfer. We construct two new datasets in tandem with model training, showing a performance improvement of 2%-10% over baseline models. Our…
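The general idea of clustering-based selection can be illustrated with a minimal sketch. This is not the authors' implementation: the summary above does not give the paper's selection criterion, its representation of the learner's intermediate outputs, or its distillation step. The sketch assumes each unlabeled example has already been mapped to a small embedding vector (here, hypothetical 2-D tuples), groups the pool with plain k-means, and picks the same number of examples from every cluster so that a sparse region of the pool is not crowded out by a dense one. The names `kmeans`, `select_for_annotation`, and `pool` are illustrative.

```python
import math
import random


def kmeans(points, k, iters=20, seed=0):
    """Plain k-means on a list of equal-length float tuples."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]

    def nearest(p):
        # Index of the centroid closest to point p.
        return min(range(k), key=lambda c: math.dist(p, centroids[c]))

    for _ in range(iters):
        assign = [nearest(p) for p in points]
        for c in range(k):
            members = [p for p, a in zip(points, assign) if a == c]
            if members:  # keep the old centroid if a cluster empties
                centroids[c] = [sum(xs) / len(members) for xs in zip(*members)]
    # Final assignment against the final centroids.
    return [nearest(p) for p in points], centroids


def select_for_annotation(points, k, per_cluster=1):
    """Pick the per_cluster points closest to each centroid (an assumed rule)."""
    assign, centroids = kmeans(points, k)
    chosen = []
    for c in range(k):
        members = [i for i, a in enumerate(assign) if a == c]
        members.sort(key=lambda i: math.dist(points[i], centroids[c]))
        chosen.extend(members[:per_cluster])
    return sorted(chosen)


# Hypothetical pool: eight points in a dense region, two in a sparse one.
pool = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1),
        (0.2, 0.0), (0.0, 0.2), (0.2, 0.2), (0.1, 0.2),
        (10.0, 10.0), (10.1, 10.0)]
picked = select_for_annotation(pool, k=2, per_cluster=1)
```

With equal per-cluster quotas, the two-point region contributes as many annotation candidates as the eight-point region, whereas uniform random sampling would usually favor the dense one. A fuller system would replace the nearest-to-centroid rule with whatever informativeness score the learner provides.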
Taxonomy
Topics: Intelligent Tutoring Systems and Adaptive Learning
Methods: Knowledge Distillation
