Enhancing SLM via ChatGPT and Dataset Augmentation
Tom Pieper, Mohamad Ballout, Ulf Krumnack, Gunther Heidemann, and Kai-Uwe Kühnberger

TL;DR
This paper demonstrates that fine-tuning small language models on synthetic data generated by ChatGPT-3.5-Turbo, combined with knowledge distillation techniques, improves their performance on natural language inference tasks, offering a cost-effective alternative to large models.
Contribution
The paper introduces a novel dataset augmentation method using ChatGPT-3.5-Turbo to enhance small language models' performance on NLI tasks, reducing reliance on human annotation.
Findings
Synthetic rationales raise T5-Small's classification accuracy on ANLI by 1.3% (information extraction rationales) and 2.3% (informed reasoning rationales).
Knowledge distillation with augmented data enhances small model capabilities.
Cost-effective approach for improving NLP model performance.
Abstract
This paper explores the enhancement of small language models through strategic dataset augmentation via ChatGPT-3.5-Turbo, in the domain of Natural Language Inference (NLI). By employing knowledge distillation-based techniques and synthetic dataset augmentation, we aim to bridge the performance gap between large language models (LLMs) and small language models (SLMs) without the immense cost of human annotation. Our methods involve two forms of rationale generation (information extraction and informed reasoning) to enrich the ANLI dataset. We then fine-tune T5-Small on these augmented datasets, evaluating its performance against an established benchmark. Our findings reveal that the incorporation of synthetic rationales significantly improves the model's ability to comprehend natural language, leading to 1.3% and 2.3% higher classification accuracy, respectively, on the ANLI dataset,…
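The augmentation step described in the abstract can be illustrated with a minimal sketch of how an NLI example plus a synthetic rationale might be turned into a text-to-text training pair for a seq2seq model such as T5-Small. The prompt template, the `explanation:` separator, the `rationale` field, and the helper names below are assumptions made for illustration; the abstract does not specify the authors' exact format. The label order (entailment, neutral, contradiction) follows the common ANLI convention and is also assumed here.

```python
from dataclasses import dataclass
from typing import List, Tuple

# Assumed label order (common ANLI convention); not taken from the abstract.
LABELS = ("entailment", "neutral", "contradiction")


@dataclass
class NLIExample:
    premise: str
    hypothesis: str
    label: int           # index into LABELS
    rationale: str = ""  # hypothetical field: rationale written by a teacher model


def to_text_pair(ex: NLIExample, with_rationale: bool = True) -> Tuple[str, str]:
    """Build a (source, target) string pair for seq2seq fine-tuning.

    The template is illustrative only. With a rationale, the target is the
    label followed by the rationale; without one, it is the label alone.
    """
    source = f"nli premise: {ex.premise} hypothesis: {ex.hypothesis}"
    target = LABELS[ex.label]
    if with_rationale and ex.rationale:
        target = f"{target} explanation: {ex.rationale}"
    return source, target


def parse_label(generated: str) -> str:
    """Read the predicted label from generated text (its first word)."""
    first = generated.strip().split(" ", 1)[0]
    return first if first in LABELS else ""


def accuracy(predictions: List[str], gold: List[str]) -> float:
    """Fraction of predictions that exactly match the gold labels."""
    if not gold:
        return 0.0
    correct = sum(p == g for p, g in zip(predictions, gold))
    return correct / len(gold)


example = NLIExample(
    premise="A man is playing a guitar.",
    hypothesis="A person makes music.",
    label=0,
    rationale="Playing a guitar produces music.",
)
source_text, target_text = to_text_pair(example)
```

Placing the label before the rationale keeps evaluation simple: classification accuracy can be computed from the first generated word, regardless of how the rest of the output reads.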
Taxonomy
Topics: Traffic Prediction and Management Techniques
Methods: Knowledge Distillation
