ZARA: Improving Few-Shot Self-Rationalization for Small Language Models
Wei-Lin Chen, An-Zi Yen, Cheng-Kuang Wu, Hen-Hsen Huang, Hsin-Hsi Chen

TL;DR
This paper introduces ZARA, a novel method that enhances few-shot self-rationalization in small language models by automatically generating pseudo-parallel data, leading to state-of-the-art results on the FEB benchmark.
Contribution
ZARA is a new approach that leverages natural language inference to automatically create training data, improving self-rationalization in small language models.
Findings
ZARA achieves state-of-the-art performance on the FEB benchmark.
ZARA effectively identifies plausible and accurate rationale-answer pairs.
ZARA improves few-shot self-rationalization for small LMs.
Abstract
Language models (LMs) that jointly generate end-task answers as well as free-text rationales are known as self-rationalization models. Recent works demonstrate great performance gain for self-rationalization by few-shot prompting LMs with rationale-augmented exemplars. However, the ability to benefit from explanations only emerges with large-scale LMs, which have poor accessibility. In this work, we explore the less-studied setting of leveraging explanations for small LMs to improve few-shot self-rationalization. We first revisit the relationship between rationales and answers. Inspired by the implicit mental process of how human beings assess explanations, we present a novel approach, Zero-shot Augmentation of Rationale-Answer pairs (ZARA), to automatically construct pseudo-parallel data for self-training by reducing the problem of plausibility judgement to natural language inference.…
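The abstract's core idea, judging whether a rationale-answer pair is plausible by casting it as natural language inference, can be illustrated with a rough sketch. Everything concrete here is an assumption for illustration, not the authors' implementation: the `Candidate` type, the choice of rationale as premise and question plus answer as hypothesis, the 0.5 threshold, and the `toy_overlap_scorer` stand-in, which is only a token-overlap heuristic and not an NLI model. In practice the scorer would be replaced by the entailment probability from a real zero-shot NLI model.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Candidate:
    """A model-generated (question, answer, rationale) triple. Hypothetical type."""
    question: str
    answer: str
    rationale: str


# Any function mapping (premise, hypothesis) to an entailment score in [0, 1].
EntailmentScorer = Callable[[str, str], float]


def to_nli_pair(c: Candidate) -> Tuple[str, str]:
    # Assumed mapping: the rationale is the premise, and the question
    # together with its answer forms the hypothesis.
    premise = c.rationale
    hypothesis = f"{c.question} {c.answer}"
    return premise, hypothesis


def select_pseudo_parallel(
    candidates: List[Candidate],
    scorer: EntailmentScorer,
    threshold: float = 0.5,  # assumed cutoff, not taken from the paper
) -> List[Candidate]:
    """Keep candidates whose rationale is scored as entailing the answer."""
    kept = []
    for c in candidates:
        premise, hypothesis = to_nli_pair(c)
        if scorer(premise, hypothesis) >= threshold:
            kept.append(c)
    return kept


def toy_overlap_scorer(premise: str, hypothesis: str) -> float:
    # Placeholder only, NOT an NLI model: fraction of hypothesis tokens
    # that also appear in the premise.
    premise_tokens = set(premise.lower().split())
    hypothesis_tokens = hypothesis.lower().split()
    if not hypothesis_tokens:
        return 0.0
    hits = sum(tok in premise_tokens for tok in hypothesis_tokens)
    return hits / len(hypothesis_tokens)


if __name__ == "__main__":
    pool = [
        Candidate("Is the sky blue?", "yes", "the sky is blue on a clear day so yes"),
        Candidate("Is the sky blue?", "no", "bananas are yellow"),
    ]
    for c in select_pseudo_parallel(pool, toy_overlap_scorer):
        print(c.answer, "|", c.rationale)
```

The pairs that survive the filter would then serve as pseudo-parallel data for self-training the small LM; keeping the scorer as a plain callable makes it easy to swap the toy heuristic for an actual NLI model.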
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Multimodal Machine Learning Applications
