LaFFi: Leveraging Hybrid Natural Language Feedback for Fine-tuning Language Models

Qianxi Li; Yingyue Cao; Jikun Kang; Tianpei Yang; Xi Chen; Jun Jin; Matthew E. Taylor

arXiv:2401.00907 · cs.LG · January 3, 2024 · 1 citation

Open Access

TL;DR

This paper introduces LaFFi, a novel fine-tuning method where LLMs predict the feedback they will receive, leading to improved accuracy in question-answering tasks by leveraging natural language feedback.

Contribution

LaFFi is a new fine-tuning approach enabling LLMs to predict feedback, enhancing performance over traditional supervised fine-tuning, especially with limited annotated data.

Findings

01 LaFFi improves question-answering accuracy significantly.

02 Predicting feedback enhances LLM reflection and learning.

03 The amount of human-annotated data impacts fine-tuning performance.

Abstract

Fine-tuning Large Language Models (LLMs) adapts a trained model to specific downstream tasks, significantly improving task-specific performance. Supervised Fine-Tuning (SFT) is a common approach, where an LLM is trained to produce desired answers. However, LLMs trained with SFT sometimes make simple mistakes and hallucinate on reasoning tasks such as question-answering. Without external feedback, it is difficult for SFT to learn a good mapping between the question and the desired answer, especially with a small dataset. This paper introduces an alternative to SFT called Natural Language Feedback for Finetuning LLMs (LaFFi). LaFFi has LLMs directly predict the feedback they will receive from an annotator. We find that requiring such reflection can significantly improve accuracy on in-domain question-answering tasks, providing a promising direction for the application of…
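The contrast the abstract draws between SFT and feedback prediction can be illustrated with a minimal sketch of how the training pairs might differ. This is not the authors' implementation: the `Example` fields, the prompt template, and both helper functions are hypothetical, and including the model's own answer in the prompt is an assumption about how feedback prediction could be set up.

```python
from dataclasses import dataclass


@dataclass
class Example:
    """One annotated item (hypothetical schema, not from the paper)."""
    question: str
    model_answer: str      # answer the LLM produced (assumed to be available)
    reference_answer: str  # desired answer used by plain SFT
    feedback: str          # natural language feedback from an annotator


def sft_pair(ex: Example) -> tuple[str, str]:
    """Standard SFT: the target is the desired answer itself."""
    return f"Question: {ex.question}\nAnswer:", ex.reference_answer


def feedback_prediction_pair(ex: Example) -> tuple[str, str]:
    """Feedback prediction: the target is the feedback the answer would receive.

    The prompt layout here is an illustrative assumption.
    """
    prompt = (
        f"Question: {ex.question}\n"
        f"Answer: {ex.model_answer}\n"
        "Feedback:"
    )
    return prompt, ex.feedback


example = Example(
    question="What is the capital of Australia?",
    model_answer="Sydney",
    reference_answer="Canberra",
    feedback="Incorrect. Sydney is the largest city, but the capital is Canberra.",
)

sft_prompt, sft_target = sft_pair(example)
fb_prompt, fb_target = feedback_prediction_pair(example)
```

In this sketch the only change between the two setups is the supervision target: the same question yields either the reference answer or the annotator's feedback as the text the model is trained to produce.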

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Multimodal Machine Learning Applications

Methods: Shrink and Fine-Tune