LaFFi: Leveraging Hybrid Natural Language Feedback for Fine-tuning Language Models
Qianxi Li, Yingyue Cao, Jikun Kang, Tianpei Yang, Xi Chen, Jun Jin, and Matthew E. Taylor

TL;DR
This paper introduces LaFFi, a novel fine-tuning method where LLMs predict the feedback they will receive, leading to improved accuracy in question-answering tasks by leveraging natural language feedback.
Contribution
LaFFi is a new fine-tuning approach enabling LLMs to predict feedback, enhancing performance over traditional supervised fine-tuning, especially with limited annotated data.
Findings
LaFFi significantly improves accuracy on in-domain question-answering tasks.
Predicting feedback enhances LLM reflection and learning.
The amount of human-annotated data impacts fine-tuning performance.
Abstract
Fine-tuning Large Language Models (LLMs) adapts a trained model to specific downstream tasks, significantly improving task-specific performance. Supervised Fine-Tuning (SFT) is a common approach, where an LLM is trained to produce desired answers. However, LLMs trained with SFT sometimes make simple mistakes and result in hallucinations on reasoning tasks such as question-answering. Without external feedback, it is difficult for SFT to learn a good mapping between the question and the desired answer, especially with a small dataset. This paper introduces an alternative to SFT called Natural Language Feedback for Finetuning LLMs (LaFFi). LaFFi has LLMs directly predict the feedback they will receive from an annotator. We find that requiring such reflection can significantly improve the accuracy in in-domain question-answering tasks, providing a promising direction for the application of…
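To make the contrast in the abstract concrete, here is a minimal sketch of how a training example might differ between SFT (target is the desired answer) and feedback prediction (target is the annotator's feedback). The `AnnotatedSample` fields, the prompt templates, and both helper functions are hypothetical illustrations, not the paper's actual data format; in particular, including the model's own answer in the feedback-prediction prompt is an assumption.

```python
from dataclasses import dataclass


@dataclass
class AnnotatedSample:
    """Hypothetical record; field names are assumptions for illustration."""
    question: str
    model_answer: str    # answer produced by the LLM being fine-tuned
    desired_answer: str  # reference answer used by plain SFT
    feedback: str        # natural language feedback from an annotator


def sft_example(s: AnnotatedSample) -> dict:
    # Plain SFT: the model learns to map the question to the desired answer.
    return {
        "prompt": f"Question: {s.question}\nAnswer:",
        "target": " " + s.desired_answer,
    }


def feedback_prediction_example(s: AnnotatedSample) -> dict:
    # Feedback prediction: the model sees the question and an answer,
    # and learns to predict the feedback an annotator would give.
    return {
        "prompt": (
            f"Question: {s.question}\n"
            f"Answer: {s.model_answer}\n"
            "Feedback:"
        ),
        "target": " " + s.feedback,
    }


sample = AnnotatedSample(
    question="What is 2 + 2?",
    model_answer="5",
    desired_answer="4",
    feedback="Incorrect: 2 + 2 equals 4, not 5.",
)
sft = sft_example(sample)
fb = feedback_prediction_example(sample)
```

In a sketch like this, the only change between the two regimes is what the loss is computed on: the desired answer for SFT, the feedback text for feedback prediction.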
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Multimodal Machine Learning Applications
Methods: Shrink and Fine-Tune
