Improving Consistency in Large Language Models through Chain of Guidance
Harsh Raj, Vipul Gupta, Domenic Rosati, Subhabrata Majumdar

TL;DR
This paper introduces Chain of Guidance (CoG), a multistep prompting method that improves the semantic consistency of Large Language Models on question-answering tasks, and with it their trustworthiness.
Contribution
The paper proposes Chain of Guidance, a guided prompting technique, and shows that fine-tuning on synthetic data increases the consistency of LLM outputs.
Findings
Fine-tuned models are over twice as consistent as base models.
CoG improves consistency without sacrificing accuracy.
Models generalize well to unseen datasets.
Abstract
Consistency is a fundamental dimension of trustworthiness in Large Language Models (LLMs). For humans to be able to trust LLM-based applications, their outputs should be consistent when prompted with inputs that carry the same meaning or intent. Despite this need, there is no known mechanism to control and guide LLMs to be more consistent at inference time. In this paper, we introduce a novel alignment strategy to maximize semantic consistency in LLM outputs. Our proposal is based on Chain of Guidance (CoG), a multistep prompting technique that generates highly consistent outputs from LLMs. For closed-book question-answering (Q&A) tasks, when compared to direct prompting, the outputs generated using CoG show improved consistency. While other approaches like template-based responses and majority voting may offer alternative paths to consistency, our work focuses on exploring the…
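To make the notion of consistency concrete, the sketch below scores a set of outputs, one per paraphrase of the same question, by the fraction of output pairs that agree. It is an illustration only, not the metric used in the paper: the function names (`normalize`, `exact_agreement`, `pairwise_consistency`) are hypothetical, and the agreement test is a simple normalized string match standing in for whatever semantic-equivalence measure one would actually use.

```python
from itertools import combinations
from typing import Callable, Sequence


def normalize(text: str) -> str:
    """Lowercase, collapse whitespace, and drop trailing sentence punctuation."""
    return " ".join(text.lower().split()).rstrip(".!?")


def exact_agreement(a: str, b: str) -> bool:
    """Assumed stand-in for a semantic-equivalence check: normalized string match."""
    return normalize(a) == normalize(b)


def pairwise_consistency(
    outputs: Sequence[str],
    agree: Callable[[str, str], bool] = exact_agreement,
) -> float:
    """Fraction of unordered output pairs that agree; 1.0 if fewer than two outputs."""
    pairs = list(combinations(outputs, 2))
    if not pairs:
        return 1.0
    return sum(agree(a, b) for a, b in pairs) / len(pairs)


if __name__ == "__main__":
    # Hypothetical answers to three paraphrases of "What is the capital of France?"
    answers = ["Paris", "paris.", "Lyon"]
    print(pairwise_consistency(answers))
```

Passing a different `agree` callable (for example, one based on an entailment or embedding-similarity model) changes the notion of agreement without touching the scoring loop.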
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques
Methods: Balanced Selection
