Divide-Verify-Refine: Can LLMs Self-Align with Complex Instructions?
Xianren Zhang, Xianfeng Tang, Hui Liu, Zongyu Wu, Qi He, Dongwon Lee, and Suhang Wang

TL;DR
This paper introduces the Divide-Verify-Refine (DVR) framework, which improves LLMs' ability to follow complex instructions by dividing them into single constraints, verifying responses with tools, and refining them with dynamic few-shot prompting.
Contribution
The paper proposes a novel Divide-Verify-Refine approach with dynamic few-shot prompting and creates a new dataset for complex instructions, improving LLMs' constraint adherence.
Findings
DVR doubles Llama3.1-8B's constraint adherence
DVR triples Mistral-7B's performance
Introduces a new dataset of complex instructions
Abstract
Recent studies show LLMs struggle with complex instructions involving multiple constraints (e.g., length, format, sentiment). Existing works address this issue by fine-tuning, which relies heavily on the quality of the fine-tuning data and is computationally expensive. An alternative is leveraging LLMs' self-correction to refine responses for better constraint adherence. However, this is limited by the feedback quality, as LLMs cannot generate reliable feedback or detect errors. Moreover, its effectiveness relies on few-shot examples illustrating response modifications. As constraints in complex instructions are diverse, manually crafting such examples for each constraint type can be labor-intensive and sub-optimal. To address these two challenges, we propose the Divide-Verify-Refine (DVR) framework with three steps: (1) Divide complex instructions into single constraints and prepare appropriate…
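The divide, verify, and refine steps named above can be illustrated with a minimal sketch. It is not the authors' implementation: the names `Constraint`, `max_words`, `must_include`, and `dvr` are hypothetical, the constraints are written by hand rather than extracted by an LLM, and `refine` is a placeholder for whatever model call would rewrite the response from the feedback. The dynamic few-shot part of the framework is left out.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional

# A verifier returns None when the constraint holds, otherwise textual feedback.
Verifier = Callable[[str], Optional[str]]


@dataclass
class Constraint:
    """One single constraint produced by the (here hand-written) divide step."""
    description: str
    verify: Verifier


def max_words(limit: int) -> Constraint:
    """Hypothetical tool: checks a length constraint by counting words."""
    def check(response: str) -> Optional[str]:
        n = len(response.split())
        if n <= limit:
            return None
        return f"Response has {n} words; use at most {limit}."
    return Constraint(f"at most {limit} words", check)


def must_include(keyword: str) -> Constraint:
    """Hypothetical tool: checks that a keyword appears (case-insensitive)."""
    def check(response: str) -> Optional[str]:
        if keyword.lower() in response.lower():
            return None
        return f"Response must mention '{keyword}'."
    return Constraint(f"include '{keyword}'", check)


def dvr(
    response: str,
    constraints: List[Constraint],
    refine: Callable[[str, List[str]], str],
    max_rounds: int = 3,
) -> str:
    """Verify the response against each constraint and refine it until all
    checks pass or max_rounds is reached. `refine` stands in for an LLM call."""
    for _ in range(max_rounds):
        results = [c.verify(response) for c in constraints]
        feedback = [fb for fb in results if fb is not None]
        if not feedback:
            break  # every constraint is satisfied
        response = refine(response, feedback)
    return response
```

Because the verifiers are ordinary functions rather than model judgments, the feedback passed to `refine` is deterministic, which is the motivation the abstract gives for moving verification out of the LLM.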
Taxonomy
Topics: Natural Language Processing Techniques
