# GIER: Gap-Driven Self-Refinement for Large Language Models

**Authors:** Rinku Dewri

arXiv: 2509.00325 · 2025-09-03

## TL;DR

GIER is a framework in which a large language model iteratively critiques and revises its own output against natural-language descriptions of reasoning gaps, improving explanation quality without reducing task accuracy.

## Contribution

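GIER introduces a self-refinement approach that relies on natural-language descriptions of reasoning gaps rather than demonstrations, examples, or chain-of-thought templates. The model is prompted to critique its own output against these descriptions and revise it over multiple iterations.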

## Key findings

- Improves rationale quality, grounding, and reasoning alignment across three tasks (SciFact, PrivacyQA, e-SNLI) and four LLMs (GPT-4.1, GPT-4o Mini, Gemini 1.5 Pro, Llama 3.3 70B).
- Models can interpret abstract, conceptual gap descriptions and translate them into concrete reasoning improvements.
- Enhances explanation quality without reducing task accuracy.

## Abstract

We introduce GIER (Gap-driven Iterative Enhancement of Responses), a general framework for improving large language model (LLM) outputs through self-reflection and revision based on conceptual quality criteria. Unlike prompting strategies that rely on demonstrations, examples, or chain-of-thought templates, GIER utilizes natural language descriptions of reasoning gaps, and prompts a model to iteratively critique and refine its own outputs to better satisfy these criteria. Across three reasoning-intensive tasks (SciFact, PrivacyQA, and e-SNLI) and four LLMs (GPT-4.1, GPT-4o Mini, Gemini 1.5 Pro, and Llama 3.3 70B), GIER improves rationale quality, grounding, and reasoning alignment without degrading task accuracy. Our analysis demonstrates that models can not only interpret abstract conceptual gaps but also translate them into concrete reasoning improvements.
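The critique-and-revise loop described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the paper's implementation: the function name `refine`, the prompt wording, the example entries in `GAPS`, the `NONE` stopping rule, and the `max_rounds` cap are all assumptions made here, since this summary does not include the actual prompts or gap definitions. The `model` argument is any callable that maps a prompt string to a response string.

```python
from typing import Callable, List

# Hypothetical gap descriptions for illustration only; the paper's actual
# criteria are not listed in this summary.
GAPS: List[str] = [
    "The rationale makes a claim that is not supported by the given evidence.",
    "The rationale skips a reasoning step needed to reach the conclusion.",
]


def refine(
    task: str,
    model: Callable[[str], str],
    gaps: List[str],
    max_rounds: int = 3,
) -> str:
    """Draft an answer, then alternate self-critique and revision
    against natural-language gap descriptions."""
    # Initial draft, produced without demonstrations or templates.
    answer = model(f"Task:\n{task}\n\nAnswer with a short rationale.")
    gap_text = "\n".join(f"- {g}" for g in gaps)

    for _ in range(max_rounds):
        # Ask the same model to check its own answer for the described gaps.
        critique = model(
            f"Task:\n{task}\n\nCurrent answer:\n{answer}\n\n"
            f"Reasoning gaps to check:\n{gap_text}\n\n"
            "List the gaps present in the answer, or reply NONE."
        )
        # Assumed stopping rule: stop once the model reports no gaps.
        if critique.strip().upper() == "NONE":
            break
        # Ask the model to revise its answer in light of its own critique.
        answer = model(
            f"Task:\n{task}\n\nCurrent answer:\n{answer}\n\n"
            f"Critique:\n{critique}\n\nRevise the answer to close these gaps."
        )
    return answer
```

In practice `model` would wrap a call to whichever LLM is being evaluated. Keeping it as a plain callable makes the loop easy to exercise with a stub before any real API is wired in.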

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/2509.00325/full.md

## Figures

11 figures with captions in the complete paper: https://tomesphere.com/paper/2509.00325/full.md

## References

37 references — full list in the complete paper: https://tomesphere.com/paper/2509.00325/full.md

---
Source: https://tomesphere.com/paper/2509.00325