Mitigating Legibility Tax with Decoupled Prover-Verifier Games

Yegon Kim; Juho Lee

arXiv:2602.23248·cs.AI·February 27, 2026

Open Access

TL;DR

This paper introduces a decoupled prover-verifier game framework that improves the checkability of large language model outputs by training a translator to convert solutions into a checkable form, reducing the legibility tax.

Contribution

It proposes a novel decoupled training approach with a translator model to enhance checkability without sacrificing correctness in prover-verifier systems.

Findings

01

Decoupled training reduces legibility tax.

02

Translator maintains solver's answer fidelity.

03

Framework achieves faithful and checkable outputs.

Abstract

As large language models become increasingly capable, it is critical that their outputs can be easily checked by less capable systems. Prover-verifier games can be used to improve the checkability of model outputs, but display a degradation in accuracy compared to a baseline trained only to maximize correctness, a phenomenon named legibility tax. We propose a solution by decoupling the correctness condition from the checkability condition and instead training a "translator" model that turns a fixed solver model's solution into a checkable form. This allows us to first train the solver to maximize correctness, and then train the translator to translate the solver's solution into a checkable form while retaining the solver's answer. To accommodate this new objective of translation, we formulate a decoupled prover-verifier game where the equilibria correspond to faithful and checkable translators.
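The two-stage idea in the abstract can be illustrated with a toy reward for the translator. This is only a sketch under assumptions: the abstract does not give the paper's actual objective, so the names `Translation`, `translator_reward`, and `verifier_score`, the exact-match faithfulness test, and the hard zero reward for unfaithful translations are all hypothetical. The one point it is meant to show is that the translator's reward depends on agreement with the fixed solver's answer and on the verifier's score, not on ground-truth correctness.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Translation:
    """Hypothetical container for a translator output."""
    text: str    # the rewritten solution, intended to be easy to check
    answer: str  # the final answer stated in the rewritten solution


def translator_reward(
    solver_answer: str,
    translation: Translation,
    verifier_score: Callable[[str], float],
) -> float:
    """Toy reward (an assumption, not the paper's formulation).

    Faithfulness: the translation must keep the fixed solver's answer.
    Checkability: a faithful translation is rewarded by the verifier's score.
    Ground-truth correctness is deliberately not an input here.
    """
    faithful = translation.answer.strip() == solver_answer.strip()
    if not faithful:
        return 0.0
    return verifier_score(translation.text)


def toy_verifier(text: str) -> float:
    # Stand-in for a small verifier model; returns a fixed score.
    return 0.8


faithful_reward = translator_reward("42", Translation("step 1 ... so 42", " 42 "), toy_verifier)
unfaithful_reward = translator_reward("42", Translation("step 1 ... so 41", "41"), toy_verifier)
```

In this sketch a translation that changes the solver's answer earns nothing regardless of how convincing it looks to the verifier, which is one simple way to encode "retaining the solver's answer"; a real system might use a softer penalty or a learned faithfulness check instead.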

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling · Explainable Artificial Intelligence (XAI)