Suffix-Constrained Greedy Search Algorithms for Causal Language Models

Ayoub Hammal; Pierre Zweigenbaum; Caio Corro

arXiv:2603.01243·cs.CL·March 3, 2026

Open Access

TL;DR

This paper introduces suffix-constrained greedy search algorithms that make large language models end their responses with a final answer in a strict, trivially parseable template, without hurting task performance.

Contribution

The paper proposes several greedy search algorithms for suffix-constrained generation. They guarantee that the final answer in an LLM response follows a strict template and can be extracted deterministically, while maintaining or improving accuracy.

Findings

01 Guarantee of trivial answer extraction from LLM outputs

02 No negative impact on model performance

03 Improved results on multiple datasets

Abstract

Large language models (LLMs) are powerful tools that have found applications beyond human-machine interfaces and chatbots. In particular, their ability to generate reasoning traces has motivated their use in many prediction tasks, such as math question answering. Unfortunately, extracting the final answer from an LLM's free-form output is difficult, as it is an information extraction problem in its own right. In this work, we introduce suffix-constrained generation, which aims to produce well-formed LLM responses in which final answers follow strict templates and are guaranteed to be trivially parseable. To this end, we introduce several algorithms based on greedy search procedures. We experiment on several datasets and show that our approach makes it possible to guarantee trivial, deterministic extraction of the final answer from an LLM output without having a negative impact on results, and even…
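To make the idea of a suffix constraint concrete, here is a minimal Python sketch of a naive baseline. It is not one of the paper's algorithms, which the abstract does not describe. It decodes greedily without constraints until the model would stop, then forces a template prefix and restricts the next token to a digit. The template `Final answer : <digit>`, the function `suffix_constrained_greedy`, and the scripted `toy_score` model are all hypothetical names introduced for illustration.

```python
from typing import Callable, Dict, List

EOS = "<eos>"
TEMPLATE_PREFIX = ["Final", "answer", ":"]  # hypothetical answer template
DIGITS = set("0123456789")                  # tokens allowed in the answer slot

# A scorer maps the tokens generated so far to a score for each next token.
Scorer = Callable[[List[str]], Dict[str, float]]


def suffix_constrained_greedy(score: Scorer, max_len: int = 20) -> List[str]:
    out: List[str] = []
    # Phase 1: ordinary greedy decoding of the free-form reasoning trace.
    while len(out) < max_len:
        scores = score(out)
        best = max(scores, key=scores.get)
        if best == EOS:
            break
        out.append(best)
    # Phase 2: force the template prefix, then take the best-scoring
    # token among those the template allows (single digits here).
    out.extend(TEMPLATE_PREFIX)
    scores = score(out)
    allowed = {tok: s for tok, s in scores.items() if tok in DIGITS}
    out.append(max(allowed, key=allowed.get))
    out.append(EOS)
    return out


def toy_score(prefix: List[str]) -> Dict[str, float]:
    """Scripted stand-in for a language model (assumption, not a real LM)."""
    script = ["2", "plus", "2", "is", "4"]
    vocab = ["2", "plus", "is", "4", "7", EOS, "Final", "answer", ":"]
    scores = {tok: 0.0 for tok in vocab}
    if prefix[-3:] == TEMPLATE_PREFIX:
        # Unconstrained, the toy model would prefer to stop here;
        # the digit mask in phase 2 rules that out.
        scores["4"], scores["7"], scores[EOS] = 2.0, 1.0, 3.0
    elif len(prefix) < len(script):
        scores[script[len(prefix)]] = 1.0
    else:
        scores[EOS] = 1.0
    return scores


tokens = suffix_constrained_greedy(toy_score)
answer = tokens[-2]  # the answer sits at a fixed position before EOS
```

Because the response always ends with the template, the answer can be read from a fixed position and no separate information extraction step is needed. A real implementation would work on subword tokens and would have to handle multi-token answers, which this sketch deliberately ignores.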

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Multimodal Machine Learning Applications