Decomposition-Enhanced Training for Post-Hoc Attributions In Language Models

Sriram Balasubramanian; Samyadeep Basu; Koustava Goswami; Ryan Rossi; Varun Manjunatha; Roshan Santhosh; Ruiyi Zhang; Soheil Feizi; Nedim Lipka

arXiv:2510.25766 · cs.CL · November 7, 2025

TL;DR

This paper introduces DecompTune, a post-training method that teaches language models to produce answer decompositions as intermediate reasoning steps, with the aim of improving post-hoc attribution in complex, multi-hop question answering.

Contribution

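The paper proposes a decomposition-based post-training method and a curated dataset of complex QA tasks, annotated with decompositions by a strong LLM, to improve post-hoc attribution in language models. The method is reported to outperform prior attribution methods.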

Findings

01. DecompTune improves attribution quality in complex QA tasks.

02. Models trained with DecompTune outperform prior attribution methods.

03. DecompTune matches or exceeds state-of-the-art models in attribution accuracy.

Abstract

Large language models (LLMs) are increasingly used for long-document question answering, where reliable attribution to sources is critical for trust. Existing post-hoc attribution methods work well for extractive QA but struggle in multi-hop, abstractive, and semi-extractive settings, where answers synthesize information across passages. To address these challenges, we argue that post-hoc attribution can be reframed as a reasoning problem, where answers are decomposed into constituent units, each tied to specific context. We first show that prompting models to generate such decompositions alongside attributions improves performance. Building on this, we introduce DecompTune, a post-training method that teaches models to produce answer decompositions as intermediate reasoning steps. We curate a diverse dataset of complex QA tasks, annotated with decompositions by a strong LLM, and…
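
To make the idea of decomposition-based attribution concrete, the following is a minimal sketch, not the authors' implementation. It assumes the answer has already been decomposed by hand into constituent units, and it ties each unit to a context sentence using a simple Jaccard token-overlap score. That heuristic is a stand-in chosen for illustration; DecompTune itself trains a language model to produce the decompositions. The function names (`tokenize`, `attribute_unit`, `attribute_answer`) and the toy context are hypothetical.

```python
import re


def tokenize(text):
    """Lowercase word tokens as a set; a crude stand-in for semantic matching."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def attribute_unit(unit, context_sentences):
    """Return the index of the context sentence with the highest Jaccard
    overlap with the unit, or -1 if no sentence shares any token."""
    unit_tokens = tokenize(unit)
    best_index, best_score = -1, 0.0
    for i, sentence in enumerate(context_sentences):
        sent_tokens = tokenize(sentence)
        union = unit_tokens | sent_tokens
        score = len(unit_tokens & sent_tokens) / len(union) if union else 0.0
        if score > best_score:
            best_index, best_score = i, score
    return best_index


def attribute_answer(units, context_sentences):
    """Attribute each decomposed unit of an answer to one context sentence."""
    return {unit: attribute_unit(unit, context_sentences) for unit in units}


# Toy context (illustrative data, not from the paper).
context = [
    "Marie Curie was born in Warsaw in 1867.",
    "Warsaw is the capital of Poland.",
    "The Eiffel Tower is located in Paris.",
]

# Hand-written decomposition of the multi-hop answer
# "Marie Curie was born in the capital of Poland."
units = [
    "Marie Curie was born in Warsaw.",
    "Warsaw is the capital of Poland.",
]

attributions = attribute_answer(units, context)
for unit, index in attributions.items():
    print(f"{unit!r} -> context sentence {index}")
```

The point of the sketch is the shape of the problem: the synthesized answer matches no single sentence well, while each of its units maps cleanly to one source sentence. In a realistic setting the overlap score would be replaced by a learned model.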
