Long-Document QA with Chain-of-Structured-Thought and Fine-Tuned SLMs

Zhuowen Liang; Xiaotian Lin; Zhengxuan Zhang; Yuyu Luo; Haixun Wang; Nan Tang

arXiv:2603.29232 · cs.CL · April 1, 2026

TL;DR

This paper introduces LiteCoST, a framework that pairs structured reasoning templates with fine-tuned small language models (SLMs) to make long-document question answering both accurate and low-latency.

Contribution

It proposes a two-pillar approach, a structured reasoning template (CoST) plus targeted SLM fine-tuning, that lets small models perform comparably to large models on long-document QA.

Findings

01 Achieves LLM-like quality with 3B/7B models on multi-domain long-document QA.

02 Delivers 2-4x lower latency than GPT-4o and DeepSeek-R1.

03 Uses a structured reasoning template to guide small models in producing verifiable outputs.

Abstract

Large language models (LLMs) are widely applied to data analytics over documents, yet direct reasoning over long, noisy documents remains brittle and error-prone. Hence, we study document question answering (QA) that consolidates dispersed evidence into a structured output (e.g., a table, graph, or chunks) to support reliable, verifiable QA. We propose a two-pillar framework, LiteCoST, to achieve both high accuracy and low latency with small language models (SLMs). Pillar 1: Chain-of-Structured-Thought (CoST). We introduce a CoST template, a schema-aware instruction that guides a strong LLM to produce both a step-wise CoST trace and the corresponding structured output. The process induces a minimal structure, normalizes entities/units, aligns records, serializes the output, and verifies/refines it, yielding auditable supervision. Pillar 2: SLM fine-tuning. The compact models are trained…
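
The CoST steps named in the abstract (normalize entities/units, align records, serialize the output, verify) can be illustrated with a toy Python sketch. This is not the authors' implementation: the function names, the `UNIT_SCALE` table, the record layout, and the sample evidence are all hypothetical, chosen only to show what consolidating dispersed evidence into a table might look like.

```python
from collections import defaultdict

# Hypothetical unit multipliers (an assumption, not taken from the paper).
UNIT_SCALE = {"": 1.0, "thousand": 1e3, "million": 1e6, "billion": 1e9}

def normalize(record):
    """Normalize entity/field casing and convert the value to base units."""
    scale = UNIT_SCALE[record.get("unit", "").lower()]
    return {
        "entity": record["entity"].strip().lower(),
        "field": record["field"].strip().lower(),
        "value": float(record["value"]) * scale,
    }

def align(records):
    """Group normalized records into one row per entity."""
    rows = defaultdict(dict)
    for r in map(normalize, records):
        rows[r["entity"]][r["field"]] = r["value"]
    return dict(rows)

def serialize(rows, fields):
    """Serialize aligned rows as a pipe-delimited table."""
    lines = ["entity | " + " | ".join(fields)]
    for entity in sorted(rows):
        cells = [str(rows[entity].get(f, "")) for f in fields]
        lines.append(entity + " | " + " | ".join(cells))
    return "\n".join(lines)

def verify(rows, fields):
    """Return the (entity, field) pairs that are still missing."""
    return [(e, f) for e in sorted(rows) for f in fields if f not in rows[e]]

# Made-up evidence snippets, as if extracted from different parts of a document.
evidence = [
    {"entity": "Acme Corp ", "field": "Revenue", "value": "1.5", "unit": "billion"},
    {"entity": "acme corp", "field": "Employees", "value": "1200", "unit": ""},
    {"entity": "Globex", "field": "Revenue", "value": "300", "unit": "million"},
]
fields = ["revenue", "employees"]

rows = align(evidence)
table = serialize(rows, fields)
missing = verify(rows, fields)
print(table)
print("missing cells:", missing)
```

In this sketch the verify step only reports missing cells; a refinement loop would presumably return to the document to fill them, which is outside what the toy example covers.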

Code & Models

Repositories

HKUSTDial/LiteCoST (GitHub)

Videos

Long-Document QA with Chain-of-Structured-Thought and Fine-Tuned SLMs (SlidesLive)