Analyzing LLM Instruction Optimization for Tabular Fact Verification

Xiaotang Du; Giwon Hong; Wai-Chung Kwan; Rohit Saxena; Ivan Titov; Pasquale Minervini; Emily Allaway

arXiv:2602.17937 · cs.CL · February 23, 2026

TL;DR

This paper systematically compares instruction optimization techniques for large language models in tabular fact verification, demonstrating consistent accuracy improvements and analyzing the effects of different optimizers and prompting methods.

Contribution

The paper presents the first systematic comparison of instruction optimization methods for tabular fact verification, built on the DSPy framework, and identifies which optimizers work best with which prompting strategies.

Findings

1. MiPROv2 yields the most stable gains for Chain-of-Thought prompting.

2. SIMBA provides the largest benefits for ReAct agents, particularly at larger model scales.

3. Instruction optimization consistently improves verification accuracy.

Abstract

Instruction optimization provides a lightweight, model-agnostic approach to enhancing the reasoning performance of large language models (LLMs). This paper presents the first systematic comparison of instruction optimization, based on the DSPy optimization framework, for tabular fact verification. We evaluate four out-of-the-box prompting techniques that cover both text-only prompting and code use: direct prediction, Chain-of-Thought (CoT), ReAct with SQL tools, and CodeAct with Python execution. We study three optimizers from the DSPy framework -- COPRO, MiPROv2, and SIMBA -- across four benchmarks and three model families. We find that instruction optimization consistently improves verification accuracy, with MiPROv2 yielding the most stable gains for CoT, and SIMBA providing the largest benefits for ReAct agents, particularly at larger model scales. Behavioral analyses reveal that…
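
The evaluation setup described in the abstract can be pictured with a minimal sketch. It assumes every prompting technique is paired with every optimizer plus an unoptimized baseline, and that verification labels are booleans (claim supported or not). The pairing scheme, the function names, and the label format are illustrative assumptions, not details taken from the paper, and the sketch does not call DSPy itself.

```python
from itertools import product

# Names taken from the abstract; the short labels are illustrative.
TECHNIQUES = ["direct", "CoT", "ReAct+SQL", "CodeAct+Python"]
OPTIMIZERS = ["COPRO", "MiPROv2", "SIMBA"]


def experiment_grid(techniques=TECHNIQUES, optimizers=OPTIMIZERS):
    """Pair each technique with each optimizer, plus None as an
    unoptimized baseline (the baseline is an assumption of this sketch)."""
    return list(product(techniques, [None, *optimizers]))


def verification_accuracy(predictions, gold):
    """Fraction of claims whose predicted label matches the gold label."""
    if len(predictions) != len(gold):
        raise ValueError("predictions and gold must have the same length")
    if not gold:
        raise ValueError("cannot compute accuracy on an empty set")
    correct = sum(p == g for p, g in zip(predictions, gold))
    return correct / len(gold)


def accuracy_gain(baseline_acc, optimized_acc):
    """Absolute accuracy change attributable to instruction optimization."""
    return optimized_acc - baseline_acc
```

Under these assumptions the grid holds 16 configurations per benchmark and model (4 techniques, each with 3 optimizers and 1 baseline), and `accuracy_gain` compares an optimized configuration with the baseline for the same technique.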

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling · Multimodal Machine Learning Applications