Error Detection and Correction for Interpretable Mathematics in Large Language Models

Yijin Yang; Cristina Cornelio; Mario Leiva; Paulo Shakarian

arXiv:2508.03500·cs.AI·August 6, 2025

TL;DR

This paper presents EDCIM, a method that detects and corrects errors in large language models' mathematical reasoning, improving accuracy and efficiency in generating interpretable solutions.

Contribution

Introduces EDCIM, a novel framework combining symbolic error detection with LLM correction for interpretable mathematics tasks, optimizing cost and accuracy trade-offs.

Findings

1. EDCIM reduces computational costs significantly.

2. EDCIM maintains or improves prediction accuracy.

3. The hyperparameter effectively balances cost and accuracy.

Abstract

Recent large language models (LLMs) have demonstrated the ability to perform explicit multi-step reasoning, for example through chain-of-thought prompting. However, their intermediate steps often contain errors that can propagate, leading to inaccurate final predictions. Additionally, LLMs still struggle with hallucinations and often fail to adhere to prescribed output formats, which is particularly problematic for tasks like generating mathematical expressions or source code. This work introduces EDCIM (Error Detection and Correction for Interpretable Mathematics), a method for detecting and correcting these errors in interpretable mathematics tasks, where the model must generate the exact functional form that explicitly solves the problem (expressed in natural language) rather than a black-box solution. EDCIM uses LLMs to generate a system of equations for a given problem, followed by a symbolic…
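To make the output-format problem mentioned in the abstract concrete, the following is a minimal sketch of a format check over LLM-generated equation lines. It is not EDCIM's detector, whose description is truncated above; the function names `check_equation` and `detect_errors`, the `lhs = rhs` line format, and the whitelist of allowed arithmetic constructs are all assumptions made for illustration.

```python
import ast

# Node types accepted in a plain arithmetic expression.
# This whitelist is an assumption of the sketch, not part of EDCIM.
_ALLOWED = (
    ast.Expression, ast.BinOp, ast.UnaryOp, ast.Name, ast.Constant, ast.Load,
    ast.Add, ast.Sub, ast.Mult, ast.Div, ast.Pow, ast.USub, ast.UAdd,
)


def check_equation(line):
    """Return (True, variable_names) or (False, reason) for one 'lhs = rhs' line."""
    parts = line.split("=")
    if len(parts) != 2:
        return False, "expected exactly one '='"
    names = set()
    for side in parts:
        try:
            tree = ast.parse(side.strip(), mode="eval")
        except SyntaxError:
            return False, "side is not a valid expression"
        for node in ast.walk(tree):
            if not isinstance(node, _ALLOWED):
                return False, "unsupported construct: " + type(node).__name__
            if isinstance(node, ast.Constant) and not isinstance(node.value, (int, float)):
                return False, "non-numeric constant"
            if isinstance(node, ast.Name):
                names.add(node.id)
    return True, names


def detect_errors(equations):
    """Collect (index, reason) for every line that fails the format check."""
    errors = []
    for i, line in enumerate(equations):
        ok, info = check_equation(line)
        if not ok:
            errors.append((i, info))
    return errors


if __name__ == "__main__":
    candidate = ["x + y = 10", "x - = 2", "so the answer is 6"]
    for index, reason in detect_errors(candidate):
        print(index, reason)
```

In a pipeline of the kind the abstract describes, lines flagged this way would be the candidates handed to a correction step; how EDCIM actually selects and corrects them is not specified in the text available here.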
