Controlling Equational Reasoning in Large Language Models with Prompt Interventions

Jordan Meadows; Marco Valentino; Andre Freitas

arXiv:2307.09998 · cs.CL · January 14, 2025 · 2 cites

TL;DR

This paper explores controlling hallucination rates in large language models through prompt interventions, using symbolic data generation to analyze and improve mathematical derivation accuracy.

Contribution

It introduces a symbolic framework for data generation and prompt interventions to systematically study and control mathematical errors in LLMs.

Findings

01 T5-Large can outperform the few-shot performance of GPT-4 on various evaluation sets generated via the framework.

02 Prompt interventions influence derivation quality and error distribution.

03 Human evaluation reveals weaknesses not captured by reference-based metrics.

Abstract

This paper investigates how hallucination rates in Large Language Models (LLMs) may be controlled via a symbolic data generation framework, exploring a fundamental relationship between the rate of certain mathematical errors and types of input intervention. Specifically, we systematically generate data for a derivation generation task using a symbolic engine, applying targeted interventions to prompts to perturb features of mathematical derivations such as the surface forms of symbols, equational tree structures, and mathematical context. We then evaluate the effect of prompt interventions across a range of LLMs including fine-tuned T5 models, GPT, and LLaMa-based models. Our experiments suggest that T5-Large can outperform the few-shot performance of GPT-4 on various evaluation sets generated via the framework. However, an extensive evaluation based on human analysis, template-based…
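
To make one of the interventions named in the abstract concrete, the sketch below perturbs the surface forms of symbols in a small derivation while leaving its structure intact. It is an illustration only and not the authors' framework, which relies on a symbolic engine: the `swap_symbols` function, the `TOKEN` pattern, the space-separated equation format, and the example derivation are all assumptions introduced here.

```python
import re

# Hypothetical tokenizer: alphabetic identifiers, integers,
# or any single non-space character (operators, brackets).
TOKEN = re.compile(r"[A-Za-z]+|\d+|\S")


def swap_symbols(derivation, mapping):
    """Rename symbols consistently across every step of a derivation.

    derivation: list of equation strings, one per step.
    mapping: dict from original symbol to replacement symbol.
    Tokens that are not in the mapping are left unchanged.
    """
    perturbed = []
    for step in derivation:
        tokens = TOKEN.findall(step)
        perturbed.append(" ".join(mapping.get(tok, tok) for tok in tokens))
    return perturbed


# Illustrative two-step derivation (assumed, not taken from the paper).
derivation = ["E = m * c ^ 2", "E / m = c ^ 2"]
mapping = {"E": "A", "m": "b", "c": "k"}

for original, changed in zip(derivation, swap_symbols(derivation, mapping)):
    print(f"{original}  ->  {changed}")
```

Because the same mapping is applied to every step, the perturbed derivation stays mathematically equivalent to the original, so any change in a model's output can be attributed to the symbol names rather than to the underlying mathematics.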

Taxonomy

Topics: Topic Modeling · Mathematics, Computing, and Information Processing

Methods: Gated Linear Unit · Multi-Head Attention · Attention Is All You Need · Byte Pair Encoding · Weight Decay · Discriminative Fine-Tuning · Residual Connection · Adam · Layer Normalization