NeuroProlog: Multi-Task Fine-Tuning for Neurosymbolic Mathematical Reasoning via the Cocktail Effect
Pratibha Zunjare, Michael Hsiao

TL;DR
NeuroProlog introduces a neurosymbolic framework with multi-task training and execution-guided decoding to improve mathematical reasoning accuracy and verifiability in large language models.
Contribution
It presents a novel multi-task Cocktail training strategy and an execution-guided decoding pipeline that enhance symbolic reasoning and self-debugging in LLMs for math problems.
Findings
Significant accuracy improvements across model scales.
Enhanced error correction capabilities at larger scales.
Scale-dependent learning dynamics and reasoning thresholds.
Abstract
Large Language Models (LLMs) achieve strong performance on natural language tasks but remain unreliable in mathematical reasoning, frequently generating fluent yet logically inconsistent solutions. We present NeuroProlog, a neurosymbolic framework that ensures verifiable reasoning by compiling math word problems into executable Prolog programs with formal verification guarantees. We propose a multi-task Cocktail training strategy that jointly optimizes three synergistic objectives in a unified symbolic representation space: (i) mathematical formula-to-rule translation (KB), (ii) natural language-to-program synthesis (SOLVE), and (iii) program-answer alignment. This joint supervision enables positive transfer, where symbolic grounding in formula translation directly improves compositional reasoning capabilities. At inference, we introduce an execution-guided decoding pipeline…
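As a rough illustration of the multi-task mixing idea described in the abstract, the sketch below tags examples from the three objectives and shuffles them into a single training set. It is not the authors' implementation: the `Example` class, the `build_cocktail` function, the `ALIGN` label, the bracketed task prefix, the uniform mixing, and the sample Prolog strings are all assumptions made for illustration, since the summary does not specify the data format or mixing ratios.

```python
import random
from dataclasses import dataclass


@dataclass
class Example:
    task: str    # "KB", "SOLVE", or "ALIGN" ("ALIGN" is a hypothetical label)
    prompt: str  # model input, prefixed with a task tag (assumed convention)
    target: str  # expected output for this task


def build_cocktail(kb, solve, align, seed=0):
    """Tag each (prompt, target) pair with its task and shuffle into one set."""
    mixed = []
    for task, pairs in (("KB", kb), ("SOLVE", solve), ("ALIGN", align)):
        for prompt, target in pairs:
            mixed.append(Example(task, f"[{task}] {prompt}", target))
    # Seeded shuffle so the interleaving of tasks is reproducible.
    random.Random(seed).shuffle(mixed)
    return mixed


# Illustrative data only; the Prolog snippets are hand-written examples.
kb = [("Area of a rectangle: A = l * w", "rect_area(L, W, A) :- A is L * W.")]
solve = [("A garden is 3 m by 4 m. What is its area?", "solve(A) :- A is 3 * 4.")]
align = [("solve(A) :- A is 3 * 4.", "12")]

mixed = build_cocktail(kb, solve, align)
for ex in mixed:
    print(ex.task, "|", ex.prompt, "->", ex.target)
```

Keeping all three tasks in one shuffled stream means every batch can contain formula-to-rule, problem-to-program, and program-to-answer examples, which is one plausible way to realize the joint supervision the abstract describes.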
Taxonomy
Topics: Mathematics, Computing, and Information Processing · Topic Modeling · Model Reduction and Neural Networks
