Ontology-Guided Neuro-Symbolic Inference: Grounding Language Models with Mathematical Domain Knowledge
Marcelo Labre

TL;DR
This paper tests whether a formal mathematical ontology can make language models' reasoning more reliable by injecting relevant domain definitions into prompts. Using mathematics as a proof of concept, it shows both the benefits and the challenges of the approach.
Contribution
It introduces a neuro-symbolic pipeline using the OpenMath ontology to enhance language models with domain-specific knowledge for mathematical reasoning.
Findings
Ontology-guided context improves model performance when retrieval quality is high.
Irrelevant retrieved context actively degrades model accuracy.
The results highlight both the promise and the limitations of neuro-symbolic methods in formal reasoning.
Abstract
Language models exhibit fundamental limitations -- hallucination, brittleness, and lack of formal grounding -- that are particularly problematic in high-stakes specialist fields requiring verifiable reasoning. I investigate whether formal domain ontologies can enhance language model reliability through retrieval-augmented generation. Using mathematics as proof of concept, I implement a neuro-symbolic pipeline leveraging the OpenMath ontology with hybrid retrieval and cross-encoder reranking to inject relevant definitions into model prompts. Evaluation on the MATH benchmark with three open-source models reveals that ontology-guided context improves performance when retrieval quality is high, but irrelevant context actively degrades it -- highlighting both the promise and challenges of neuro-symbolic approaches.
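The pipeline described in the abstract (hybrid retrieval, reranking, then injecting definitions into the prompt) can be illustrated with a minimal, standard-library-only sketch. Everything here is an assumption for illustration, not the paper's implementation: the `ONTOLOGY` entries are made-up OpenMath-style definitions, token overlap stands in for a lexical retriever, character-trigram cosine stands in for dense embeddings, reciprocal rank fusion is one common way to combine the two, `rerank` is a simple placeholder where a cross-encoder model would go, and the `min_score` threshold is a hypothetical guard against injecting irrelevant context.

```python
import math
import re
from collections import Counter

# Hypothetical mini-ontology: names and definitions are illustrative only,
# not copied from the OpenMath content dictionaries.
ONTOLOGY = {
    "arith1.gcd": "greatest common divisor of integers",
    "arith1.lcm": "least common multiple of integers",
    "calculus1.diff": "derivative of a unary function",
    "linalg1.determinant": "determinant of a square matrix",
}

def tokens(text):
    return re.findall(r"[a-z0-9]+", text.lower())

def lexical_score(query, doc):
    """Fraction of query tokens found in the document (stand-in for BM25)."""
    q, d = set(tokens(query)), set(tokens(doc))
    return len(q & d) / len(q) if q else 0.0

def trigram_vector(text):
    s = " ".join(tokens(text))
    return Counter(s[i:i + 3] for i in range(len(s) - 2))

def dense_score(query, doc):
    """Cosine similarity of character-trigram counts (stand-in for embeddings)."""
    a, b = trigram_vector(query), trigram_vector(doc)
    dot = sum(a[k] * b[k] for k in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0

def rank(scores):
    """Map each key to its 1-based rank, best score first."""
    ordered = sorted(scores, key=lambda k: scores[k], reverse=True)
    return {k: i + 1 for i, k in enumerate(ordered)}

def hybrid_retrieve(query, top_k=3, rrf_k=60):
    """Fuse lexical and dense rankings with reciprocal rank fusion."""
    lex = rank({k: lexical_score(query, v) for k, v in ONTOLOGY.items()})
    den = rank({k: dense_score(query, v) for k, v in ONTOLOGY.items()})
    fused = {k: 1 / (rrf_k + lex[k]) + 1 / (rrf_k + den[k]) for k in ONTOLOGY}
    return sorted(fused, key=fused.get, reverse=True)[:top_k]

def rerank(query, candidates):
    """Placeholder reranker: a real system would score each
    (query, definition) pair with a cross-encoder model here."""
    scored = [
        (0.5 * lexical_score(query, ONTOLOGY[c])
         + 0.5 * dense_score(query, ONTOLOGY[c]), c)
        for c in candidates
    ]
    return sorted(scored, reverse=True)

def build_prompt(problem, min_score=0.3):
    """Inject only definitions whose rerank score clears the threshold."""
    kept = [(s, c) for s, c in rerank(problem, hybrid_retrieve(problem))
            if s >= min_score]
    if not kept:
        return problem  # no confident match: leave the prompt untouched
    context = "\n".join(f"{c}: {ONTOLOGY[c]}" for _, c in kept)
    return f"Definitions:\n{context}\n\nProblem: {problem}"

print(build_prompt("What is the greatest common divisor of 12 and 18?"))
```

The threshold in `build_prompt` reflects the abstract's observation that irrelevant context degrades performance: when nothing scores well, this sketch falls back to the bare problem rather than padding the prompt with weak matches.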
Taxonomy
Topics: Mathematics, Computing, and Information Processing · Topic Modeling · Machine Learning in Materials Science
