FregeLogic at SemEval 2026 Task 11: A Hybrid Neuro-Symbolic Architecture for Content-Robust Syllogistic Validity Prediction

Adewale Akinfaderin; Nafi Diallo

arXiv:2604.18328 · cs.CL · April 21, 2026

TL;DR

FregeLogic is a hybrid neuro-symbolic system that combines an ensemble of LLM classifiers with a Z3 SMT solver to predict syllogistic validity more accurately while reducing content bias.

Contribution

The paper introduces a novel neuro-symbolic architecture that pairs an ensemble of LLM classifiers with a Z3 SMT solver, used as a formal-logic tiebreaker, to make syllogistic validity prediction more robust to content effects.

Findings

01

Achieved 94.3% accuracy with a content effect of 2.85 and a combined score of 41.88 in nested 5-fold cross-validation (N=960).

02

Improved the combined score by 2.76 points over the pure LLM ensemble.

03

Reduced Z3 failure rate from ~22% to near zero using structured API calls.

Abstract

We present FregeLogic, a hybrid neuro-symbolic system for SemEval-2026 Task 11 (Subtask 1), which addresses syllogistic validity prediction while reducing content effects on predictions. Our approach combines an ensemble of five LLM classifiers, spanning three open-weights models (Llama 4 Maverick, Llama 4 Scout, and Qwen3-32B) paired with varied prompting strategies, with a Z3 SMT solver that serves as a formal logic tiebreaker. The central hypothesis is that LLM disagreement within the ensemble signals likely content-biased errors, where real-world believability interferes with logical judgment. By deferring to Z3's structurally-grounded formal verification on these disputed cases, our system achieves 94.3% accuracy with a content effect of 2.85 and a combined score of 41.88 in nested 5-fold cross-validation on the dataset (N=960). This represents a 2.76-point improvement in combined…
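The deferral idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. Two things are assumptions made only for illustration: "disagreement" is taken to mean any non-unanimous vote (the excerpt does not state the actual rule), and the formal check is a brute-force search for a countermodel over inhabited Venn regions, swapped in for the Z3 solver so that the example needs no external packages. The names `formally_valid` and `predict` are hypothetical, and the check uses modern semantics without existential import.

```python
from itertools import product
from typing import Callable, List, Sequence, Tuple

# A categorical statement is (form, subject, predicate) with form in A/E/I/O:
# A: All S are P   E: No S are P   I: Some S are P   O: Some S are not P
Statement = Tuple[str, str, str]
Region = Tuple[bool, ...]


def holds(stmt: Statement, inhabited: List[Region], terms: List[str]) -> bool:
    """Evaluate one statement in a model given by its inhabited Venn regions."""
    form, s, p = stmt
    si, pi = terms.index(s), terms.index(p)
    s_and_p = any(r[si] and r[pi] for r in inhabited)
    s_not_p = any(r[si] and not r[pi] for r in inhabited)
    return {"A": not s_not_p, "E": not s_and_p, "I": s_and_p, "O": s_not_p}[form]


def formally_valid(premises: Sequence[Statement], conclusion: Statement) -> bool:
    """Stand-in for the solver: valid iff no model satisfies the premises
    while falsifying the conclusion (assumption: no existential import)."""
    terms = sorted({t for _, s, p in [*premises, conclusion] for t in (s, p)})
    regions = list(product([False, True], repeat=len(terms)))
    # Enumerate every choice of which regions contain at least one individual.
    for mask in product([False, True], repeat=len(regions)):
        inhabited = [r for r, keep in zip(regions, mask) if keep]
        premises_hold = all(holds(p, inhabited, terms) for p in premises)
        if premises_hold and not holds(conclusion, inhabited, terms):
            return False  # countermodel found
    return True


def predict(votes: Sequence[bool], formal_check: Callable[[], bool]) -> bool:
    """Keep a unanimous ensemble vote; otherwise defer to the formal check.
    Unanimity as the agreement criterion is an illustrative assumption."""
    if all(v == votes[0] for v in votes):
        return votes[0]
    return formal_check()


# Hypothetical example: an undistributed-middle syllogism with a split vote.
premises = [("A", "P", "M"), ("A", "S", "M")]  # All P are M; All S are M
conclusion = ("A", "S", "P")                   # All S are P
label = predict([True, True, False, True, True],
                lambda: formally_valid(premises, conclusion))
```

Passing the formal check as a callable means it runs only on disputed items, which mirrors the tiebreaker role the abstract describes for Z3.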
