LFC-DA: Logical Formula-Controlled Data Augmentation for Enhanced Logical Reasoning
Shenghao Li

TL;DR
LFC-DA introduces a symbolic-logic-controlled data augmentation pipeline that enhances logical reasoning in models by systematically generating diverse, logically rigorous natural language questions through propositional logic and rule-based search.
Contribution
The paper presents an interpretable data augmentation method that combines symbolic logic with large language models to improve logical reasoning accuracy.
Findings
Pretrained models show significant accuracy improvements on ReClor and LogiQA.
The pipeline generates diverse, logically rigorous questions.
The results demonstrate the benefit of logic-controlled augmentation for LLMs.
Abstract
For complex logical data augmentation, heavy reliance on human annotation is costly, whereas direct generation with large language models yields uninterpretable and logically homogeneous examples. To address this, we present LFC-DA, a symbolic-logic-controlled pipeline: logical text is first mapped to propositional expressions, a compact rule library is compiled, and a bounded state-space search systematically discovers valid formulas that are then verbalized back into natural-language questions, ensuring both diversity and logical rigor under propositional logic. Experiments on ReClor and LogiQA show significant improvements in the logical-reasoning accuracy of pretrained models, confirming the effectiveness of LFC-DA for LLM-guided logical data augmentation.
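The search stage described in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from the paper: the tuple encoding of formulas, the three-rule library (contraposition, hypothetical syllogism, modus ponens), the depth and size bounds, and the template-based `verbalize` function, which stands in for the LLM verbalization step.

```python
from itertools import product

# Hypothetical encoding: an atom is a string such as "p"; compound formulas
# are tuples ("not", f) or ("imp", antecedent, consequent).

def neg(f):
    """Negate a formula, collapsing double negation."""
    if isinstance(f, tuple) and f[0] == "not":
        return f[1]
    return ("not", f)

def is_imp(f):
    return isinstance(f, tuple) and f[0] == "imp"

# --- A tiny illustrative rule library (not the paper's actual rules) ---

def contraposition(f):
    # (a -> b) yields (not b -> not a)
    if is_imp(f):
        yield ("imp", neg(f[2]), neg(f[1]))

def hypothetical_syllogism(f, g):
    # (a -> b) and (b -> c) yield (a -> c)
    if is_imp(f) and is_imp(g) and f[2] == g[1]:
        yield ("imp", f[1], g[2])

def modus_ponens(f, g):
    # a and (a -> b) yield b
    if is_imp(g) and g[1] == f:
        yield g[2]

UNARY_RULES = [contraposition]
BINARY_RULES = [hypothetical_syllogism, modus_ponens]

def bounded_search(premises, max_depth=2, max_formulas=50):
    """Apply the rules for at most max_depth rounds; return derived formulas."""
    premises = set(premises)
    known = set(premises)
    for _ in range(max_depth):
        new = set()
        for f in known:
            for rule in UNARY_RULES:
                new.update(rule(f))
        for f, g in product(known, repeat=2):
            for rule in BINARY_RULES:
                new.update(rule(f, g))
        new -= known
        if not new:
            break  # fixed point reached before the depth bound
        known |= new
        if len(known) >= max_formulas:
            break  # soft cap on the size of the explored state space
    return known - premises

def verbalize(f, names):
    """Template-based stand-in for the LLM verbalization step."""
    if isinstance(f, str):
        return names[f]
    if f[0] == "not":
        return "it is not the case that " + verbalize(f[1], names)
    if f[0] == "imp":
        return "if " + verbalize(f[1], names) + ", then " + verbalize(f[2], names)
    raise ValueError(f"unsupported operator: {f[0]!r}")

if __name__ == "__main__":
    premises = {("imp", "p", "q"), ("imp", "q", "r")}
    names = {"p": "it rains", "q": "the river rises", "r": "the road floods"}
    for formula in sorted(bounded_search(premises), key=repr):
        print(verbalize(formula, names))
```

Because every derived formula comes from a sound inference rule, each generated statement is entailed by the premises, which is the sense in which a rule-driven search can keep augmented examples logically valid. The depth and size bounds keep the enumeration finite.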
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Multimodal Machine Learning Applications
