Automated Generation of Massive Reasonable Empirical Theorems by Forward Reasoning Based on Strong Relevant Logics -- A Solution to the Problem of LLM Pre-training Data Exhaustion
Jingde Cheng

TL;DR
This paper introduces a method for automatically generating a large number of plausible empirical theorems using forward reasoning grounded in strong relevant logics, addressing the data exhaustion issue in LLM pre-training.
Contribution
It presents a novel approach combining forward reasoning with strong relevant logics to generate empirical theorems, enhancing LLM training data without additional human input.
Findings
Successfully generated massive sets of empirical theorems
Improved the diversity and reasoning capacity of generated theorems
Addresses pre-training data exhaustion in large language models
Abstract
Recently, it is often said that the data used for the pre-training of large language models (LLMs) have been exhausted. This paper proposes a solution to the problem: Automated generation of massive reasonable empirical theorems by forward reasoning based on strong relevant logics. In fact, this can be regarded as a part of our approach to the problems of ATF (Automated Theorem Finding) and AKA (Automated Knowledge Appreciation).
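To make the term "forward reasoning" concrete, here is a minimal sketch of naive forward chaining: it starts from known premises and repeatedly applies rules until nothing new can be derived. It is an illustration only and not the paper's method. It uses plain modus ponens over string-labelled facts rather than a strong relevant logic, and the `forward_chain` function, the `Rule` type, and the toy facts and rules are all hypothetical names chosen for this example.

```python
from typing import FrozenSet, List, Set, Tuple

# A rule is (premises, conclusion): once every premise is known,
# the conclusion may be derived. This is ordinary modus ponens,
# NOT an inference rule of a strong relevant logic.
Rule = Tuple[FrozenSet[str], str]


def forward_chain(facts: Set[str], rules: List[Rule]) -> Set[str]:
    """Return the closure of `facts` under `rules` (naive forward chaining)."""
    known = set(facts)
    changed = True
    while changed:  # stop when a full pass derives nothing new
        changed = False
        for premises, conclusion in rules:
            if conclusion not in known and premises <= known:
                known.add(conclusion)
                changed = True
    return known


# Hypothetical toy premises and rules, for illustration only.
facts = {"human(socrates)"}
rules = [
    (frozenset({"human(socrates)"}), "mortal(socrates)"),
    (frozenset({"mortal(socrates)"}), "will_die(socrates)"),
    (frozenset({"god(zeus)"}), "immortal(zeus)"),  # never fires: premise unknown
]

# Newly derived statements, excluding the starting premises.
derived = forward_chain(facts, rules) - facts
print(sorted(derived))  # ['mortal(socrates)', 'will_die(socrates)']
```

The loop derives only what follows from the given premises, which is the sense in which forward reasoning can produce new statements without a target goal being specified in advance.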
Taxonomy
Topics: Rough Sets and Fuzzy Logic · Semantic Web and Ontologies · AI-based Problem Solving and Planning
