Automated Generation of Massive Reasonable Empirical Theorems by Forward Reasoning Based on Strong Relevant Logics -- A Solution to the Problem of LLM Pre-training Data Exhaustion

Jingde Cheng

arXiv:2412.12408·cs.AI·December 18, 2024

TL;DR

This paper introduces a method for automatically generating a large number of plausible empirical theorems using forward reasoning grounded in strong relevant logics, addressing the data exhaustion issue in LLM pre-training.

Contribution

It presents a novel approach combining forward reasoning with strong relevant logics to generate empirical theorems, enhancing LLM training data without additional human input.
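
To make the term "forward reasoning" concrete, here is a minimal sketch of the general idea: start from known premises and repeatedly apply inference rules until no new conclusions appear. This is an illustration only, not the paper's system. It assumes simple if-then rules applied by modus ponens over plain strings and does not implement strong relevant logics; the function name `forward_chain` and the toy facts and rules are hypothetical.

```python
from typing import FrozenSet, List, Set, Tuple

# A rule is (premises, conclusion): once every premise is known,
# the conclusion may be added. This is an assumed toy encoding.
Rule = Tuple[FrozenSet[str], str]


def forward_chain(facts: Set[str], rules: List[Rule]) -> Set[str]:
    """Return all statements derivable from `facts` by applying `rules` to a fixed point."""
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            # Fire the rule only if all premises hold and it yields something new.
            if premises <= known and conclusion not in known:
                known.add(conclusion)
                changed = True
    return known


# Hypothetical toy premises, not taken from the paper.
facts = {"bird(tweety)"}
rules = [
    (frozenset({"bird(tweety)"}), "has_wings(tweety)"),
    (frozenset({"has_wings(tweety)"}), "can_fly(tweety)"),
    (frozenset({"fish(nemo)"}), "can_swim(nemo)"),  # never fires: premise is unknown
]

# Newly derived statements, excluding the starting premises.
derived = forward_chain(facts, rules) - facts
print(sorted(derived))
```

The loop stops when a full pass over the rules adds nothing new, so every statement in the result was reached from the premises. A relevance-based logic would additionally restrict which conclusions count as acceptable, which this sketch does not attempt.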

Findings

01 Successfully generated massive sets of empirical theorems

02 Improved the diversity and reasoning capacity of generated theorems

03 Addresses pre-training data exhaustion in large language models

Abstract

Recently, it is often said that the data used for the pre-training of large language models (LLMs) have been exhausted. This paper proposes a solution to the problem: Automated generation of massive reasonable empirical theorems by forward reasoning based on strong relevant logics. In fact, this can be regarded as a part of our approach to the problems of ATF (Automated Theorem Finding) and AKA (Automated Knowledge Appreciation).

Taxonomy

Topics: Rough Sets and Fuzzy Logic · Semantic Web and Ontologies · AI-based Problem Solving and Planning