Distilling Wikipedia mathematical knowledge into neural network models

Joanne T. Kim; Mikel Landajuela; Brenden K. Petersen

arXiv:2104.05930 · cs.LG · July 6, 2022 · 1 citation

Open Access

TL;DR

This paper introduces a pipeline that extracts mathematical expressions from Wikipedia and encodes them symbolically, producing a training corpus for machine learning on symbolic mathematics. It shows that a language model trained on this corpus improves neural-guided search for symbolic regression.

Contribution

It presents a novel method for distilling Wikipedia's mathematical content into symbolic encodings, enabling improved neural-guided symbolic regression.

Findings

01 Enhanced performance in symbolic regression tasks

02 Effective encoding of Wikipedia's mathematical expressions

03 A new resource for symbolic mathematics machine learning

Abstract

Machine learning applications to symbolic mathematics are becoming increasingly popular, yet there is no centralized source of real-world symbolic expressions to use as training data. In contrast, the field of natural language processing leverages resources like Wikipedia that provide enormous amounts of real-world textual data. Adopting the philosophy of "mathematics as language," we bridge this gap by introducing a pipeline for distilling mathematical expressions embedded in Wikipedia into symbolic encodings to be used in downstream machine learning tasks. We demonstrate that a mathematical language model trained on this "corpus" of expressions can be used as a prior to improve the performance of neural-guided search for the task of symbolic regression.
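
To make the idea of a "mathematical language model used as a prior" concrete, here is a minimal sketch in Python. It is not the authors' pipeline or model: the toy corpus, the nested-tuple expression format, the prefix (pre-order) token encoding, the bigram model with add-one smoothing, and every function name below are assumptions chosen for illustration. The sketch encodes expressions as token sequences, fits next-token statistics on them, and adds the resulting log-probabilities to a search policy's logits before renormalizing.

```python
import math
from collections import Counter, defaultdict

# Hypothetical toy corpus: expressions as nested tuples (operator, *operands).
# A real corpus would come from expressions extracted from Wikipedia.
CORPUS = [
    ("add", ("mul", "x", "x"), ("sin", "x")),
    ("mul", ("sin", "x"), ("cos", "x")),
    ("add", "x", ("mul", "x", "x")),
]

START = "<s>"  # assumed start-of-sequence marker


def to_prefix(expr):
    """Flatten an expression tree into a pre-order (prefix) token list."""
    if isinstance(expr, str):
        return [expr]
    op, *args = expr
    tokens = [op]
    for arg in args:
        tokens.extend(to_prefix(arg))
    return tokens


def fit_bigram(corpus):
    """Count token bigrams over the prefix encodings of the corpus."""
    counts = defaultdict(Counter)
    vocab = set()
    for expr in corpus:
        tokens = [START] + to_prefix(expr)
        vocab.update(tokens[1:])
        for prev, cur in zip(tokens, tokens[1:]):
            counts[prev][cur] += 1
    return counts, sorted(vocab)


def prior_log_probs(counts, vocab, prev):
    """Add-one smoothed log P(token | prev) for every token in vocab."""
    total = sum(counts[prev].values()) + len(vocab)
    return {t: math.log((counts[prev][t] + 1) / total) for t in vocab}


def apply_prior(policy_logits, log_prior):
    """Add prior log-probabilities to policy logits, then softmax."""
    scores = {t: policy_logits[t] + log_prior[t] for t in policy_logits}
    m = max(scores.values())
    z = sum(math.exp(s - m) for s in scores.values())
    return {t: math.exp(s - m) / z for t, s in scores.items()}


counts, vocab = fit_bigram(CORPUS)
uniform_policy = {t: 0.0 for t in vocab}  # stand-in for a neural policy's logits
probs = apply_prior(uniform_policy, prior_log_probs(counts, vocab, "sin"))
```

With a uniform stand-in policy, `probs` is simply the smoothed prior: after the token `sin`, the variable `x` receives the most mass because that is the only continuation seen in the toy corpus. In an actual neural-guided search, the policy logits would come from a trained network and the prior would bias sampling toward token sequences that resemble real-world expressions.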

Taxonomy

Topics: Evolutionary Algorithms and Applications · Model Reduction and Neural Networks · Artificial Intelligence in Games