Lessons on Datasets and Paradigms in Machine Learning for Symbolic Computation: A Case Study on CAD
Tereso del Río, Matthew England

TL;DR
This paper draws lessons on using machine learning to make choices within symbolic computation algorithms, stressing the need to analyse datasets before training and the range of learning paradigms available. The case study is the selection of a variable ordering for cylindrical algebraic decomposition (CAD).
Contribution
It highlights the importance of dataset analysis and augmentation in applying machine learning to symbolic computation, and shows how classification can be reframed as regression for broader applicability.
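To make the reframing concrete, the sketch below contrasts the two views on toy data. In the classification view each variable ordering is a class label; in the regression view a model predicts a cost for each ordering and the cheapest prediction is selected. Everything here is an assumption made for illustration: the feature vectors, the cost values, the nearest-neighbour regressor, and the names `knn_predict` and `choose_ordering` are hypothetical and are not taken from the paper.

```python
import math

def knn_predict(rows, features, k=1):
    """Predict a cost as the mean target of the k nearest training rows.

    rows: list of (feature_vector, observed_cost) pairs.
    """
    nearest = sorted(rows, key=lambda row: math.dist(row[0], features))[:k]
    return sum(cost for _, cost in nearest) / len(nearest)

def choose_ordering(train_by_ordering, features, k=1):
    """Regression view: predict one cost per ordering, return the cheapest."""
    predicted = {
        ordering: knn_predict(rows, features, k)
        for ordering, rows in train_by_ordering.items()
    }
    best = min(predicted, key=predicted.get)
    return best, predicted

# Toy data (hypothetical): two features per problem, and an observed cost
# for that problem under each of the two possible orderings of x and y.
toy = {
    ("x", "y"): [((1.0, 2.0), 0.5), ((3.0, 1.0), 4.0)],
    ("y", "x"): [((1.0, 2.0), 2.0), ((3.0, 1.0), 1.0)],
}

best, costs = choose_ordering(toy, (1.1, 2.0))
print(best, costs)
```

One reason the regression view can be attractive is that it uses the cost of every ordering as training signal, rather than only the identity of the best one.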
Findings
Dataset imbalance affects machine learning performance in symbolic computation.
Augmentation techniques improve model accuracy by up to 38%.
Recasting classification as regression broadens methodological scope.
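The summary above does not say how the dataset is augmented. One plausible approach, assumed here purely for illustration, is to rename the variables of each polynomial system and relabel its best ordering accordingly. The sketch uses a hypothetical representation: a polynomial is a dict from exponent tuples to coefficients, and a label is a tuple of variable indices giving the best ordering.

```python
from itertools import permutations

def rename_monomial(exponents, sigma):
    """Move the exponent of variable i to position sigma[i]."""
    renamed = [0] * len(exponents)
    for i, e in enumerate(exponents):
        renamed[sigma[i]] = e
    return tuple(renamed)

def rename_problem(polys, label, sigma):
    """Apply the renaming sigma to every polynomial and to the label."""
    new_polys = [
        {rename_monomial(m, sigma): c for m, c in poly.items()}
        for poly in polys
    ]
    # If variable v was best placed at some position, sigma[v] now is.
    new_label = tuple(sigma[v] for v in label)
    return new_polys, new_label

def augment(polys, label):
    """Return one renamed copy of the problem per permutation of variables."""
    n = len(label)
    return [rename_problem(polys, label, s) for s in permutations(range(n))]

# Toy problem (hypothetical): x0^2 - x1, with best ordering (x0, x1).
problem = [{(2, 0): 1, (0, 1): -1}]
copies = augment(problem, (0, 1))
labels = sorted(lab for _, lab in copies)
print(labels)
```

Because each renaming maps the original label to a different ordering, every problem contributes exactly one copy per label, so a dataset augmented this way is balanced by construction.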
Abstract
Symbolic Computation algorithms and their implementation in computer algebra systems often contain choices which do not affect the correctness of the output but can significantly impact the resources required: such choices can benefit from having them made separately for each problem via a machine learning model. This study reports lessons on such use of machine learning in symbolic computation, in particular on the importance of analysing datasets prior to machine learning and on the different machine learning paradigms that may be utilised. We present results for a particular case study, the selection of variable ordering for cylindrical algebraic decomposition, but expect that the lessons learned are applicable to other decisions in symbolic computation. We utilise an existing dataset of examples derived from applications which was found to be imbalanced with respect to the…
Taxonomy
Topics: Polynomial and algebraic computation · Machine Learning and Algorithms · Formal Methods in Verification
