TL;DR
ATPboost is a system that combines automated theorem proving with machine learning to improve premise selection by using binary classification with ATP feedback, leading to better performance than previous multilabel methods.
Contribution
It introduces a binary classification approach for premise relevance estimation in ATP, utilizing XGBoost and novel negative example selection methods.
Findings
ATPboost outperforms k-nearest neighbors multilabel classifier
Binary classification effectively estimates premise relevance
Negative example handling improves learning accuracy
Abstract
ATPboost is a system for solving sets of large-theory problems by interleaving ATP runs with state-of-the-art machine learning of premise selection from the proofs. Unlike many previous approaches that use multi-label setting, the learning is implemented as binary classification that estimates the pairwise-relevance of (theorem, premise) pairs. ATPboost uses for this the XGBoost gradient boosting algorithm, which is fast and has state-of-the-art performance on many tasks. Learning in the binary setting however requires negative examples, which is nontrivial due to many alternative proofs. We discuss and implement several solutions in the context of the ATP/ML feedback loop, and show that ATPboost with such methods significantly outperforms the k-nearest neighbors multilabel classifier.
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Code & Models
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
