Revisiting Variable Ordering for Real Quantifier Elimination using Machine Learning
John Hester, Briland Hitaj, Grant Passmore, Sam Owre, Natarajan, Shankar, Eric Yeh

TL;DR
This paper improves variable ordering for Cylindrical Algebraic Decomposition in formal verification by creating a more balanced dataset and evaluating machine learning models' generalizability.
Contribution
It introduces a new, less biased dataset for training ML models on variable ordering and assesses their performance and generalizability.
Findings
New dataset with over 41K challenges reduces bias.
Machine learning models show improved generalization on the new dataset.
Addressing dataset bias enhances the effectiveness of variable ordering strategies.
Abstract
Cylindrical Algebraic Decomposition (CAD) is a key proof technique for formal verification of cyber-physical systems. CAD is computationally expensive, with worst-case doubly-exponential complexity. Selecting an optimal variable ordering is paramount to efficient use of CAD. Prior work has demonstrated that machine learning can be useful in determining efficient variable orderings. Much of this work has been driven by CAD problems extracted from applications of the MetiTarski theorem prover. In this paper, we revisit this prior work and consider issues of bias in existing training and test data. We observe that the classical MetiTarski benchmarks are heavily biased towards particular variable orderings. To address this, we apply symmetries to create a new dataset containing more than 41K MetiTarski challenges designed to remove bias. Furthermore, we evaluate issues of information…
|
|
|
|
||||||||
| SVM | 69.38% | 58.88% | 28.9% | ||||||||
| k-NN | 75.27% | 57.36% | 32.21% | ||||||||
| DT | 75%% | 55.69% | 31.44% | ||||||||
| RF | 76.39% | 58.23% | 34.15% | ||||||||
| MLP | 58.81% | 53% | 33.64% |
|
|
|
|
||||||||
| SVM | 62.1% | 57.39% | 60.43% | ||||||||
| k-NN | 69.2% | 54.9% | 66.28% | ||||||||
| DT | 68.03% | 55.04% | 64.16% | ||||||||
| RF | 70.47% | 55.07% | 66.96% | ||||||||
| MLP | 50.51% | 49.62% | 48.19% |
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsFormal Methods in Verification · Software Testing and Debugging Techniques · Adversarial Robustness in Machine Learning
MethodsTest
11institutetext: Imandra Inc., Austin, TX 78704, USA
11email: {john,grant}@imandra.ai
22institutetext: SRI International, Menlo Park, CA 94025, USA 22email: {briland.hitaj,natarajan.shankar,sam.owre,eric.yeh}@sri.com
Revisiting Variable Ordering for Real Quantifier Elimination using Machine Learning
John Hester Authors ordered alphabetically.
This research was, in part, developed with funding from the Defense Advanced Research Projects Agency (DARPA). The views and conclusions contained in this document are those of the authors and should not be interpreted as representing the official policies, either expressed or implied, of the U.S. Government or DARPA.11
Briland Hitaj 22
Grant Passmore 11
Sam Owre 22
Natarajan Shankar 22
Eric Yeh 22
Abstract
Cylindrical Algebraic Decomposition (CAD) is a key proof technique for formal verification of cyber-physical systems. CAD is computationally expensive, with worst-case doubly-exponential complexity. Selecting an optimal variable ordering is paramount to efficient use of CAD. Prior work has demonstrated that machine learning can be useful in determining efficient variable orderings. Much of this work has been driven by CAD problems extracted from applications of the MetiTarski theorem prover. In this paper, we revisit this prior work and consider issues of bias in existing training and test data. We observe that the classical MetiTarski benchmarks are heavily biased towards particular variable orderings. To address this, we apply symmetries to create a new dataset containing more than 41K MetiTarski challenges designed to remove bias. Furthermore, we evaluate issues of information leakage, and test the generalizability of our models on the new dataset.
Keywords:
Variable Ordering Cylindrical Algebraic Decomposition Machine Learning Generalizability Bias.
1 Introduction
Cylindrical Algebraic Decomposition (CAD) is a key proof technique for formal verification of cyber-physical systems such as aircraft collision avoidance systems, autonomous vehicles and medical robotics. While CAD is a complete decision procedure, it is computationally expensive with worst-case exponential complexity. However, the expense of running CAD on a given problem can be greatly affected by the way in which the problem is posed. In particular, selecting an optimal variable ordering is paramount to efficient use of CAD. Prior work has demonstrated that machine learning (ML) may be fruitfully applied to determining efficient variable orderings [4, 7]. Much of this work has been driven by CAD problems extracted from applications of the MetiTarski theorem prover [1, 10].
Generalizability is a major concern for ML-trained systems [6, 8]. For example, in visual object classification systems, if objects to classify in training sets always occur in the upper left of an image, the classifier may fail if it encounters objects in the bottom right.
The identification of symmetries is a major factor in creating robust ML systems. Symmetries are transformations of data that preserve relevant semantics. Image based object classification provides many natural examples, where transformations such as mirror flipping or large random crops do not change the identity of the object in the image. These symmetries form the basis of data augmentation methods, where such transformations applied over existing training data can create additional training data. This augmentation can substantially improve the robustness of trained systems [5].
We argue that this problem also exists in applications to mathematical tasks such as CAD variable ordering. In this case, variables may be permuted over both statements and target orderings while preserving semantics. Performing an augmentation that debiases using formula symmetries is important for maintaining performance in a statistically trained ML system.
1.0.1 Contributions:
- •
We observe bias in the classical MetiTarski benchmarks often used in training and evaluating ML-based CAD variable ordering selection systems.
- •
We show how data augmentation along inherent symmetries can remove bias and improve generalizability, resulting in a new “debiased” benchmark set.
- •
We argue that this phenomenon is more general than CAD, and that debiasing using formula symmetries should be a standard tool for applications of ML in computer algebra, program verification, and other fields manipulating mathematical formulas.
2 The CAD Variable Ordering Problem
2.1 CAD-based Real Quantifier Elimination
Consider a system of nonlinear equations and inequalities over :
[TABLE]
CAD may be used to decide the satisfiability of this system, i.e., the truth of
[TABLE]
by (a) partitioning into finitely many connected regions called cells s.t. each is sign-invariant over each , and (b) evaluating at a sample point drawn from each cell. Given sign-invariance and the fact that is a covering (), it follows that is satisfiable over iff it evaluates to true on at least one such sample point. This approach can be readily extended to formulas involving quantifier alternation [3, 9].
2.2 Efficient Variable Ordering
CAD construction proceeds in stages, eliminating each variable in turn. The variable ordering is the order in which variables are eliminated. In the present work, we are concerned with determining an efficient variable ordering for a given problem. For example, consider a CAD induced by polynomials over . Depending on the variable ordering, the efficiency of CAD computation can vary substantially [2]. For example, with ordering a CAD is constructed in 5s with 3,373 cells, while with a CAD is constructed in 93s with 43,235 cells.
3 Experimental Setup
In this section, we provide details about the datasets that we have used in our experiments, and we introduce a new augmented dataset designed to remove bias. In addition to the datasets, we provide details on the features considered based on England et al. [4] and Huang et al. [7] respective works, followed by a discussion of our labeling strategy. We conclude the section with a discussion of the machine learning models considered together with their respective hyperparameter setup.
3.1 MetiTarski Datasets
3.1.1 Dataset 1 (Original):
The first dataset is predominantly gathered from MetiTarski, by logging MetiTarski’s RCF subproblem queries during its proof search [10, 4]. The dataset contains 6,895 polynomial systems, together with data on the performance of a CAD-based decision procedure on different orderings of their variables. Every problem in this data set has 3 variables () and thus 6 possible variable orderings.
3.1.2 Dataset 2 (Augmented):
We noted that the original MetiTarski dataset was highly imbalanced. While class [math], corresponding to the () variable order contained 580-records, class , corresponding to the () variable order contained 2,657-records, nearly 4-times more, Figure 1(a). We note that this may lead ML models to be biased towards certain label/s, thus preventing the models to learn relevant information and hinder their generalizability to new, previously unseen data samples.
We recognize that variables in the formula and ordering can be swapped without changing the time and cost needed to perform the computation. For instance, swapping variables with in the formulas and in the ordering leads to a CAD with the same time and cell cost. While this may seem apparent to a human or a machine reasoner that already has the ability to recognize this symmetry, for current machine learning systems it needs to be made explicit in the training data. This procedure resulted in a new augmented composed of 41,370 polynomial systems, Figure 1(b).
3.2 Feature Engineering and Labeling
3.2.1 Feature Engineering:
We process each set of polynomials extracting features enlisted in [4, 7], including the number of polynomials, maximum total degree of polynomials, maximum degree of each , and proportion of each appearing in polynomials and monomials.
3.2.2 Data Labeling:
In addition to the feature set, we assign to each polynomial problem a label ranging from , where each label translates to one of the -possible variable orderings. At present, the label for each polynomial problem corresponds to the variable ordering that takes the least amount of time.
3.3 Models
To evaluate our approach, we used -ML classifiers: 1) Support Vector Machines (SVM), 2) k-Nearest Neighbours (k-NN), 3) Decision Trees (DT), 4) Random Forests (RF), and 5) Multi-Layer Perceptrons (MLP). In their work, England et al. [4] rely on SVMs, k-NNs, DTs, and MLPs. By following a similar approach, we ensure that our strategy is comparable to the state-of-the-art and thus can be used as a foundation for future adoption of more complex ML strategies, such as Transformers [12] or Graph Neural Networks (GNNs) [11]. We used the scikit-learn111https://scikit-learn.org/stable/-based implementations of the ML algorithms. Similar to the works of England et al. [4] and Huang et al. [7], we employed a grid-search strategy with 5-fold cross-validation to identify the right parameter setup for each of the models.
4 Evaluation
In this section, we proceed with the evaluation of the performance of the selected machine learning models (cf. Section 3.3) on identifying the preferred (best) variable order for a given input problem. We transform the problem of determining the best variable order into a multi-class classification problem. In these kind of problems the training set is composed of tuples of data, where is an input sample and is the corresponding label for that sample. The goal of the learning process translates into the task of finding a function , such that .
In our setting, the input data corresponds to the series of 11-features (cf. Section 3.2) whereas is one of the 6-labels from belonging to a preferred variable order from . The features were scaled by subtracting the mean and then scaling to unit variance222https://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.StandardScaler.html.
4.0.1 Training:
For each of the datasets, we used of the data for training and kept the remaining as part of the testing set. For Dataset 1 (original), this corresponded to samples for training and for testing, whereas for Dataset 2 (augmented), samples were used during training and the remaining samples for testing. Tables 1 and 2 provide accuracy of each model obtained on the respective training set.
4.0.2 Testing:
We test each of the trained models on 20% of the respective datasets, i.e., on samples for models trained on Dataset 1 and samples for models trained on Dataset 2.
The performance of models trained on Dataset 1 varied from 53% for the MLP model up to 58.88% for the SVM, Table 1, these results being in-line with state-of-the-art work by England et al. [4] and Huang et al. [7]. Likewise, the models trained on the augmented Dataset 2 exhibit similar performance with the MLP architecture performing poorly with 49.62% accuracy and SVM obtaining up to 57.39% accuracy on the testing set, Table 2.
4.0.3 Evaluation on respective datasets:
Indeed, the performance of the models on the respective testing sets is quite promising and substantially better than random choice (). However, we observe that the original version of the collected MetiTarski data (Dataset 1) is highly imbalanced (cf. Section 3.1). Prior research in domains such as computer vision and natural language processing has emphasized the negative impacts of an imbalanced dataset, demonstrating that the accuracy exhibited by models trained on imbalanced data could be deceiving and hinder the capability of the model to learn meaningful information from the data. In most cases the model will exhibit a bias towards the label (class) which it has seen the most in the training set.
As such, it is interesting to investigate the performance of models trained on Dataset 1 (the original MetiTarski dataset) and the newly produced Dataset 2, with the latter being a superset of Dataset 1. As can be noticed in Table 1, there is a significant drop in classification accuracy for all the models trained on Dataset 1, with more than 25% drop in some cases. Models trained on the “debiased” Dataset 2 retain a good performance when evaluated on Dataset 1. We believe this is due to the training data being balanced and the model potentially seeing some of these samples during training, thus increasing its decision confidence.
5 Discussion and Future Work
We have re-examined a classical dataset for ML-driven CAD variable ordering and observed issues of bias. To address this, we have applied symmetry-based data augmentation to create a debiased version of the dataset333https://github.com/coproof/variable-ordering-revisited and have shown this improves generalizability. We believe that this phenomenon is quite general, and that debiasing using formula symmetries should be a standard tool for applications of ML in computer algebra, program verification, and other fields manipulating mathematical formulas. We plan to extend this work into general-purpose tools for ML-based variable ordering and apply it to many problem domains (e.g., Gröbner bases, BDDs, SAT, etc.).
5.0.1 Acknowledgements
This material is based upon work supported by the Defense Advanced Research Projects Agency (DARPA) under Contract No. HR00112290064. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the United States Government or DARPA.
The reference list from the paper itself. Each links out to its DOI / PubMed record.
- 1[1] Akbarpour, B., Paulson, L.C.: Meti Tarski: An automatic theorem prover for real-valued special functions. Journal of Automated Reasoning 44 , 175–205 (2010)
- 2[2] Chen, C., Zhu, Z., Chi, H.: Variable ordering selection for cylindrical algebraic decomposition with artificial neural networks. In: Bigatti, A.M., Carette, J., Davenport, J.H., Joswig, M., de Wolff, T. (eds.) Mathematical Software – ICMS 2020. pp. 281–291. Springer International Publishing, Cham (2020)
- 3[3] Collins, G.E.: Quantifier elimination for real closed fields by cylindrical algebraic decompostion. In: Brakhage, H. (ed.) Automata Theory and Formal Languages. pp. 134–183. Springer Berlin Heidelberg, Berlin, Heidelberg (1975)
- 4[4] England, M., Florescu, D.: Comparing machine learning models to choose the variable ordering for cylindrical algebraic decomposition. In: International Conference on Intelligent Computer Mathematics. pp. 93–108. Springer (2019)
- 5[5] Geirhos, R., Rubisch, P., Michaelis, C., Bethge, M., Wichmann, F.A., Brendel, W.: Imagenet-trained CN Ns are biased towards texture; increasing shape bias improves accuracy and robustness. In: International Conference on Learning Representations (2019), https://openreview.net/forum?id=Bygh 9j 09KX
- 6[6] Geirhos, R., Temme, C.R., Rauber, J., Schütt, H.H., Bethge, M., Wichmann, F.A.: Generalisation in humans and deep neural networks. Advances in neural information processing systems 31 (2018)
- 7[7] Huang, Z., England, M., Wilson, D.J., Bridge, J., Davenport, J.H., Paulson, L.C.: Using machine learning to improve cylindrical algebraic decomposition. Mathematics in Computer Science 13 (4), 461–488 (2019)
- 8[8] Kawaguchi, K., Kaelbling, L.P., Bengio, Y.: Generalization in deep learning. ar Xiv preprint ar Xiv:1710.05468 (2017)
