Approximate Optimal Active Learning of Decision Trees
Zunchen Huang, Chenglu Jin

TL;DR
This paper introduces a SAT-based symbolic approach for active learning of binary decision trees, using approximate model counting to efficiently select near-optimal queries and reliably converge with few queries.
Contribution
It presents a novel symbolic method that encodes the entire hypothesis space and employs approximate model counting for active learning, avoiding heuristics and enumeration.
Findings
Reliable convergence to correct models with few queries
Scalable and sound approach using approximate model counting
Effective verification when model counts stagnate
Abstract
We consider the problem of actively learning an unknown binary decision tree using only membership queries, a setting in which the learner must reason about a large hypothesis space while maintaining formal guarantees. Rather than enumerating candidate trees or relying on heuristic impurity or entropy measures, we encode the entire space of bounded-depth decision trees symbolically in SAT formulas. We propose a symbolic method for active learning of decision trees, in which approximate model counting is used to estimate the reduction of the hypothesis space caused by each potential query, enabling near-optimal query selection without full model enumeration. The resulting learner incrementally strengthens a CNF representation based on observed query outcomes, and the approximate model counter ApproxMC is invoked to quantify the remaining version space in a sound and scalable manner.…
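The query-selection idea in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: it replaces the SAT encoding and ApproxMC with exact brute-force enumeration over a tiny space of bounded-depth trees, which is exactly what the paper avoids at scale. The min-max selection rule (pick the query whose worst-case answer leaves the fewest hypotheses) is an assumed reading of "near-optimal", and all names (`enumerate_trees`, `evaluate`, `best_query`, `learn`) are hypothetical. The sketch also assumes the target tree lies inside the enumerated hypothesis space.

```python
from itertools import product

def enumerate_trees(num_features, depth):
    """All trees of depth <= `depth`: a leaf is 0 or 1, a node is (feature, low, high)."""
    trees = [0, 1]
    if depth > 0:
        subtrees = enumerate_trees(num_features, depth - 1)
        trees += [(f, lo, hi) for f in range(num_features)
                  for lo in subtrees for hi in subtrees]
    return trees

def evaluate(tree, x):
    """Follow the `high` branch when the tested feature is 1, else the `low` branch."""
    while isinstance(tree, tuple):
        feature, low, high = tree
        tree = high if x[feature] else low
    return tree

def best_query(version_space, inputs):
    """Return the input whose worst-case answer leaves the fewest hypotheses.

    Exact counting stands in for approximate model counting here.
    Returns None when no input splits the version space.
    """
    best, best_worst = None, len(version_space)
    for x in inputs:
        ones = sum(evaluate(t, x) for t in version_space)
        worst = max(ones, len(version_space) - ones)
        if worst < best_worst:
            best, best_worst = x, worst
    return best

def learn(oracle, num_features=3, depth=2):
    """Ask membership queries until all remaining hypotheses agree on every input."""
    inputs = list(product([0, 1], repeat=num_features))
    version_space = enumerate_trees(num_features, depth)
    queries = 0
    while (x := best_query(version_space, inputs)) is not None:
        y = oracle(x)  # membership query
        version_space = [t for t in version_space if evaluate(t, x) == y]
        queries += 1
    return version_space[0], queries

# Hypothetical target: the tree computing x0 OR x1.
target = (0, (1, 0, 1), 1)
learned, queries = learn(lambda x: evaluate(target, x))
```

The loop stops once no query can split the version space, meaning every surviving tree computes the same function even though many syntactically different trees may remain. In the symbolic setting the filtering step would correspond to adding clauses to the CNF, and the two counts per candidate query would come from the model counter rather than from enumeration.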
Taxonomy
Topics: Machine Learning and Algorithms · Explainable Artificial Intelligence (XAI) · Adversarial Robustness in Machine Learning
