Approximate Optimal Active Learning of Decision Trees
By: Zunchen Huang, Chenglu Jin
Potential Business Impact:
Teaches computers to learn rules faster.
We consider the problem of actively learning an unknown binary decision tree using only membership queries, a setting in which the learner must reason about a large hypothesis space while maintaining formal guarantees. Rather than enumerating candidate trees or relying on heuristic impurity or entropy measures, we encode the entire space of bounded-depth decision trees symbolically as SAT formulas. We propose a symbolic method for active learning of decision trees in which approximate model counting estimates how much each potential query would shrink the hypothesis space, enabling near-optimal query selection without full model enumeration. The resulting learner incrementally strengthens a CNF representation based on observed query outcomes, and the approximate model counter ApproxMC is invoked to quantify the remaining version space in a sound and scalable manner. Additionally, when the count reported by ApproxMC stagnates, a functional-equivalence check verifies that all remaining hypotheses are functionally identical. Experiments show that the method reliably converges to the correct model using only a handful of queries, while retaining a rigorous SAT-based foundation suitable for formal analysis and verification.
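To make the counting-guided query-selection rule concrete, here is a minimal, dependency-free Python sketch. Everything specific in it is illustrative, not from the paper: the hypothesis class is a toy one (monotone conjunctions over four Boolean features) rather than the paper's CNF encoding of bounded-depth decision trees, and an exact brute-force count stands in for ApproxMC's approximate count. The selection rule itself is the idea the abstract describes: ask the membership query whose worst-case answer leaves the smallest estimated version space.

```python
# Sketch (assumptions: toy hypothesis class, exact counting in place of ApproxMC).
import itertools

N = 4  # number of Boolean features

# A hypothesis is a frozenset S of feature indices; h_S(x) = AND of x[i] for i in S.
ALL_HYPOTHESES = [frozenset(s)
                  for r in range(N + 1)
                  for s in itertools.combinations(range(N), r)]

def evaluate(hyp, x):
    """Evaluate the conjunction hyp on input x (a 0/1 tuple)."""
    return int(all(x[i] for i in hyp))

def count_consistent(version_space, x, label):
    """Stand-in for ApproxMC: size of the version space after observing h(x) = label."""
    return sum(1 for h in version_space if evaluate(h, x) == label)

def best_query(version_space, candidates):
    """Near-optimal selection: minimize the worst-case remaining version space."""
    return min(candidates,
               key=lambda x: max(count_consistent(version_space, x, b) for b in (0, 1)))

# Hypothetical hidden target: the conjunction of features 0 and 2.
TARGET = frozenset({0, 2})

def oracle(x):
    """Membership oracle answering queries for the hidden target."""
    return evaluate(TARGET, x)

version_space = list(ALL_HYPOTHESES)
candidates = list(itertools.product((0, 1), repeat=N))
while len(version_space) > 1:
    x = best_query(version_space, candidates)
    label = oracle(x)  # one membership query
    version_space = [h for h in version_space if evaluate(h, x) == label]
    print(f"query {x} -> {label}, {len(version_space)} hypotheses remain")

print("learned:", sorted(version_space[0]))
```

In the paper's setting the version space is never enumerated like this: the counts come from ApproxMC run on the incrementally strengthened CNF, and because syntactically distinct trees can compute the same function, the stopping test is the functional-equivalence check rather than reaching a count of exactly one.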
Similar Papers
Exploring the Design Space of Fair Tree Learning Algorithms
Machine Learning (CS)
Makes computer decisions fair for everyone.
Discovering interpretable piecewise nonlinear model predictive control laws via symbolic decision trees
Systems and Control
Teaches robots to make smart decisions.
Learning-Augmented Algorithms for Boolean Satisfiability
Data Structures and Algorithms
Helps computers solve hard problems faster with hints.