Leveraging Predictive Equivalence in Decision Trees
By: Hayden McTavish, Zachery Boner, Jon Donnelly, and more
Potential Business Impact:
Makes computer decisions easier to understand and more reliable, even when some data is missing.
Decision trees are widely used for interpretable machine learning due to their clearly structured reasoning process. However, this structure belies a challenge we refer to as predictive equivalence: a given tree's decision boundary can be represented by many different decision trees. The presence of models with identical decision boundaries but different evaluation processes makes model selection challenging. The models will have different variable importance and behave differently in the presence of missing values, but most optimization procedures will arbitrarily choose one such model to return. We present a Boolean logical representation of decision trees that does not exhibit predictive equivalence and is faithful to the underlying decision boundary. We apply our representation to several downstream machine learning tasks. Using our representation, we show that decision trees are surprisingly robust to test-time missingness of feature values; we address predictive equivalence's impact on quantifying variable importance; and we present an algorithm to optimize the cost of reaching predictions.
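To make the idea concrete, below is a minimal, hypothetical Python sketch (not the authors' code). It shows two structurally different trees that share the same decision boundary ("x1 OR x2"), plus a simple Boolean (DNF) view of that boundary that can sometimes determine the prediction even when a feature value is missing. The feature names, tree encodings, and the predict_dnf_with_missing helper are illustrative assumptions, not the paper's implementation.

MISSING = None

def tree_a(x):
    # Tree A: split on x1 first, then x2. Predicts the boundary "x1 OR x2".
    if x["x1"] == 1:
        return 1
    return 1 if x["x2"] == 1 else 0

def tree_b(x):
    # Tree B: split on x2 first, then x1. Same boundary, different structure.
    if x["x2"] == 1:
        return 1
    return 1 if x["x1"] == 1 else 0

# One Boolean (DNF) representation of the shared boundary:
# predict 1 iff some clause has all of its literals satisfied.
DNF = [{"x1": 1}, {"x2": 1}]  # clause "x1 = 1" OR clause "x2 = 1"

def predict_dnf_with_missing(x, dnf):
    """Return 1, 0, or MISSING depending on whether the observed
    feature values already determine the prediction."""
    # 1 if some clause is fully satisfied by observed values.
    if any(all(x.get(f) == v for f, v in clause.items()) for clause in dnf):
        return 1
    # 0 if every clause is already falsified by an observed value.
    if all(any(x.get(f) is not MISSING and x.get(f) != v for f, v in clause.items())
           for clause in dnf):
        return 0
    return MISSING  # genuinely undetermined given the missing values

if __name__ == "__main__":
    # Predictive equivalence: identical outputs on every complete input.
    for x in ({"x1": 0, "x2": 0}, {"x1": 0, "x2": 1},
              {"x1": 1, "x2": 0}, {"x1": 1, "x2": 1}):
        assert tree_a(x) == tree_b(x) == predict_dnf_with_missing(x, DNF)

    # With x1 missing, Tree A stalls at its root split, yet the observed
    # value x2 = 1 already determines the prediction on the shared boundary.
    partial = {"x1": MISSING, "x2": 1}
    print(predict_dnf_with_missing(partial, DNF))  # -> 1

Because the DNF depends only on the decision boundary, not on the order of splits, it gives one representation per boundary; this is the intuition behind why such a representation sidesteps predictive equivalence and can be more robust to test-time missingness.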
Similar Papers
Efficient & Correct Predictive Equivalence for Decision Trees
Artificial Intelligence
Checks whether decision trees make the same predictions, faster and more accurately.
Experiments with Optimal Model Trees
Machine Learning (CS)
Makes computer predictions more accurate with simpler trees.