Locally Pareto-Optimal Interpretations for Black-Box Machine Learning Models
By: Aniruddha Joshi, Supratik Chakraborty, S Akshay and more
Potential Business Impact:
Makes AI decisions understandable and accurate.
Creating meaningful interpretations for black-box machine learning models involves balancing two often conflicting objectives: accuracy and explainability. Exploring the trade-off between these objectives is essential for developing trustworthy interpretations. While many techniques for multi-objective interpretation synthesis have been developed, they typically lack formal guarantees on the Pareto-optimality of the results. Methods that do provide such guarantees, on the other hand, often face severe scalability limitations when exploring the Pareto-optimal space. To address this, we develop a framework based on local optimality guarantees that enables more scalable synthesis of interpretations. Specifically, we consider the problem of synthesizing a set of Pareto-optimal interpretations whose optimality is guaranteed locally, that is, within the immediate neighborhood of each solution. Our approach begins with a multi-objective learning or search technique, such as Multi-Objective Monte Carlo Tree Search, to generate a best-effort set of Pareto-optimal candidates with respect to accuracy and explainability. We then encode the local optimality check for each candidate as a Boolean satisfiability problem, which we solve using a SAT solver. We demonstrate the efficacy of our approach on a set of benchmarks, comparing it against previous methods for exploring the Pareto-optimal front of interpretations. In particular, we show that our approach yields interpretations that closely match those synthesized by methods offering global guarantees.
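To make the two notions in the abstract concrete, the following is a minimal Python sketch of Pareto dominance over (accuracy, explainability) scores and of a local optimality check within a candidate's neighborhood. It is an illustration under stated assumptions, not the authors' method: the paper encodes the local check as a Boolean satisfiability problem and solves it with a SAT solver, whereas this sketch simply enumerates neighbors by brute force. The names `dominates`, `pareto_front`, and `is_locally_pareto_optimal`, the `neighbors` and `score` callbacks, and the convention that both scores are "higher is better" are all hypothetical.

```python
from typing import Callable, Iterable, List, Tuple, TypeVar

# Assumed convention: a score is (accuracy, explainability), both higher-is-better.
Score = Tuple[float, float]
C = TypeVar("C")  # type of a candidate interpretation (hypothetical)


def dominates(a: Score, b: Score) -> bool:
    """True if a is at least as good as b on every objective and strictly better on one."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_front(scores: List[Score]) -> List[Score]:
    """Keep only the scores that no other score in the list dominates."""
    return [s for s in scores if not any(dominates(t, s) for t in scores)]


def is_locally_pareto_optimal(
    candidate: C,
    neighbors: Callable[[C], Iterable[C]],
    score: Callable[[C], Score],
) -> bool:
    """Brute-force stand-in for the SAT-based check: the candidate is locally
    Pareto-optimal if no interpretation in its neighborhood dominates it."""
    s = score(candidate)
    return not any(dominates(score(n), s) for n in neighbors(candidate))
```

As a usage example, given a toy score table `{0: (0.5, 3.0), 1: (0.7, 2.0), 2: (0.6, 1.0)}` in which each candidate's neighbors are the adjacent integer keys, candidate `1` passes the local check, while candidate `2` fails it because its neighbor `1` is better on both objectives. Enumeration only works for small neighborhoods, which is one reason a solver-based encoding is attractive at scale.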
Similar Papers
Robust Counterfactual Explanations under Model Multiplicity Using Multi-Objective Optimization
Machine Learning (CS)
Makes AI decisions more trustworthy and fair.
Mixtures of Transparent Local Models
Machine Learning (CS)
Makes computer decisions easy to understand.
In-Context Multi-Objective Optimization
Machine Learning (CS)
Finds best solutions faster for complex problems.