Bayesian neural networks with interpretable priors from Mercer kernels
By: Alex Alberts, Ilias Bilionis
Potential Business Impact:
Makes smart computers understand when they're unsure.
Quantifying the uncertainty in the output of a neural network is essential for deployment in scientific or engineering applications where decisions must be made under limited or noisy data. Bayesian neural networks (BNNs) provide a framework for this purpose by constructing a Bayesian posterior distribution over the network parameters. However, the prior, which is of key importance in any Bayesian setting, is rarely meaningful for BNNs. This is because the complexity of the input-to-output map of a BNN makes it difficult to understand how certain distributions enforce any interpretable constraint on the output space. Gaussian processes (GPs), on the other hand, are often preferred in uncertainty quantification tasks due to their interpretability. The drawback is that GPs are limited to small datasets without advanced techniques, which often rely on the covariance kernel having a specific structure. To address these challenges, we introduce a new class of priors for BNNs, called Mercer priors, such that the resulting BNN has samples which approximate that of a specified GP. The method works by defining a prior directly over the network parameters from the Mercer representation of the covariance kernel, and does not rely on the network having a specific structure. In doing so, we can exploit the scalability of BNNs in a meaningful Bayesian way.
Similar Papers
Hi-fi functional priors by learning activations
Machine Learning (CS)
Makes smart computers learn better with new brain parts.
Infinite Neural Operators: Gaussian processes on functions
Machine Learning (Stat)
Makes AI better at guessing answers and learning.
Bayesian Neural Networks vs. Mixture Density Networks: Theoretical and Empirical Insights for Uncertainty-Aware Nonlinear Modeling
Computation
Helps computers guess better when they're unsure.