Total Robustness in Bayesian Nonlinear Regression for Measurement Error Problems under Model Misspecification
By: Mengqi Chen, Charita Dellaporta, Thomas B. Berrett, and more
Potential Business Impact:
Makes computer predictions more accurate with messy data.
Modern regression analyses are often undermined by covariate measurement error, misspecification of the regression model, and misspecification of the measurement error distribution. We present, to the best of our knowledge, the first Bayesian nonparametric framework targeting total robustness that tackles all three challenges in general nonlinear regression. The framework assigns a Dirichlet process prior to the latent covariate-response distribution and updates it with posterior pseudo-samples of the latent covariates, thereby providing the Dirichlet process posterior with observation-informed latent inputs and yielding estimators that minimise the discrepancy between Dirichlet process realisations and the model-induced joint law. This design allows practitioners to (i) encode prior beliefs, (ii) choose between pseudo-sampling latent covariates or working directly with error-prone observations, and (iii) tune the relative influence of the prior and the data. We establish generalisation bounds that tighten whenever the prior or pseudo-sample generator aligns with the underlying data-generating process, ensuring robustness without sacrificing consistency. A gradient-based algorithm enables efficient computation; simulations and two real-world studies show lower estimation error and reduced sensitivity to misspecification compared with Bayesian and frequentist competitors. The framework therefore offers a practical and interpretable paradigm for trustworthy regression when data and models are jointly imperfect.
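To make the pipeline described in the abstract concrete, the sketch below illustrates one plausible reading of it, not the authors' implementation: latent covariates are pseudo-sampled around the error-prone observations with an assumed Gaussian plug-in, Dirichlet (Bayesian-bootstrap) weights stand in for Dirichlet process posterior realisations, and a weighted squared-error criterion is used in place of the paper's discrepancy. The nonlinear model f(x; theta) = a*tanh(b*x), the noise levels, and all function names are illustrative assumptions.

```python
# Hedged sketch (not the paper's code): DP-style weighted fits over
# pseudo-sampled latent covariates, minimised by gradient descent.
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data: latent covariate x, error-prone observation w, response y.
n = 200
x_true = rng.uniform(-2.0, 2.0, n)
w_obs = x_true + rng.normal(0.0, 0.3, n)                  # covariate measurement error
y_obs = 1.5 * np.tanh(2.0 * x_true) + rng.normal(0.0, 0.2, n)

def model(x, theta):
    """Assumed nonlinear regression function f(x; theta) = a * tanh(b * x)."""
    a, b = theta
    return a * np.tanh(b * x)

def grad_theta(x, y, theta, weights):
    """Gradient of the weighted squared-error discrepancy w.r.t. theta."""
    a, b = theta
    t = np.tanh(b * x)
    resid = a * t - y
    g_a = 2.0 * np.sum(weights * resid * t)
    g_b = 2.0 * np.sum(weights * resid * a * x * (1.0 - t ** 2))
    return np.array([g_a, g_b])

def dp_posterior_fit(w, y, n_draws=50, sigma_u=0.3, lr=0.05, steps=300):
    """For each draw: pseudo-sample latent covariates around the observations,
    draw Dirichlet (Bayesian-bootstrap) weights as a DP-posterior surrogate,
    and minimise the weighted discrepancy by gradient descent."""
    draws = []
    for _ in range(n_draws):
        x_tilde = w + rng.normal(0.0, sigma_u, w.shape[0])  # crude pseudo-sampler
        weights = rng.dirichlet(np.ones(w.shape[0]))        # DP realisation weights
        theta = np.array([1.0, 1.0])
        for _ in range(steps):
            theta -= lr * grad_theta(x_tilde, y, theta, weights)
        draws.append(theta)
    return np.array(draws)

draws = dp_posterior_fit(w_obs, y_obs)
print("posterior mean of (a, b):", draws.mean(axis=0))
print("posterior std  of (a, b):", draws.std(axis=0))
```

The spread of the returned draws gives a rough uncertainty summary for theta; the paper's actual discrepancy, pseudo-sampler, and tuning of prior versus data influence differ from this simplified surrogate.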
Similar Papers
Total Robustness in Bayesian Nonlinear Regression for Measurement Error Problems under Model Misspecification
Methodology
Makes computer predictions more trustworthy with bad data.
From Partial Exchangeability to Predictive Probability: A Bayesian Perspective on Classification
Methodology
Helps computers guess better with less data.
On Misspecified Error Distributions in Bayesian Functional Clustering: Consequences and Remedies
Methodology
Finds hidden groups in data better.