Improved Bounds for Context-Dependent Evolutionary Models Using Sequential Monte Carlo
By: Joseph Mathews, Scott C. Schmidler
Potential Business Impact:
Helps scientists understand how life changes over time.
Statistical inference in evolutionary models with site-dependence is a long-standing challenge in phylogenetics and computational biology. We consider the problem of approximating marginal sequence likelihoods under dependent-site models of biological sequence evolution. We prove a polynomial mixing time bound for a Markov chain Monte Carlo (MCMC) algorithm that samples the conditional distribution over latent sample paths, when the chain is initialized with a warm start. We then introduce a sequential Monte Carlo (SMC) algorithm for approximating the marginal likelihood, and show that our mixing time bound can be combined with recent importance sampling and finite-sample SMC results to obtain bounds on the finite-sample approximation error of the resulting estimator. Our results show that the proposed SMC algorithm yields an efficient randomized approximation scheme for many practical problems of interest, and offers a significant improvement over a recently developed importance sampler for this problem. Our approach combines recent innovations in obtaining bounds for MCMC and SMC samplers, and may prove applicable to other problems of approximating marginal likelihoods and Bayes factors.
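To make the idea of estimating a marginal likelihood with SMC concrete, here is a minimal sketch of a generic tempered SMC sampler on a toy one-dimensional Gaussian model. It is not the algorithm analyzed in the paper: the toy model, the linear temperature schedule, the random-walk Metropolis move, and all names (`smc_log_marginal`, `log_prior`, `log_lik`, `exact_log_marginal`) are assumptions chosen only for illustration. The toy model is used because its marginal likelihood has a closed form to compare against.

```python
import math
import random

def log_prior(x):
    # Toy assumption: latent x ~ N(0, 1).
    return -0.5 * x * x - 0.5 * math.log(2 * math.pi)

def log_lik(x, y, s):
    # Toy assumption: observation y | x ~ N(x, s^2).
    return -0.5 * ((y - x) / s) ** 2 - math.log(s) - 0.5 * math.log(2 * math.pi)

def exact_log_marginal(y, s):
    # Closed form for the toy model: y ~ N(0, 1 + s^2).
    var = 1.0 + s * s
    return -0.5 * math.log(2 * math.pi * var) - 0.5 * y * y / var

def smc_log_marginal(y, s, n_particles=2000, n_temps=20, n_mcmc=2, seed=0):
    """Tempered SMC estimate of log p(y) for the toy model (illustrative only)."""
    rng = random.Random(seed)
    betas = [t / n_temps for t in range(n_temps + 1)]  # linear schedule from 0 to 1
    xs = [rng.gauss(0.0, 1.0) for _ in range(n_particles)]  # start from the prior
    log_z = 0.0
    for b_prev, b in zip(betas, betas[1:]):
        # Incremental importance weights between successive tempered targets.
        logw = [(b - b_prev) * log_lik(x, y, s) for x in xs]
        m = max(logw)
        w = [math.exp(lw - m) for lw in logw]
        # Accumulate the log of the average incremental weight.
        log_z += m + math.log(sum(w) / n_particles)
        # Multinomial resampling proportional to the weights.
        xs = rng.choices(xs, weights=w, k=n_particles)
        # Random-walk Metropolis moves targeting prior(x) * lik(x)^b.
        for _ in range(n_mcmc):
            moved = []
            for x in xs:
                prop = x + rng.gauss(0.0, 0.5)
                log_acc = (log_prior(prop) + b * log_lik(prop, y, s)
                           - log_prior(x) - b * log_lik(x, y, s))
                accept = rng.random() < math.exp(min(0.0, log_acc))
                moved.append(prop if accept else x)
            xs = moved
    return log_z
```

Calling `smc_log_marginal(1.5, 0.5)` and comparing the result with `exact_log_marginal(1.5, 0.5)` gives a rough sense of the estimator's finite-sample error on this toy problem. In a sketch like this, how well the MCMC move mixes at each temperature affects how accurate the estimate is for a given number of particles.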
Similar Papers
Importance Sampling Approximation of Sequence Evolution Models with Site-Dependence
Computation
Helps scientists track how life changes over time.
Reinforced sequential Monte Carlo for amortised sampling
Machine Learning (CS)
Helps computers learn complex patterns faster.
Generative diffusion posterior sampling for informative likelihoods
Machine Learning (Stat)
Makes AI create better pictures from less data.