Inhomogeneous continuous-time Markov chains to infer flexible time-varying evolutionary rates
By: Pratyusa Datta, Philippe Lemey, Marc A. Suchard
Potential Business Impact:
Tracks how fast diseases change over time.
Reconstructing evolutionary histories and estimating the rate of evolution from molecular sequence data are of central importance in evolutionary biology and infectious disease research. We introduce a flexible Bayesian phylogenetic inference framework that accommodates evolutionary rates that change over time by modeling sequence character substitution processes as inhomogeneous continuous-time Markov chains (ICTMCs) acting along the unknown phylogeny, where the rate is an unknown, positive and integrable function of time. The integral of the rate function appears in the finite-time transition probabilities of the ICTMCs, which must be computed efficiently for all branches of the phylogeny to evaluate the observed data likelihood. To circumvent the computational challenges that arise from a fully nonparametric function, we parameterize the rate function as piecewise constant with a large number of epochs, a model we call the polyepoch clock model. This makes the transition probability computation relatively inexpensive while still flexibly capturing rate change over time. We employ a Gaussian Markov random field prior to achieve temporal smoothing of the estimated rate function. Hamiltonian Monte Carlo sampling, enabled by scalable gradient evaluation under this model, makes our framework computationally efficient. We assess the performance of the polyepoch clock model in recovering the true timescales and rates through simulations under two different evolutionary scenarios. We then apply the polyepoch clock model to examine the rates of West Nile virus, dengue virus and influenza A/H3N2 evolution, and to estimate the time-varying rate of SARS-CoV-2 spread in Europe in 2020.
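To illustrate why a piecewise-constant rate keeps the transition probabilities cheap, here is a minimal Python sketch, not the authors' implementation. It integrates a piecewise-constant rate over one branch and uses the result as the effective branch length. The function names, the change-point layout, and the choice of the Jukes-Cantor (JC69) substitution model, which has a closed-form transition matrix, are assumptions made for the example; the abstract does not specify them.

```python
import numpy as np

def integrated_rate(boundaries, rates, start, end):
    """Integral of a piecewise-constant rate function over [start, end].

    boundaries: increasing epoch change points, length K - 1 (assumed layout)
    rates:      positive rate in each of the K epochs
    """
    # Pad with infinite outer edges so the first and last epochs are unbounded.
    edges = np.concatenate(([-np.inf], np.asarray(boundaries, dtype=float), [np.inf]))
    lower = np.maximum(edges[:-1], start)
    upper = np.minimum(edges[1:], end)
    # Length of the branch that falls inside each epoch (zero if no overlap).
    overlap = np.clip(upper - lower, 0.0, None)
    return float(np.dot(rates, overlap))

def jc69_transition_matrix(distance):
    """JC69 transition probabilities for an expected `distance` substitutions per site.

    JC69 is a stand-in chosen for its closed form; it is an assumption of this sketch.
    """
    decay = np.exp(-4.0 * distance / 3.0)
    probs = np.full((4, 4), 0.25 - 0.25 * decay)
    np.fill_diagonal(probs, 0.25 + 0.75 * decay)
    return probs

# Hypothetical three-epoch clock with change points at times 1.0 and 2.0.
boundaries = [1.0, 2.0]
rates = np.array([0.5, 2.0, 1.0])

# A branch spanning times 0.5 to 2.5 crosses all three epochs.
distance = integrated_rate(boundaries, rates, 0.5, 2.5)
transition = jc69_transition_matrix(distance)
```

For this branch the integral is a sum of three rate-times-overlap terms, so the cost per branch grows only with the number of epochs it crosses. A general substitution model would replace the closed form with a matrix exponential of the rate matrix scaled by the same integral.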