Identifiability of Large Phylogenetic Mixtures for Many Phylogenetic Model Structures

Published: August 7, 2025 | arXiv ID: 2508.05832v1

By: Bryson Kagy, Seth Sullivant

Potential Business Impact:

Helps scientists understand DNA changes better.

Identifiability of a phylogenetic model is a necessary condition for its parameters to be uniquely determined from data. Mixture models are phylogenetic models whose probability distributions are convex combinations of distributions from simpler phylogenetic models. They are used to model heterogeneity in the substitution process in DNA sequences. While many basic phylogenetic models are known to be identifiable, mixture models in general have been shown to be identifiable only in certain cases. We extend the main theorem of [Rhodes, Sullivant 2012] to prove identifiability of mixture models in equivariant phylogenetic models, specifically the Jukes-Cantor, Kimura 2-parameter, Kimura 3-parameter, and Strand Symmetric models.
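
To make the phrase "convex combinations of distributions" concrete, here is a minimal sketch in Python. It is an illustration only and is not taken from the paper: the three-leaf star tree, the uniform root distribution, the branch lengths, and the mixing weights 0.7 and 0.3 are all assumptions chosen for the example, and the helper names (`jc_transition`, `star_tree_distribution`, `mixture`) are hypothetical. It builds two Jukes-Cantor site-pattern distributions and mixes them.

```python
import math
from itertools import product

STATES = "ACGT"


def jc_transition(t):
    """Jukes-Cantor transition matrix for a branch of length t
    (expected substitutions per site)."""
    decay = math.exp(-4.0 * t / 3.0)
    same = 0.25 + 0.75 * decay   # probability the state is unchanged
    diff = 0.25 - 0.25 * decay   # probability of each specific change
    return [[same if i == j else diff for j in range(4)] for i in range(4)]


def star_tree_distribution(branch_lengths):
    """Joint distribution of leaf states on a star tree.

    Assumption for this sketch: one internal (root) node with a uniform
    distribution, and one Jukes-Cantor branch per leaf.
    """
    mats = [jc_transition(t) for t in branch_lengths]
    dist = {}
    for pattern in product(range(4), repeat=len(mats)):
        total = 0.0
        for root in range(4):  # marginalize over the unobserved root state
            term = 0.25
            for mat, leaf in zip(mats, pattern):
                term *= mat[root][leaf]
            total += term
        dist["".join(STATES[s] for s in pattern)] = total
    return dist


def mixture(weights, dists):
    """Convex combination of site-pattern distributions."""
    if any(w < 0 for w in weights) or abs(sum(weights) - 1.0) > 1e-12:
        raise ValueError("weights must be nonnegative and sum to 1")
    return {k: sum(w * d[k] for w, d in zip(weights, dists)) for k in dists[0]}


# Illustrative parameters (assumed): a slowly and a rapidly evolving class.
slow = star_tree_distribution([0.05, 0.05, 0.05])
fast = star_tree_distribution([0.8, 0.8, 0.8])
mixed = mixture([0.7, 0.3], [slow, fast])
```

In this setting, the identifiability question is whether the mixing weights and the parameters of each component can be recovered from a distribution such as `mixed` alone. The sketch only constructs the distribution; it does not test identifiability.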