A statistical framework for comparing epidemic forests
By: Cyril Geismar, Peter J. White, Anne Cori and more
Potential Business Impact:
Tests whether different reconstructions of who infected whom in an outbreak differ significantly.
Inferring who infected whom in an outbreak is essential for characterising transmission dynamics and guiding public health interventions. However, this task is challenging due to limited surveillance data and the complexity of immunological and social interactions. Instead of a single definitive transmission tree, epidemiologists often consider multiple plausible trees, which together form an epidemic forest. Different inference methods and assumptions can yield different epidemic forests, yet no formal test exists to assess whether these differences are statistically significant. We propose such a framework using a chi-square test and permutational multivariate analysis of variance (PERMANOVA). We assessed each method's ability to distinguish simulated epidemic forests generated under different offspring distributions. While both methods achieved perfect specificity for forests with 100 or more trees, PERMANOVA consistently outperformed the chi-square test in sensitivity across all epidemic and forest sizes. We provide this framework, the first to robustly compare epidemic forests, in the R package mixtree.
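The PERMANOVA idea in the abstract can be illustrated with a minimal Python sketch. It does not reproduce mixtree or its API. It makes several assumptions: each transmission tree is stored as a vector giving the infector of every case, the distance between two trees is the fraction of cases whose infector differs (a simple stand-in, not necessarily the distance the authors use), and the two toy forests (noisy chain-like and star-like trees) are hypothetical. The pseudo-F statistic follows the standard PERMANOVA definition, and the p-value comes from permuting the forest labels.

```python
import numpy as np

def hamming_distance_matrix(trees):
    """Pairwise fraction of cases whose infector differs (assumed distance)."""
    arr = np.asarray(trees)  # shape: (n_trees, n_cases)
    return (arr[:, None, :] != arr[None, :, :]).mean(axis=2)

def pseudo_f(dist, labels):
    """Standard PERMANOVA pseudo-F computed from a distance matrix."""
    labels = np.asarray(labels)
    n = len(labels)
    groups = np.unique(labels)
    d2 = dist ** 2
    ss_total = d2[np.triu_indices(n, k=1)].sum() / n
    ss_within = 0.0
    for g in groups:
        idx = np.flatnonzero(labels == g)
        sub = d2[np.ix_(idx, idx)]
        ss_within += sub[np.triu_indices(len(idx), k=1)].sum() / len(idx)
    ss_among = ss_total - ss_within
    return (ss_among / (len(groups) - 1)) / (ss_within / (n - len(groups)))

def permanova(dist, labels, n_perm=999, seed=0):
    """Permutation p-value: shuffle forest labels and recompute pseudo-F."""
    rng = np.random.default_rng(seed)
    labels = np.asarray(labels)
    f_obs = pseudo_f(dist, labels)
    f_perm = np.array(
        [pseudo_f(dist, rng.permutation(labels)) for _ in range(n_perm)]
    )
    p_value = (np.sum(f_perm >= f_obs) + 1) / (n_perm + 1)
    return f_obs, p_value

def noisy_tree(base, rng, n_rewire=2):
    """Copy a base tree and reassign a few cases to a random earlier infector."""
    tree = base.copy()
    cases = rng.choice(np.arange(1, len(base)), size=n_rewire, replace=False)
    for case in cases:
        tree[case] = rng.integers(0, case)  # infector index is below the case index
    return tree

# Hypothetical toy forests: 10 cases, case 0 is the index case (infector -1).
rng = np.random.default_rng(42)
n_cases = 10
chain = np.arange(-1, n_cases - 1)           # each case infected by the previous one
star = np.array([-1] + [0] * (n_cases - 1))  # every case infected by case 0

forest_a = [noisy_tree(chain, rng) for _ in range(20)]
forest_b = [noisy_tree(star, rng) for _ in range(20)]

dist = hamming_distance_matrix(forest_a + forest_b)
labels = ["A"] * 20 + ["B"] * 20
f_stat, p_value = permanova(dist, labels)
print(f"pseudo-F = {f_stat:.2f}, p = {p_value:.3f}")
```

A small p-value here indicates that trees differ more between the two forests than within them. The smallest attainable p-value is 1 / (n_perm + 1), so more permutations are needed to resolve very small p-values.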
Similar Papers
Inference of epidemic networks: the effect of different data types
Computational Physics
Tracks disease spread better with more data.
A U-Statistic-based random forest approach for genetic interaction study
Genomics
Finds hidden gene links to health problems.
Empirical Likelihood for Random Forests and Ensembles
Machine Learning (Stat)
Quantifies computer predictions to show how sure they are.