Beyond Point Estimates: Toward Proper Statistical Inferencing and Reporting of Intraclass Correlation Coefficients
By: Yufeng Liu, Xiangfei Hong, Shanbao Tong
Potential Business Impact:
Makes brain scan results more trustworthy.
Reporting test-retest reliability using the intraclass correlation coefficient (ICC) has received increasing attention owing to criticism of poor transparency and replicability in neuroimaging research, as well as in many other biomedical fields. Numerous studies have therefore evaluated the reliability of their findings by comparing ICCs. However, they often failed to test for statistical differences between ICCs or to report confidence intervals. Relying solely on point estimates may preclude valid inference about population-level differences and compromise the reliability of conclusions. To address this issue, this study systematically reviewed the use of ICCs in articles published in NeuroImage from 2022 to 2024, highlighting the prevalence of misreporting and misuse of ICCs. We further provide practical guidelines for conducting appropriate statistical inference on ICCs. For practitioners in this area, we introduce an online application for statistical testing and sample size estimation when using ICCs. We recalculated confidence intervals and formally tested the ICC values reported in the reviewed articles, thereby reassessing the original inferences. Our results demonstrate that exclusive reliance on point estimates can lead to unreliable or even misleading conclusions. Specifically, only two of the eleven reviewed articles provided unequivocally valid statistical inferences based on ICCs, whereas two articles failed to yield any valid inference at all, raising serious concerns about the replicability of findings in this field. These results underscore the urgent need for rigorous inferential frameworks when reporting and interpreting ICCs.
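To make the gap between a point estimate and an interval concrete, the following is a minimal sketch of reporting an ICC together with a confidence interval. It is not the authors' method or their online application: it assumes the one-way random-effects form ICC(1,1) and substitutes a percentile bootstrap over subjects for whatever procedure the paper recommends. The function names `icc_oneway` and `bootstrap_ci` and the example scores are hypothetical.

```python
import random


def icc_oneway(data):
    """ICC(1,1) from a one-way random-effects ANOVA.

    data: list of n subjects, each a list of k repeated measurements.
    """
    n, k = len(data), len(data[0])
    grand = sum(sum(row) for row in data) / (n * k)
    means = [sum(row) / k for row in data]
    # Between-subject and within-subject mean squares.
    msb = k * sum((m - grand) ** 2 for m in means) / (n - 1)
    msw = sum((x - m) ** 2 for row, m in zip(data, means) for x in row) / (n * (k - 1))
    return (msb - msw) / (msb + (k - 1) * msw)


def bootstrap_ci(data, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for ICC(1,1), resampling whole subjects.

    Assumption: this is an illustrative stand-in, not the paper's procedure.
    """
    rng = random.Random(seed)
    reps = []
    for _ in range(n_boot):
        sample = [rng.choice(data) for _ in data]
        try:
            reps.append(icc_oneway(sample))
        except ZeroDivisionError:
            # Degenerate resample with no variance at all; skip it.
            continue
    reps.sort()
    lower = reps[int(alpha / 2 * len(reps))]
    upper = reps[int((1 - alpha / 2) * len(reps)) - 1]
    return lower, upper


# Hypothetical test-retest scores: 8 subjects, 2 sessions each.
scores = [[10.1, 9.8], [12.3, 12.9], [8.7, 9.5], [11.0, 10.2],
          [13.4, 13.1], [9.9, 11.0], [12.0, 11.6], [10.5, 10.9]]
estimate = icc_oneway(scores)
low, high = bootstrap_ci(scores)
print(f"ICC(1,1) = {estimate:.3f}, 95% bootstrap CI = [{low:.3f}, {high:.3f}]")
```

With small samples such as this one the interval is typically wide, which is why comparing two ICCs by their point estimates alone can mislead.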
Similar Papers
The Prevalence of Misreporting and Misinterpreting Correlation Coefficients in Biomedical Literature
Methodology
Fixes how scientists measure connections between things.
Intra-Class Correlation Coefficient Ignorable Clustered Randomized Trials for Detecting Treatment Effect Heterogeneity
Methodology
Lets scientists plan studies without guessing.
Stochasticity in Agentic Evaluations: Quantifying Inconsistency with Intraclass Correlation
Artificial Intelligence
Makes AI more reliable by measuring its consistency.