Hidden memory and stochastic fluctuations in science
By: Keisuke Okamura
Potential Business Impact:
Explains why some papers become famous and others don't.
Understanding the statistical laws governing citation dynamics remains a fundamental challenge in network theory and the science of science. Citation networks typically exhibit in-degree distributions well approximated by log-normal distributions, yet they also display power-law behaviour in the high-citation regime, presenting an apparent contradiction that lacks a unified explanation. Here, we identify a previously unrecognised phenomenon: the variance of the logarithm of citation counts per unit time follows a power law with respect to time since publication, scaling as $t^{H}$. This discovery introduces a new challenge while simultaneously offering a crucial clue to resolving this discrepancy. We develop a stochastic model in which latent attention to publications evolves through a memory-driven process incorporating cumulative advantage. This process is characterised by the Hurst parameter $H$, derived from fractional Brownian motion, and volatility. Our framework reconciles this contradiction by demonstrating that anti-persistent fluctuations ($H<\tfrac{1}{2}$) give rise to log-normal citation distributions, whereas persistent dynamics ($H>\tfrac{1}{2}$) favour heavy-tailed power laws. Numerical simulations confirm our model's explanatory and predictive power, interpolating between log-normal and power-law distributions while reproducing the $t^{H}$ law. Empirical analysis of arXiv e-prints further supports our theory, revealing an intrinsically anti-persistent nature with an upper bound of approximately $H=0.13$. By linking memory effects and stochastic fluctuations to broader network dynamics, our findings provide a unifying framework for understanding the evolution of collective attention in science and other attention-driven processes.
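The abstract describes a latent-attention process driven by fractional Brownian motion (fBm) with Hurst parameter $H$ and a volatility term, where the variance of log citation rates grows as a power law in time. Below is a minimal, illustrative Python sketch of that idea, not the authors' full model: it samples exact fBm paths via a Cholesky factorisation of the fBm covariance, exponentiates them as a toy "attention" rate, and fits the power-law growth of the log-rate variance. The values of `H` and `sigma`, and the simple exp-of-fBm rate, are placeholder assumptions; the paper's memory-driven process with cumulative advantage is richer and yields its own $t^{H}$ law.

```python
import numpy as np

def fbm_paths(H, n_steps, n_paths, T=1.0, seed=0):
    """Sample fractional Brownian motion paths on a regular grid using the
    Cholesky factor of the exact fBm covariance
    Cov(B_H(s), B_H(t)) = 0.5 * (s^{2H} + t^{2H} - |t - s|^{2H})."""
    rng = np.random.default_rng(seed)
    t = np.linspace(T / n_steps, T, n_steps)            # grid, excluding t = 0
    s, u = np.meshgrid(t, t, indexing="ij")
    cov = 0.5 * (s**(2 * H) + u**(2 * H) - np.abs(s - u)**(2 * H))
    L = np.linalg.cholesky(cov + 1e-10 * np.eye(n_steps))  # small jitter for stability
    z = rng.standard_normal((n_paths, n_steps))
    return t, z @ L.T                                    # shape: (n_paths, n_steps)

# Toy attention model (assumption, not the paper's model): the per-unit-time
# "citation rate" is exp(sigma * B_H(t)), i.e. log-attention is scaled fBm.
H, sigma = 0.13, 0.5        # H = 0.13 echoes the empirical upper bound quoted above
t, B = fbm_paths(H=H, n_steps=200, n_paths=5000)
log_rate = sigma * B

# For pure fBm the variance of the log rate grows as sigma^2 * t^(2H), so a
# log-log fit of Var[log rate] against t should recover an exponent near 2H.
var = log_rate.var(axis=0)
slope = np.polyfit(np.log(t), np.log(var), 1)[0]
print(f"fitted power-law exponent of Var[log rate] vs t: {slope:.2f} (2H = {2 * H:.2f})")
```

In this toy setup the anti-persistent case ($H<\tfrac{1}{2}$) produces slowly spreading, log-normal-like rates, while a persistent choice ($H>\tfrac{1}{2}$) spreads the log rates faster, which is the qualitative contrast the paper formalises.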
Similar Papers
Sub-exponential Growth of New Words and Names Online: A Piecewise Power-Law Model
Physics and Society
Explains how new ideas spread slower than expected.
Generalized Taylor's Law for Dependent and Heterogeneous Heavy-Tailed Data
Statistics Theory
Finds patterns in messy, connected data.
Fast Escape, Slow Convergence: Learning Dynamics of Phase Retrieval under Power-Law Data
Machine Learning (Stat)
Makes AI learn faster with tricky, uneven data.