An approach to Fisher-Rao metric for infinite dimensional non-parametric information geometry

Published: December 25, 2025 | arXiv ID: 2512.21451v1

By: Bing Cheng, Howell Tong

Non-parametric information geometry, being infinite dimensional, has long faced an "intractability barrier": the Fisher-Rao metric becomes a functional, which makes its inverse difficult to define. This paper introduces a novel framework that resolves the intractability through an Orthogonal Decomposition of the Tangent Space, $T_f M = S \oplus S^{\perp}$, where $S$ is an observable covariate subspace. From this decomposition we derive the Covariate Fisher Information Matrix (cFIM), denoted $G_f$, a finite-dimensional and computable representative of the information extractable from the manifold's geometry. By proving the Trace Theorem, $H_G(f)=\text{Tr}(G_f)$, we establish a rigorous foundation for the G-entropy previously introduced by us, identifying it not merely as a gradient-based regularizer but as a fundamental geometric invariant: the total explainable statistical information captured by the probability distribution associated with the model. Furthermore, we link $G_f$ to the second-order derivative (i.e., the curvature) of the KL-divergence, leading to the notion of a Covariate Cramér-Rao Lower Bound (CRLB). We show that $G_f$ is congruent to the Efficient Fisher Information Matrix, thereby providing fundamental limits on the variance of semi-parametric estimators. Finally, we apply our geometric framework to the Manifold Hypothesis, lifting it from a heuristic assumption to a testable condition of rank-deficiency in the cFIM. By defining the Information Capture Ratio, we obtain a rigorous method for estimating the intrinsic dimensionality of high-dimensional data. In short, our work bridges the gap between abstract information geometry and the demands of explainable AI, providing a tractable path to revealing the statistical coverage and efficiency of non-parametric models.
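
The abstract states the Trace Theorem $H_G(f)=\text{Tr}(G_f)$ and the rank-deficiency test but gives no explicit estimators, so the following is only a minimal numerical sketch under stated assumptions: the cFIM $G_f$ is taken to be the empirical second moment of the score along the observed covariate directions, and the Information Capture Ratio is read as the fraction of $\text{Tr}(G_f)$ carried by the top-$k$ eigenvalues. All names (`G_f`, `H_G`, `icr`) and the plug-in estimator itself are hypothetical, not the authors' construction.

```python
# Minimal sketch (assumptions as stated above): estimate the cFIM as the
# empirical second moment of the score, take the G-entropy as its trace per
# the Trace Theorem, and read the Information Capture Ratio as the
# cumulative eigenvalue share. None of these estimators come from the paper.
import numpy as np

rng = np.random.default_rng(0)

# Toy model: zero-mean Gaussian whose log-density is nearly flat in 3 of 5
# directions, so only 2 covariate directions carry appreciable information.
var = np.array([1.0, 1.0, 100.0, 100.0, 100.0])
X = rng.multivariate_normal(np.zeros(5), np.diag(var), size=50_000)

# Gaussian score in closed form: s(x) = -Sigma^{-1} x (diagonal Sigma here).
# With a fitted non-parametric model, this would be the model's gradient.
scores = -X / var

# Assumed plug-in cFIM: G_f ~= (1/n) * sum_k s(x_k) s(x_k)^T.
G_f = scores.T @ scores / len(X)

# Trace Theorem from the abstract: H_G(f) = Tr(G_f).
H_G = np.trace(G_f)

# Assumed Information Capture Ratio: share of Tr(G_f) in the top-k
# eigenvalues; an early plateau signals rank-deficiency of the cFIM.
eig = np.sort(np.linalg.eigvalsh(G_f))[::-1]
icr = np.cumsum(eig) / H_G

print(f"H_G(f) = Tr(G_f) = {H_G:.3f}")
print("ICR(k), k = 1..5:", np.round(icr, 4))  # saturates near k = 2
```

In this toy example the ICR reaches roughly 0.99 by $k=2$ and then plateaus, matching the two directions in which the log-density actually varies; under the paper's reading, such a plateau would indicate rank-deficiency of the cFIM and hence an intrinsic dimensionality of about 2.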

Category
Statistics: Machine Learning (stat.ML)