Limits To (Machine) Learning
By: Zhimin Chen, Bryan Kelly, Semyon Malamud
Machine learning (ML) methods are highly flexible, but their ability to approximate the true data-generating process is fundamentally constrained by finite samples. We characterize a universal lower bound, the Limits-to-Learning Gap (LLG), quantifying the unavoidable discrepancy between a model's empirical fit and the population benchmark. Recovering the true population $R^2$, therefore, requires correcting observed predictive performance by this bound. Using a broad set of variables, including excess returns, yields, credit spreads, and valuation ratios, we find that the implied LLGs are large. This indicates that standard ML approaches can substantially understate true predictability in financial data. We also derive LLG-based refinements to the classic Hansen and Jagannathan (1991) bounds, analyze implications for parameter learning in general-equilibrium settings, and show that the LLG provides a natural mechanism for generating excess volatility.
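The abstract's central claim, that observed predictive performance understates the population $R^2$ by a finite-sample gap, can be illustrated with a small simulation. The sketch below is not the paper's LLG derivation; the sample size `T`, predictor count `P`, target `POP_R2`, and the plain OLS estimator are all illustrative assumptions. It only shows that a model fit on a finite sample delivers an out-of-sample $R^2$ well below the true population $R^2$, which is the kind of discrepancy the LLG bound quantifies.

```python
import numpy as np

# Illustrative finite-sample simulation (not the paper's LLG formula):
# a model estimated on T observations of P predictors attains an
# out-of-sample R^2 far below the population R^2, even though the
# predictors genuinely forecast the target.
rng = np.random.default_rng(0)

T, P = 240, 60          # hypothetical sample size and number of predictors
POP_R2 = 0.05           # assumed population R^2 of the true signal

# Scale the true coefficients so that Var(signal) / Var(target) = POP_R2.
beta = rng.standard_normal(P)
beta *= np.sqrt(POP_R2 / (1.0 - POP_R2)) / np.linalg.norm(beta)

def out_of_sample_r2() -> float:
    """Fit OLS on one finite sample, evaluate R^2 on a large fresh sample."""
    X = rng.standard_normal((T, P))
    y = X @ beta + rng.standard_normal(T)
    b_hat = np.linalg.lstsq(X, y, rcond=None)[0]

    X_new = rng.standard_normal((20 * T, P))
    y_new = X_new @ beta + rng.standard_normal(20 * T)
    resid = y_new - X_new @ b_hat
    return 1.0 - resid.var() / y_new.var()

avg_oos_r2 = np.mean([out_of_sample_r2() for _ in range(200)])
print(f"population R^2:       {POP_R2:.3f}")
print(f"average observed R^2: {avg_oos_r2:.3f}")
print(f"finite-sample gap:    {POP_R2 - avg_oos_r2:.3f}")
```

Under these assumptions the observed out-of-sample $R^2$ is negative on average, so an uncorrected reading of the fit would conclude there is no predictability at all, which mirrors the abstract's point that standard approaches can substantially understate true predictability.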