Partitioning the Sample Space for a More Precise Shannon Entropy Estimation
By: Gabriel F. A. Bastos, Jugurta Montalvão
Potential Business Impact:
Measures how unpredictable data is, even when only a few samples are available.
Reliable data-driven estimation of Shannon entropy from small data sets, where the number of samples may be smaller than the number of possible outcomes, is a critical matter in several applications. In this paper, we introduce a discrete entropy estimator that exploits the decomposability property of entropy in combination with estimates of the missing mass and of the number of unseen outcomes, compensating for the negative bias those unseen outcomes induce. Experimental results show that the proposed method outperforms some classical estimators in undersampled regimes and performs comparably with some well-established state-of-the-art estimators.
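The ingredients named in the abstract can be illustrated with standard tools: a plug-in entropy estimate, the Good-Turing estimate of the missing mass (probability of all unseen outcomes), and a Chao1-style estimate of how many outcomes went unseen. The sketch below is not the authors' estimator; it is a minimal, hedged illustration of how such a correction could be assembled, with the unseen mass spread uniformly over the estimated unseen outcomes (an assumption made here for illustration only).

```python
import math
from collections import Counter

def plugin_entropy(samples):
    # Maximum-likelihood ("plug-in") entropy in nats; biased low when undersampled.
    n = len(samples)
    return -sum((c / n) * math.log(c / n) for c in Counter(samples).values())

def good_turing_missing_mass(samples):
    # Good-Turing estimate of the total probability of unseen outcomes:
    # (number of outcomes observed exactly once) / (sample size).
    counts = Counter(samples)
    f1 = sum(1 for c in counts.values() if c == 1)
    return f1 / len(samples)

def chao1_unseen_count(samples):
    # Chao1-style estimate of the number of unseen outcomes: f1^2 / (2 * f2),
    # with the standard bias-corrected fallback when no doubletons exist.
    counts = Counter(samples)
    f1 = sum(1 for c in counts.values() if c == 1)
    f2 = sum(1 for c in counts.values() if c == 2)
    return f1 * f1 / (2 * f2) if f2 > 0 else f1 * (f1 - 1) / 2

def corrected_entropy(samples):
    # Illustrative correction (not the paper's method): scale the observed
    # probabilities by the estimated seen mass, then add the entropy
    # contribution of the unseen outcomes, assumed uniform over the
    # Chao1-estimated unseen count.
    n = len(samples)
    m0 = good_turing_missing_mass(samples)
    u = chao1_unseen_count(samples)
    seen_mass = 1.0 - m0
    h = 0.0
    if seen_mass > 0:
        for c in Counter(samples).values():
            p = seen_mass * c / n
            h -= p * math.log(p)
    if m0 > 0 and u >= 1:
        p0 = m0 / u  # assumed uniform probability of each unseen outcome
        h -= u * p0 * math.log(p0)
    return h
```

In an undersampled regime the plug-in estimate alone underestimates entropy; adding an explicit term for the missing mass is one common way to push back against that negative bias.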
Similar Papers
Nonparametric Estimation of Joint Entropy through Partitioned Sample-Spacing Method
Statistics Theory
Measures the combined uncertainty of several variables at once.
Uncertainty Estimation using Variance-Gated Distributions
Machine Learning (CS)
Helps AI gauge how confident it should be in its answers.
Discrete State Diffusion Models: A Sample Complexity Perspective
Machine Learning (CS)
Shows how much data computers need to learn to generate text and other discrete data.