Flexible model for varying levels of zeros and outliers in count data
By: Touqeer Ahmad, Abid Hussain
Potential Business Impact:
Better counts for tricky data with many zeros.
Count regression models are essential for analyzing discrete dependent variables alongside covariates. However, when data exhibit outliers, overdispersion, and an abundance of zeros, traditional methods such as the zero-inflated negative binomial (ZINB) model sometimes fail to fit well, especially in the tails. This research presents a flexible, heavy-tailed discrete model as a robust alternative to the ZINB model. The proposed framework is built by extending the generalized Pareto distribution and its zero-inflated version to the discrete domain. This formulation addresses both overdispersion and zero inflation, providing added flexibility for heavy-tailed count data. The proposed models are evaluated through extensive simulation studies and real-world applications. The results show that they consistently outperform classic negative binomial and zero-inflated negative binomial regressions in goodness-of-fit, especially for datasets with many zeros and outliers. These results highlight the proposed framework's potential as a robust and flexible option for modeling complex count data.
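To make the construction in the abstract concrete, here is a minimal Python sketch of one common way to carry a generalized Pareto distribution over to counts: differencing its survival function at consecutive integers, then mixing in extra mass at zero. This is an illustration under stated assumptions, not necessarily the authors' exact formulation. The function names and the parameter names sigma (scale), xi (shape, assumed positive), and pi0 (extra probability of zero) are hypothetical and do not come from the paper.

```python
import math


def gpd_survival(x, sigma, xi):
    """Survival function S(x) of the continuous generalized Pareto
    distribution for x >= 0, scale sigma > 0 and shape xi > 0 (assumed)."""
    return (1.0 + xi * x / sigma) ** (-1.0 / xi)


def dgpd_pmf(k, sigma, xi):
    """Discretized GPD: P(Y = k) = S(k) - S(k + 1) for k = 0, 1, 2, ...
    (survival-function differencing, an assumed discretization)."""
    return gpd_survival(k, sigma, xi) - gpd_survival(k + 1, sigma, xi)


def zidgpd_pmf(k, pi0, sigma, xi):
    """Zero-inflated version: with probability pi0 the count is a
    structural zero, otherwise it follows the discretized GPD."""
    base = (1.0 - pi0) * dgpd_pmf(k, sigma, xi)
    return pi0 + base if k == 0 else base


if __name__ == "__main__":
    # Illustrative parameter values only; not taken from the paper.
    for k in range(5):
        print(k, zidgpd_pmf(k, pi0=0.3, sigma=2.0, xi=0.5))
```

In this sketch, a larger xi gives a heavier tail (more room for outliers), while pi0 controls the excess of zeros independently of the tail. In a regression setting the parameters would typically be linked to covariates, which is left out here.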
Similar Papers
Overall marginalized models for longitudinal zero-inflated count data
Methodology
Better understand health data with many zeros.
A more interpretable regression model for count data with excess of zeros
Methodology
Makes counting sick kids easier to understand.
A common zero-inflation bivariate Poisson model with comonotonic and counter-monotonic shocks
Methodology
Counts with too many zeros can now be understood.