Stepwise regression revisited

Published: March 6, 2025 | arXiv ID: 2503.04330v1

By: Román Salmerón Gómez, Catalina García García

Potential Business Impact:

Helps avoid false multicollinearity alarms in regression models that have many independent variables.

Business Areas:
Predictive Analytics, Artificial Intelligence, Data and Analytics, Software

This paper shows that the degree of approximate multicollinearity in a linear regression model increases simply as independent variables are added, even if they are not highly linearly related. Now that linear models with a large number of independent variables are relatively easy to find, this effect can lead to the erroneous conclusion that there is a worrying problem of approximate multicollinearity. To avoid this situation, an adjusted variance inflation factor is proposed to compensate for the presence of a large number of independent variables in the multiple linear regression model. It is shown that this proposal has a direct impact on variable selection models based on influence relationships, which translates into a new decision criterion for the individual significance test, to be considered in stepwise regression models or even directly in a multiple linear regression model.
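
The first claim of the abstract can be checked numerically. The sketch below is an illustration, not the authors' code. It uses the standard variance inflation factor, VIF_j = 1 / (1 - R_j^2), where R_j^2 comes from regressing variable j on the remaining independent variables plus an intercept. The paper's adjusted VIF is not reproduced here because its formula does not appear in this summary. The names vif, X, and vif_first, the sample size, and the simulated data are all assumptions made for the example.

```python
import numpy as np


def vif(X):
    """Standard VIF of each column of X (n x p); auxiliary regressions include an intercept."""
    n, p = X.shape
    out = np.empty(p)
    for j in range(p):
        y = X[:, j]
        others = np.delete(X, j, axis=1)
        A = np.column_stack([np.ones(n), others])
        # Least-squares fit of column j on the remaining columns.
        coef, *_ = np.linalg.lstsq(A, y, rcond=None)
        resid = y - A @ coef
        centered = y - y.mean()
        r2 = 1.0 - (resid @ resid) / (centered @ centered)
        out[j] = 1.0 / (1.0 - r2)
    return out


# Simulated regressors drawn independently, so none are strongly related (assumed setup).
rng = np.random.default_rng(0)
n = 50
X = rng.standard_normal((n, 30))

# VIF of the first variable as more independent variables enter the model.
sizes = (2, 5, 10, 20, 30)
vif_first = [vif(X[:, :p])[0] for p in sizes]

for p, v in zip(sizes, vif_first):
    print(f"p = {p:2d}  VIF of first variable = {v:.3f}")
```

Because each auxiliary regression is nested in the next, R_j^2 cannot decrease as columns are added, so the printed VIF values never fall even though the columns were generated independently.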

Page Count
22 pages

Category
Statistics: Methodology