How Usable is Automated Feature Engineering for Tabular Data?
By: Bastian Schäfer, Lennart Purucker, Maciej Janowski, and more
Potential Business Impact:
Makes computers create better data for learning.
Tabular data, consisting of rows and columns, is omnipresent across various machine learning applications. Each column represents a feature, and features can be combined or transformed to create new, more informative features. Such feature engineering is essential to achieve peak performance in machine learning. Since manual feature engineering is expensive and time-consuming, a substantial effort has been put into automating it. Yet, existing automated feature engineering (AutoFE) methods have never been investigated regarding their usability for practitioners. Thus, we investigated 53 AutoFE methods. We found that these methods are, in general, hard to use, lack documentation, and have no active communities. Furthermore, no method allows users to set time and memory constraints, which we see as a necessity for usable automation. Our survey highlights the need for future work on usable, well-engineered AutoFE methods.
Similar Papers
LLM-FE: Automated Feature Engineering for Tabular Data with LLMs as Evolutionary Optimizers
Machine Learning (CS)
Finds better data patterns for smarter computer predictions.
The Feature Understandability Scale for Human-Centred Explainable AI: Assessing Tabular Feature Importance
Human-Computer Interaction
Helps AI explain itself using easy words.