Score: 2

How Usable is Automated Feature Engineering for Tabular Data?

Published: August 19, 2025 | arXiv ID: 2508.13932v1

By: Bastian Schäfer, Lennart Purucker, Maciej Janowski and more

Potential Business Impact:

Shows where automated feature engineering tools must improve before practitioners can rely on them.

Business Areas:
Machine Learning, Artificial Intelligence, Data and Analytics, Software

Tabular data, consisting of rows and columns, is omnipresent across various machine learning applications. Each column represents a feature, and features can be combined or transformed to create new, more informative features. Such feature engineering is essential to achieve peak performance in machine learning. Since manual feature engineering is expensive and time-consuming, a substantial effort has been put into automating it. Yet, existing automated feature engineering (AutoFE) methods have never been investigated regarding their usability for practitioners. Thus, we investigated 53 AutoFE methods. We found that these methods are, in general, hard to use, lack documentation, and have no active communities. Furthermore, no method allows users to set time and memory constraints, which we see as a necessity for usable automation. Our survey highlights the need for future work on usable, well-engineered AutoFE methods.
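To make the abstract's two central ideas concrete, the following is a minimal sketch, not a method from the paper. It combines pairs of columns into new product features and stops once a user-supplied budget runs out. The function name `pairwise_products` and the parameters `time_budget_s` and `max_new_features` are hypothetical, and the cap on the number of new features is only a crude stand-in for a real memory constraint.

```python
import itertools
import time


def pairwise_products(table, time_budget_s=1.0, max_new_features=100):
    """Build product features from column pairs under simple budgets.

    `table` maps column names to equal-length lists of numbers.
    Both budget parameters are illustrative assumptions, not an API
    of any surveyed AutoFE method.
    """
    deadline = time.monotonic() + time_budget_s
    new_features = {}
    # Iterate over column pairs in a deterministic (sorted) order.
    for a, b in itertools.combinations(sorted(table), 2):
        # Stop early once the time budget or the feature cap is reached.
        if time.monotonic() >= deadline or len(new_features) >= max_new_features:
            break
        new_features[f"{a}*{b}"] = [x * y for x, y in zip(table[a], table[b])]
    return new_features


table = {"width": [2.0, 3.0], "height": [4.0, 5.0], "depth": [1.0, 2.0]}
features = pairwise_products(table, time_budget_s=0.5, max_new_features=2)
print(features)
```

With three columns there are three candidate pairs, but the cap of two new features stops the search before the third pair is generated. A usable tool would presumably expose comparable limits for wall-clock time and memory, which the abstract reports none of the surveyed methods do.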

Page Count
33 pages

Category
Computer Science:
Machine Learning (CS)