Improving Deep Tabular Learning
By: Sivan Sarafian, Yehudit Aperstein
Potential Business Impact:
Helps computers learn from messy data better.
Tabular data remain a dominant form of real-world information but pose persistent challenges for deep learning due to heterogeneous feature types, lack of natural structure, and limited label-preserving augmentations. As a result, ensemble models based on decision trees continue to dominate benchmark leaderboards. In this work, we introduce RuleNet, a transformer-based architecture specifically designed for deep tabular learning. RuleNet incorporates learnable rule embeddings in a decoder, a piecewise linear quantile projection for numerical features, and feature masking ensembles for robustness and uncertainty estimation. Evaluated on eight benchmark datasets, RuleNet matches or surpasses state-of-the-art tree-based methods in most cases, while remaining computationally efficient, offering a practical neural alternative for tabular prediction tasks.
Similar Papers
TABLET: Table Structure Recognition using Encoder-only Transformers
CV and Pattern Recognition
Helps computers understand messy tables faster.
Basis Transformers for Multi-Task Tabular Regression
Machine Learning (CS)
Helps computers understand messy data better.
Learning Decision Trees as Amortized Structure Inference
Machine Learning (CS)
Builds smarter computer programs that learn from data.