Can Graphs Improve Tabular Foundation Models?
By: Franck Le, Keith Grueneberg, Erich Nahum, et al.
Tabular data are central to many real-world systems. While recent tabular transformers and in-context learners such as SAINT, TP-BERTa, TabPFN, TabICL, and MITRA incorporate limited inter-row reasoning, most approaches still lack an explicit mechanism for modeling relationships among instances, even though similar samples often share related outcomes. We investigate whether introducing simple graph priors can enhance pretrained tabular transformers. Concretely, we introduce BOLERO, a lightweight, static bipartite graph head that augments RoBERTa-Tab (a RoBERTa-style tabular backbone pretrained with masked-token prediction). Each instance connects to feature/value anchors; a small GNN refines the row representations while the backbone remains frozen. We evaluate on 80 classification and 64 regression datasets from the TP-BERTa benchmark suites, comparing against strong baselines including XGBoost, CatBoost, TabPFN-v2, MITRA, TabICL, TP-BERTa, and RoBERTa-Tab. To ensure statistically sound conclusions, we follow best practices for multi-dataset evaluation: pairwise Wilcoxon signed-rank tests on per-dataset score differences, together with effect sizes (median improvement with confidence intervals), rather than mean-rank post-hoc tests whose outcomes depend on the competitor pool. BOLERO achieves the highest number of statistically significant wins on both classification and regression, demonstrating that lightweight graph priors meaningfully improve pretrained tabular transformers.
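The bipartite graph head lends itself to a compact illustration. Below is a minimal sketch, assuming a PyTorch setting where the frozen RoBERTa-Tab backbone has already produced one embedding per row; the class name `BipartiteGraphHead`, the anchor construction, and the mean-aggregation update are illustrative assumptions, not the authors' released implementation.

```python
import torch
import torch.nn as nn

class BipartiteGraphHead(nn.Module):
    """Refines frozen row embeddings via message passing over a static
    row <-> feature/value-anchor bipartite graph (illustrative sketch)."""

    def __init__(self, dim: int, num_anchors: int, num_layers: int = 2):
        super().__init__()
        # Learnable embeddings for the feature/value anchor nodes.
        self.anchors = nn.Parameter(torch.randn(num_anchors, dim) * 0.02)
        self.layers = nn.ModuleList(
            nn.Sequential(nn.Linear(2 * dim, dim), nn.ReLU())
            for _ in range(num_layers)
        )

    def forward(self, rows: torch.Tensor, adj: torch.Tensor) -> torch.Tensor:
        # rows: (n_rows, dim) embeddings from the frozen backbone.
        # adj:  (n_rows, num_anchors) 0/1 incidence matrix; adj[i, j] = 1
        #       iff row i contains the feature/value bucket of anchor j.
        h_rows, h_anchors = rows, self.anchors
        deg_rows = adj.sum(dim=1, keepdim=True).clamp(min=1)
        deg_anchors = adj.sum(dim=0, keepdim=True).clamp(min=1).T
        for layer in self.layers:
            # Mean-aggregate messages in both directions, then apply a
            # residual update so the backbone representation is preserved.
            msg_to_rows = (adj @ h_anchors) / deg_rows
            msg_to_anchors = (adj.T @ h_rows) / deg_anchors
            h_rows = h_rows + layer(torch.cat([h_rows, msg_to_rows], dim=-1))
            h_anchors = h_anchors + layer(torch.cat([h_anchors, msg_to_anchors], dim=-1))
        return h_rows

# Usage: refine a batch of frozen backbone embeddings.
head = BipartiteGraphHead(dim=768, num_anchors=120)
rows = torch.randn(32, 768)                  # stand-in for frozen RoBERTa-Tab embeddings
adj = (torch.rand(32, 120) < 0.1).float()    # toy incidence matrix
refined = head(rows, adj)                    # (32, 768)
```

The evaluation protocol is also easy to sketch. The snippet below, assuming per-dataset scores for two models aligned over the same datasets, runs a pairwise Wilcoxon signed-rank test and bootstraps a confidence interval on the median improvement; the helper name and the toy scores are placeholders, not the paper's results.

```python
import numpy as np
from scipy.stats import wilcoxon

def compare_models(scores_a, scores_b, n_boot=10_000, seed=0):
    """Paired comparison over datasets: Wilcoxon p-value plus a bootstrap
    95% CI on the median per-dataset improvement of model A over model B."""
    diffs = np.asarray(scores_a) - np.asarray(scores_b)
    _, p_value = wilcoxon(diffs)  # signed-rank test on paired differences
    rng = np.random.default_rng(seed)
    medians = [np.median(rng.choice(diffs, size=diffs.size, replace=True))
               for _ in range(n_boot)]
    lo, hi = np.percentile(medians, [2.5, 97.5])
    return p_value, float(np.median(diffs)), (lo, hi)

# Toy illustration with made-up scores for six datasets.
p, med, ci = compare_models([0.910, 0.840, 0.770, 0.880, 0.930, 0.800],
                            [0.895, 0.850, 0.745, 0.860, 0.900, 0.795])
print(f"p={p:.3f}, median improvement={med:.3f}, 95% CI=({ci[0]:.3f}, {ci[1]:.3f})")
```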