iLTM: Integrated Large Tabular Model
By: David Bonet, Marçal Comajoan Cara, Alvaro Calafell, and more
Potential Business Impact:
Improves predictions from spreadsheet-style (tabular) data while needing less hand-tuning than current methods.
Tabular data underpins decisions across science, industry, and public services. Despite rapid progress, advances in deep learning have not fully carried over to the tabular domain, where gradient-boosted decision trees (GBDTs) remain a default choice in practice. We present iLTM, an integrated Large Tabular Model that unifies tree-derived embeddings, dimensionality-agnostic representations, a meta-trained hypernetwork, multilayer perceptrons (MLPs), and retrieval within a single architecture. Pretrained on more than 1,800 heterogeneous classification datasets, iLTM achieves consistently superior performance across tabular classification and regression tasks, from small datasets to large, high-dimensional ones. After light fine-tuning, the meta-trained hypernetwork transfers to regression targets, matching or surpassing strong baselines. Extensive experiments show that iLTM outperforms well-tuned GBDTs and leading deep tabular models while requiring less task-specific tuning. By bridging the gap between tree-based and neural methods, iLTM offers a new foundation-model framework for robust, adaptable, and scalable tabular learning.
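The abstract names the architectural pieces but not their implementation, so the following is a minimal, self-contained sketch of how those pieces could fit together on a toy task: tree-derived embeddings (here, hashed leaf indices from a random forest standing in for a learned leaf-embedding table), a hypernetwork that emits the weights of a small MLP head from a dataset-level summary, and retrieval over training embeddings. Every name, shape, and design choice below is an illustrative assumption, not the authors' code.

```python
# A minimal sketch, assuming simple stand-ins for each component the
# abstract names; names, shapes, and the hashed leaf embedding are
# illustrative, not the authors' implementation.
import numpy as np
import torch
import torch.nn.functional as F
from torch import nn, optim
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

X, y = load_breast_cancer(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# 1) Tree-derived embeddings: represent each sample by the leaves it
#    reaches, hashed into a fixed width so the representation is
#    dimensionality-agnostic (a learned leaf-embedding table would
#    replace the hashing in a real model).
forest = RandomForestClassifier(n_estimators=32, random_state=0).fit(X_tr, y_tr)
DIM = 64

def tree_embed(X):
    leaves = forest.apply(X)                       # (n, n_estimators)
    emb = np.zeros((len(X), DIM))
    for t in range(leaves.shape[1]):
        emb[np.arange(len(X)), (leaves[:, t] * (t + 1)) % DIM] += 1.0
    return torch.tensor(emb / leaves.shape[1], dtype=torch.float32)

E_tr, E_te = tree_embed(X_tr), tree_embed(X_te)

# 2) Hypernetwork: map a dataset-level summary to the weights of a
#    small MLP head (a single linear layer here, for brevity).
n_classes = 2
hyper = nn.Linear(DIM, n_classes * (DIM + 1))      # emits flattened W and b
context = E_tr.mean(dim=0, keepdim=True)           # crude dataset summary

def head(E):
    params = hyper(context).view(-1)
    W = params[: n_classes * DIM].view(n_classes, DIM)
    b = params[n_classes * DIM:]
    return F.linear(E, W, b)

# Light fine-tuning of the hypernetwork on the target task.
opt = optim.Adam(hyper.parameters(), lr=1e-2)
targets = torch.tensor(y_tr)
for _ in range(200):
    opt.zero_grad()
    F.cross_entropy(head(E_tr), targets).backward()
    opt.step()

# 3) Retrieval: blend the head's probabilities with the label
#    distribution of the nearest training embeddings.
with torch.no_grad():
    sims = F.normalize(E_te, dim=1) @ F.normalize(E_tr, dim=1).T
    knn = targets[sims.topk(k=8, dim=1).indices]   # (n_te, k)
    knn_prob = torch.stack(
        [(knn == c).float().mean(dim=1) for c in range(n_classes)], dim=1)
    probs = 0.5 * F.softmax(head(E_te), dim=1) + 0.5 * knn_prob
print("accuracy:", (probs.argmax(dim=1).numpy() == y_te).mean())
```

The even blend between the hypernetwork head and the retrieval vote is an arbitrary choice for this sketch; the paper's architecture integrates these components jointly rather than averaging their outputs.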
Similar Papers
Large Language Models as Universal Predictors? An Empirical Study on Small Tabular Datasets
Artificial Intelligence
Lets computers learn from small datasets.
MachineLearningLM: Continued Pretraining Language Models on Millions of Synthetic Tabular Prediction Tasks Scales In-Context ML
Computation and Language
Teaches computers to learn from many examples.
TabGemma: Text-Based Tabular ICL via LLM using Continued Pretraining and Retrieval
Machine Learning (CS)
Helps computers understand and predict from mixed data.