XNNTab -- Interpretable Neural Networks for Tabular Data using Sparse Autoencoders
By: Khawla Elhadri, Jörg Schlötterer, Christin Seifert
In data-driven applications that rely on tabular data and where interpretability is key, practitioners typically use machine learning models such as decision trees and linear regression. Although neural networks can deliver higher predictive performance, their black-box nature keeps them out of such applications. In this work, we present XNNTab, a neural architecture that combines the expressiveness of neural networks with intrinsic interpretability. XNNTab first learns highly non-linear feature representations, which are then decomposed into monosemantic features using a sparse autoencoder (SAE). These features are assigned human-interpretable concepts, making the overall model prediction intrinsically interpretable. XNNTab outperforms interpretable predictive models and achieves performance comparable to its non-interpretable counterparts.
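The abstract describes a three-stage pipeline: a backbone that learns non-linear representations, an SAE that decomposes them into monosemantic features, and a prediction made from those features. The following is a minimal sketch of how such a pipeline could be wired up in PyTorch; all module names, layer sizes, and the combined loss (task loss + reconstruction + L1 sparsity) are assumptions for illustration, not the paper's actual implementation.

```python
import torch
import torch.nn as nn

# Hypothetical sketch of the pipeline described in the abstract:
# (1) an MLP backbone learns non-linear feature representations,
# (2) a sparse autoencoder (SAE) decomposes them into monosemantic features,
# (3) a linear head predicts from the SAE features, so each prediction is a
#     weighted sum of (concept-annotated) feature activations.

class SparseAutoencoder(nn.Module):
    def __init__(self, dim: int, n_features: int):
        super().__init__()
        self.encoder = nn.Linear(dim, n_features)
        self.decoder = nn.Linear(n_features, dim)

    def forward(self, h: torch.Tensor):
        z = torch.relu(self.encoder(h))  # sparse, non-negative activations
        h_hat = self.decoder(z)          # reconstruction of the representation
        return z, h_hat

class XNNTabSketch(nn.Module):
    def __init__(self, n_inputs: int, dim: int = 64,
                 n_features: int = 256, n_classes: int = 2):
        super().__init__()
        self.backbone = nn.Sequential(   # learns non-linear representations
            nn.Linear(n_inputs, dim), nn.ReLU(),
            nn.Linear(dim, dim), nn.ReLU(),
        )
        self.sae = SparseAutoencoder(dim, n_features)
        self.head = nn.Linear(n_features, n_classes)

    def forward(self, x: torch.Tensor):
        h = self.backbone(x)
        z, h_hat = self.sae(h)
        return self.head(z), z, h, h_hat

# Usage with an assumed joint objective:
# task loss + SAE reconstruction + L1 sparsity on the activations.
model = XNNTabSketch(n_inputs=10)
x = torch.randn(8, 10)
y = torch.randint(0, 2, (8,))
logits, z, h, h_hat = model(x)
loss = (nn.functional.cross_entropy(logits, y)
        + nn.functional.mse_loss(h_hat, h)
        + 1e-3 * z.abs().mean())
loss.backward()
```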