Score: 0

In Search of Grandmother Cells: Tracing Interpretable Neurons in Tabular Representations

Published: January 7, 2026 | arXiv ID: 2601.03657v1

By: Ricardo Knauer, Erik Rodner

Potential Business Impact:

Finds "idea cells" inside smart computer programs.

Business Areas:

Natural Language Processing Artificial Intelligence, Data and Analytics, Software

Foundation models are powerful yet often opaque in their decision-making. A topic of continued interest in both neuroscience and artificial intelligence is whether some neurons behave like grandmother cells, i.e., neurons that are inherently interpretable because they exclusively respond to single concepts. In this work, we propose two information-theoretic measures that quantify the neuronal saliency and selectivity for single concepts. We apply these metrics to the representations of TabPFN, a tabular foundation model, and perform a simple search across neuron-concept pairs to find the most salient and selective pair. Our analysis provides the first evidence that some neurons in such models show moderate, statistically significant saliency and selectivity for high-level concepts. These findings suggest that interpretable neurons can emerge naturally and that they can, in some cases, be identified without resorting to more complex interpretability techniques.

XNNTab -- Interpretable Neural Networks for Tabular Data using Sparse Autoencoders

Machine Learning (CS)

Lets smart computers explain their decisions.

15 Dec 2025 0

87%

Faithful and Stable Neuron Explanations for Trustworthy Mechanistic Interpretability

Artificial Intelligence

Makes AI's "thinking" understandable and trustworthy.

19 Dec 2025 1

87%

Towards Interpretable Deep Neural Networks for Tabular Data

Machine Learning (CS)

Explains computer decisions made from data.

10 Sep 2025 3

View PDF Login to Bookmark

Page Count

9 pages

In Search of Grandmother Cells: Tracing Interpretable Neurons in Tabular Representations

Finds "idea cells" inside smart computer programs.

Technical Abstract

XNNTab -- Interpretable Neural Networks for Tabular Data using Sparse Autoencoders

Faithful and Stable Neuron Explanations for Trustworthy Mechanistic Interpretability

Towards Interpretable Deep Neural Networks for Tabular Data