Unveiling Latent Knowledge in Chemistry Language Models through Sparse Autoencoders

Published: December 8, 2025 | arXiv ID: 2512.08077v1

By: Jaron Cohen, Alexander G. Hasson, Sara Tanovic

Potential Business Impact:

Unlocks AI's hidden chemical knowledge for faster discoveries.

Business Areas:

Natural Language Processing, Artificial Intelligence, Data and Analytics, Software

Since the advent of machine learning, interpretability has remained a persistent challenge, becoming increasingly urgent as generative models support high-stakes applications in drug and material discovery. Recent advances in large language model (LLM) architectures have yielded chemistry language models (CLMs) with impressive capabilities in molecular property prediction and molecular generation. However, how these models internally represent chemical knowledge remains poorly understood. In this work, we extend sparse autoencoder techniques to uncover and examine interpretable features within CLMs. Applying our methodology to the Foundation Models for Materials (FM4M) SMI-TED chemistry foundation model, we extract semantically meaningful latent features and analyse their activation patterns across diverse molecular datasets. Our findings reveal that these models encode a rich landscape of chemical concepts. We identify correlations between specific latent features and distinct domains of chemical knowledge, including structural motifs, physicochemical properties, and pharmacological drug classes. Our approach provides a generalisable framework for uncovering latent knowledge in chemistry-focused AI systems. This work has implications for both foundational understanding and practical deployment, with the potential to accelerate computational chemistry research.
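For readers unfamiliar with the technique named in the abstract, the following is a minimal sketch of a generic sparse autoencoder: a ReLU encoder that expands a model activation into a wider latent vector, a linear decoder that reconstructs the activation, and a loss combining reconstruction error with an L1 sparsity penalty. It is not the authors' implementation. The abstract does not specify their architecture, so the function names, the toy dimensions, the `l1_coeff` value, and the random stand-in for a SMI-TED activation are all assumptions made for illustration.

```python
import random

def relu(v):
    return [max(0.0, a) for a in v]

def matvec(W, v):
    # W is a list of rows; returns the matrix-vector product W @ v.
    return [sum(w * a for w, a in zip(row, v)) for row in W]

def sae_forward(x, W_enc, b_enc, W_dec, b_dec):
    """Encode activation x into non-negative latents, then reconstruct x."""
    pre = [p + b for p, b in zip(matvec(W_enc, x), b_enc)]
    z = relu(pre)  # latent features; most should be zero after training
    x_hat = [r + b for r, b in zip(matvec(W_dec, z), b_dec)]
    return z, x_hat

def sae_loss(x, z, x_hat, l1_coeff=1e-3):
    """Squared reconstruction error plus an L1 penalty on the latents."""
    recon = sum((a - b) ** 2 for a, b in zip(x, x_hat))
    sparsity = sum(abs(a) for a in z)
    return recon + l1_coeff * sparsity

# Hypothetical toy sizes: a 4-dim "activation" expanded into 8 latents.
rng = random.Random(0)
d_model, d_latent = 4, 8
W_enc = [[rng.gauss(0, 0.5) for _ in range(d_model)] for _ in range(d_latent)]
W_dec = [[rng.gauss(0, 0.5) for _ in range(d_latent)] for _ in range(d_model)]
b_enc = [0.0] * d_latent
b_dec = [0.0] * d_model

# Random stand-in for an activation; real use would take it from the model.
x = [rng.gauss(0, 1) for _ in range(d_model)]
z, x_hat = sae_forward(x, W_enc, b_enc, W_dec, b_dec)
loss = sae_loss(x, z, x_hat)
```

In practice the weights would be trained by gradient descent on this loss over many activations, and individual latent dimensions would then be inspected for correlation with chemical concepts such as the structural motifs or drug classes mentioned above.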

How LLMs Learn: Tracing Internal Representations with Sparse Autoencoders

Computation and Language

Helps computers learn languages and ideas better.

9 Mar 2025

88%

On the Theoretical Foundation of Sparse Dictionary Learning in Mechanistic Interpretability

Machine Learning (CS)

Unlocks AI's hidden thoughts for better understanding.

5 Dec 2025

88%

Resurrecting the Salmon: Rethinking Mechanistic Interpretability with Domain-Specific Sparse Autoencoders

Machine Learning (CS)

Helps AI understand medical words better.

12 Aug 2025

Country of Origin

🇬🇧 United Kingdom

Repos / Data Links

github.com github.com

Page Count

18 pages
