What We Don't C: Representations for scientific discovery beyond VAEs

Published: November 12, 2025 | arXiv ID: 2511.09433v1

By: Brian Rogers, Micah Bowles, Chris J. Lintott and more

Potential Business Impact:

Finds hidden patterns in complex, high-dimensional data.

Business Areas:
Machine Learning, Artificial Intelligence, Data and Analytics, Software

Accessing information in learned representations is critical for scientific discovery in high-dimensional domains. We introduce a novel method based on latent flow matching with classifier-free guidance that disentangles latent subspaces by explicitly separating information included in conditioning from information that remains in the residual representation. Across three experiments -- a synthetic 2D Gaussian toy problem, colored MNIST, and the Galaxy10 astronomy dataset -- we show that our method enables access to meaningful features of high-dimensional data. Our results highlight a simple yet powerful mechanism for analyzing, controlling, and repurposing latent representations, providing a pathway toward using generative models for scientific exploration of what we don't capture, consider, or catalog.
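
The abstract does not spell out the guidance rule, so the following is only a minimal sketch of how classifier-free guidance is commonly combined with a flow-matching velocity field, set in a 2D Gaussian toy setting. It is not the authors' implementation. The function names, the closed-form Gaussian velocity fields standing in for trained networks, the choice of a broad zero-mean Gaussian as the "unconditional" target, and the guidance weight `w` are all assumptions made for illustration.

```python
import numpy as np

def gaussian_velocity(x, t, mu, s):
    """Closed-form marginal velocity for the linear path
    x_t = (1 - t) * x0 + t * x1, with x0 ~ N(0, I) and x1 ~ N(mu, s^2 I)
    drawn independently. Stands in for a trained network (assumption)."""
    var_t = (1.0 - t) ** 2 + (t * s) ** 2
    coef = (t * s ** 2 - (1.0 - t)) / var_t
    return mu + coef * (x - t * mu)

def guided_velocity(v_uncond, v_cond, w):
    """Common classifier-free guidance form (assumed here):
    w = 0 gives the unconditional field, w = 1 the conditional one."""
    return v_uncond + w * (v_cond - v_uncond)

def sample(mu_cond, w, n=2000, steps=100, seed=0):
    """Euler-integrate the guided velocity field from t = 0 to t = 1."""
    rng = np.random.default_rng(seed)
    x = rng.standard_normal((n, 2))  # noise samples at t = 0
    dt = 1.0 / steps
    for k in range(steps):
        t = k * dt
        v_c = gaussian_velocity(x, t, mu_cond, 0.5)      # toy conditional target
        v_u = gaussian_velocity(x, t, np.zeros(2), 2.0)  # toy unconditional target
        x = x + dt * guided_velocity(v_u, v_c, w)
    return x

mu = np.array([3.0, -1.0])          # hypothetical class mean
x_cond = sample(mu, w=1.0)          # follows the conditional field
x_uncond = sample(mu, w=0.0)        # ignores the conditioning
```

In this sketch, `w = 1.0` moves the samples toward the conditional mean `mu`, while `w = 0.0` leaves them centered near the origin. In the paper's setting the velocity fields would be learned and would act on latent representations rather than on raw 2D points.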

Country of Origin
🇬🇧 United Kingdom

Page Count
10 pages

Category
Computer Science:
Artificial Intelligence