What We Don't C: Representations for scientific discovery beyond VAEs
By: Brian Rogers, Micah Bowles, Chris J. Lintott, and more
Potential Business Impact:
Finds hidden patterns in complex data.
Accessing information in learned representations is critical for scientific discovery in high-dimensional domains. We introduce a novel method based on latent flow matching with classifier-free guidance that disentangles latent subspaces by explicitly separating the information included in the conditioning from the information that remains in the residual representation. Across three experiments -- a synthetic 2D Gaussian toy problem, colored MNIST, and the Galaxy10 astronomy dataset -- we show that our method enables access to meaningful features of high-dimensional data. Our results highlight a simple yet powerful mechanism for analyzing, controlling, and repurposing latent representations, providing a pathway toward using generative models for scientific exploration of what we don't capture, consider, or catalog.
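To make the core mechanism concrete, here is a minimal sketch of classifier-free guidance applied to a latent flow-matching sampler, not the authors' implementation. The network `velocity_net`, the placeholder `null_cond`, the guidance weight `w`, and the Euler integrator are illustrative assumptions.

```python
# Minimal sketch (assumed, not the paper's code) of classifier-free
# guidance for latent flow matching. A single velocity network is
# queried with and without the conditioning; the gap between the two
# velocities isolates the information carried by the conditioning,
# while everything else stays in the residual latent trajectory.
import torch


def cfg_velocity(velocity_net, z_t, t, cond, null_cond, w=2.0):
    """Guided velocity field: v_null + w * (v_cond - v_null)."""
    v_cond = velocity_net(z_t, t, cond)       # conditional velocity
    v_null = velocity_net(z_t, t, null_cond)  # unconditional velocity
    return v_null + w * (v_cond - v_null)


@torch.no_grad()
def sample(velocity_net, z0, cond, null_cond, steps=50, w=2.0):
    """Integrate the guided flow ODE from noise z0 with Euler steps."""
    z = z0
    dt = 1.0 / steps
    for i in range(steps):
        t = torch.full((z.shape[0],), i * dt, device=z.device)
        z = z + dt * cfg_velocity(velocity_net, z, t, cond, null_cond, w)
    return z
```

In standard classifier-free guidance, the same network learns both velocity fields by randomly replacing `cond` with `null_cond` during training, which is what allows the conditioned and residual subspaces to be separated at sampling time.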
Similar Papers
Both Semantics and Reconstruction Matter: Making Representation Encoders Ready for Text-to-Image Generation and Editing
CV and Pattern Recognition
Makes AI create better, more detailed pictures.
Physically Interpretable Representation Learning with Gaussian Mixture Variational AutoEncoder (GM-VAE)
Machine Learning (CS)
Finds hidden patterns in messy science data.
Learning Minimal Representations of Many-Body Physics from Snapshots of a Quantum Simulator
Quantum Physics
Teaches computers to find hidden patterns in quantum experiments.