Understanding Cross-Model Perceptual Invariances Through Ensemble Metamers
By: Lukas Boehm, Jonas Leo Mueller, Christoffer Loeffler, and more
Potential Business Impact:
Makes AI see the world like humans.
Understanding the perceptual invariances of artificial neural networks is essential for improving explainability and aligning models with human vision. Metamers (stimuli that are physically distinct yet produce identical neural activations) serve as a valuable tool for investigating these invariances. We introduce a novel approach to metamer generation by leveraging ensembles of artificial neural networks, capturing shared representational subspaces across diverse architectures, including convolutional neural networks and vision transformers. To characterize the properties of the generated metamers, we employ a suite of image-based metrics that assess factors such as semantic fidelity and naturalness. Our findings show that convolutional neural networks generate more recognizable and human-like metamers, while vision transformers produce realistic but less transferable metamers, highlighting the impact of architectural biases on representational invariances.
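The core idea, synthesizing a stimulus whose activations match a reference image's activations in every model of an ensemble simultaneously, can be sketched with gradient descent. The snippet below is a minimal illustration, not the authors' implementation: the "models" are hypothetical random linear feature extractors standing in for truncated CNN/ViT backbones, and the loss is a simple sum of per-model activation-matching errors.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical linear stand-ins for the ensemble's feature extractors;
# a real experiment would use CNN/ViT activations at a chosen layer.
ensemble = [rng.standard_normal((16, 64)) / 8.0 for _ in range(3)]

reference = rng.random(64)                    # image whose activations we match
targets = [W @ reference for W in ensemble]   # reference activations per model

metamer = rng.random(64)                      # start from a different random stimulus
init_loss = sum(np.sum((W @ metamer - t) ** 2)
                for W, t in zip(ensemble, targets))

lr = 0.1
for step in range(2000):
    # Gradient of the summed matching loss  sum_i ||W_i x - t_i||^2
    grad = sum(2.0 * W.T @ (W @ metamer - t) for W, t in zip(ensemble, targets))
    metamer -= lr * grad
    metamer = np.clip(metamer, 0.0, 1.0)      # keep the stimulus in a valid pixel range

loss = sum(np.sum((W @ metamer - t) ** 2) for W, t in zip(ensemble, targets))
print(f"initial loss: {init_loss:.4f}, final loss: {loss:.4f}")
```

Because the stacked ensemble constraints underdetermine the 64-dimensional stimulus, many physically distinct images satisfy them; the optimized `metamer` differs from `reference` yet reproduces its activations in all three models, which is exactly what makes such stimuli useful probes of shared invariances.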
Similar Papers
Exploring Synergistic Ensemble Learning: Uniting CNNs, MLP-Mixers, and Vision Transformers to Enhance Image Classification
CV and Pattern Recognition
Makes computers see better by mixing smart brains.
Perceptual Reality Transformer: Neural Architectures for Simulating Neurological Perception Conditions
Neurons and Cognition
Lets you see like people with brain conditions.
Exploring specialization and sensitivity of convolutional neural networks in the context of simultaneous image augmentations
Machine Learning (Stat)
Helps computers explain their decisions like doctors.