Can We Test Consciousness Theories on AI? Ablations, Markers, and Robustness
By: Yin Jun Phua
The search for reliable indicators of consciousness has fragmented into competing theoretical camps (Global Workspace Theory (GWT), Integrated Information Theory (IIT), and Higher-Order Theories (HOT)), each proposing distinct neural signatures. We adopt a synthetic neuro-phenomenology approach: constructing artificial agents that embody these mechanisms to test their functional consequences through precise architectural ablations impossible in biological systems. Across three experiments, we report dissociations suggesting these theories describe complementary functional layers rather than competing accounts. In Experiment 1, a no-rewire Self-Model lesion abolishes metacognitive calibration while preserving first-order task performance, yielding a synthetic blindsight analogue consistent with HOT predictions. In Experiment 2, workspace capacity proves causally necessary for information access: a complete workspace lesion produces qualitative collapse in access-related markers, while partial reductions show graded degradation, consistent with GWT's ignition framework. In Experiment 3, we uncover a broadcast-amplification effect: GWT-style broadcasting amplifies internal noise, creating extreme fragility. The B2 agent family is robust to the same latent perturbation; this robustness persists in a Self-Model-off / workspace-read control, cautioning against attributing the effect solely to $z_{\text{self}}$ compression. We also report an explicit negative result: raw perturbational complexity (PCI-A) decreases under the workspace bottleneck, cautioning against naive transfer of IIT-adjacent proxies to engineered agents. These results suggest a hierarchical design principle: GWT provides broadcast capacity, while HOT provides quality control. We emphasize that our agents are not conscious; they are reference implementations for testing functional predictions of consciousness theories.
Similar Papers
Testing the Machine Consciousness Hypothesis
Artificial Intelligence
Makes computers understand themselves by talking.
Artificial Consciousness as Interface Representation
Artificial Intelligence
Tests if computers can feel or think like us.
Consciousness as a Functor
Artificial Intelligence
Explains how thoughts move between conscious and unconscious.