Extracting Robust Register Automata from Neural Networks over Data Sequences
By: Chih-Duo Hong, Hongjian Jiang, Anthony W. Lin, and more
Potential Business Impact:
Lets computers automatically check whether AI models that process numeric sequences behave reliably.
Automata extraction is a method for synthesising interpretable surrogates for black-box neural models that can be analysed symbolically. Existing techniques assume a finite input alphabet, and thus are not directly applicable to data sequences drawn from continuous domains. We address this challenge with deterministic register automata (DRAs), which extend finite automata with registers that store and compare numeric values. Our main contribution is a framework for robust DRA extraction from black-box models: we develop a polynomial-time robustness checker for DRAs with a fixed number of registers, and combine it with passive and active automata learning algorithms. This combination yields surrogate DRAs with statistical robustness and equivalence guarantees. As a key application, we use the extracted automata to assess the robustness of neural networks: for a given sequence and distance metric, the DRA either certifies local robustness or produces a concrete counterexample. Experiments on recurrent neural networks and transformer architectures show that our framework reliably learns accurate automata and enables principled robustness evaluation. Overall, our results demonstrate that robust DRA extraction effectively bridges neural network interpretability and formal reasoning without requiring white-box access to the underlying network.
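To make the notion of a deterministic register automaton concrete, here is a minimal sketch assuming the simplest comparison model the abstract mentions: each input datum is tested for equality against the values stored in the registers, and the automaton is deterministic because at most one guard fires per step. All names and the transition encoding below are illustrative assumptions, not details from the paper.

```python
class DRA:
    """Minimal deterministic register automaton over numeric data (illustrative sketch).

    transitions: dict mapping (state, guard) -> (next_state, store_reg), where
    guard is ("eq", i) if the input equals register i, or "fresh" otherwise,
    and store_reg is a register index to overwrite with the input (or None).
    """

    def __init__(self, n_registers, transitions, start, accepting):
        self.n = n_registers
        self.delta = transitions
        self.start = start
        self.accepting = accepting

    def accepts(self, seq):
        state = self.start
        regs = [None] * self.n  # registers start empty
        for x in seq:
            # Determine which guard fires: first matching register, else "fresh".
            guard = "fresh"
            for i, r in enumerate(regs):
                if r == x:
                    guard = ("eq", i)
                    break
            if (state, guard) not in self.delta:
                return False  # no transition defined: reject
            state, store = self.delta[(state, guard)]
            if store is not None:
                regs[store] = x
        return state in self.accepting


# Example: a one-register DRA over real-valued sequences that accepts
# exactly the sequences containing two equal adjacent values -- a language
# no finite automaton over an infinite alphabet can express.
transitions = {
    ("q0", "fresh"): ("q0", 0),       # remember the most recent value
    ("q0", ("eq", 0)): ("q1", None),  # value repeats: move to accepting sink
    ("q1", "fresh"): ("q1", None),
    ("q1", ("eq", 0)): ("q1", None),
}
dra = DRA(1, transitions, "q0", {"q1"})

print(dra.accepts([3.1, 3.1, 7.0]))  # True
print(dra.accepts([1.0, 2.0, 3.0]))  # False
```

Because membership queries like `accepts` are cheap and the guard structure is symbolic, such a surrogate can be queried exhaustively within a distance ball around a sequence, which is the sense in which the extracted DRA can certify local robustness or surface a counterexample.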
Similar Papers
Passive Learning of Lattice Automata from Recurrent Neural Networks
Formal Languages and Automata Theory
Finds hidden patterns in complex data.
A Rigorous Benchmark with Multidimensional Evaluation for Deep Research Agents: From Answers to Reports
Artificial Intelligence
Helps AI agents solve hard problems better.
HyperDAS: Towards Automating Mechanistic Interpretability with Hypernetworks
Computation and Language
Helps understand how computer brains think.