Teaching Language Models Mechanistic Explainability Through Arrow-Pushing
By: Théo A. Neukomm, Zlatko Jončev, Philippe Schwaller
Chemical reaction mechanisms provide crucial insight into synthesizability, yet current Computer-Assisted Synthesis Planning (CASP) systems lack mechanistic grounding. We introduce a computational framework for teaching language models to predict chemical reaction mechanisms through the arrow-pushing formalism, a century-old notation that tracks electron flow while respecting conservation laws. We developed MechSMILES, a compact textual format encoding molecular structure and electron flow, and trained language models on four mechanism prediction tasks of increasing complexity using mechanistic reaction datasets such as mech-USPTO-31k and FlowER. Our models achieve more than 95% top-3 accuracy on elementary step prediction. On our hardest task, the retrieval of complete reaction mechanisms, they score above 73% on mech-USPTO-31k and above 93% on the FlowER dataset. This mechanistic understanding enables three key applications. First, our models serve as post-hoc validators for CASP systems, filtering out chemically implausible transformations. Second, they enable holistic atom-to-atom mapping that tracks all atoms, including hydrogens. Third, they extract catalyst-aware reaction templates that distinguish recycled catalysts from spectator species. By grounding predictions in physically meaningful electron moves that ensure conservation of mass and charge, this work provides a pathway toward more explainable and chemically valid computational synthesis planning, and it offers an architecture-agnostic framework for benchmarking mechanism prediction.
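To make the conservation constraint concrete, the following is a minimal sketch of a mass-and-charge balance check for a single elementary step. It is not the authors' method and does not use MechSMILES: the species representation (a dictionary of atom counts paired with a net charge) and the function names `totals` and `is_balanced` are assumptions chosen for illustration only.

```python
from collections import Counter

def totals(species):
    """Sum atom counts and net charge over a list of (atom_counts, charge) pairs.

    The (atom_counts, charge) representation is a hypothetical toy format,
    not the MechSMILES encoding described in the abstract.
    """
    atoms = Counter()
    charge = 0
    for atom_counts, net_charge in species:
        atoms.update(atom_counts)
        charge += net_charge
    return atoms, charge

def is_balanced(reactants, products):
    """Return True when both sides carry the same atoms and the same net charge."""
    return totals(reactants) == totals(products)

# Illustrative SN2 step: hydroxide attacks bromomethane, bromide leaves.
hydroxide = ({"O": 1, "H": 1}, -1)
bromomethane = ({"C": 1, "H": 3, "Br": 1}, 0)
methanol = ({"C": 1, "H": 4, "O": 1}, 0)
bromide = ({"Br": 1}, -1)

sn2_ok = is_balanced([hydroxide, bromomethane], [methanol, bromide])

# Dropping the leaving group violates both mass and charge conservation.
missing_leaving_group = is_balanced([hydroxide, bromomethane], [methanol])
```

A check like this only catches bookkeeping violations. It says nothing about whether a balanced step is mechanistically plausible, which is the harder question the trained models address.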