Graph-of-Causal Evolution: Challenging Chain-of-Model for Reasoning
By: Libo Wang
Potential Business Impact:
Helps AI remember long-ago information better.
Each subchain in a chain-of-model (CoM) architecture relies only on information from the previous subchain, so the causal mask blocks global context flow between multi-level subchains and long-range dependencies can be lost. To address this, the work proposes the Graph of Causal Evolution (GoCE). Its core idea is to map implicit token representations into a differentiable, sparse causal adjacency matrix, and then propagate causal constraints through every layer of computation via causal-masked attention and a causal mixture-of-experts (causal-MoE). By combining an intervention-consistency loss with a self-evolution gate, GoCE dynamically balances causal structure learning against adaptive updates to the transformer architecture. The researcher built experimental environments in sandboxes based on Claude Sonnet 4, o4-mini-high, and DeepSeek R1, each running the transformer-variant architecture introduced in GoCE. The method is evaluated on the publicly available CLUTRR, CLADDER, EX-FEVER, and CausalQA datasets and compared against baseline LLMs. The findings show that GoCE strengthens the transformer's ability to capture long-range causal dependencies while also improving its capacity for self-evolution. It not only surpasses CoM in its design principles, but also offers groundwork for future research on causal learning and continuous adaptive improvement.
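To make the mechanism concrete, here is a minimal sketch of the two pieces the abstract names: mapping pairwise token scores into a sparse causal adjacency matrix, and using that adjacency as the mask in attention. This is an illustration under stated assumptions, not the paper's implementation; the function names, the sigmoid-plus-threshold sparsification, and the hard lower-triangular constraint are all assumptions made for this example.

```python
import numpy as np

def sparse_causal_adjacency(logits: np.ndarray, threshold: float = 0.5) -> np.ndarray:
    """Map pairwise token logits to a sparse adjacency in [0, 1].

    A sigmoid keeps the mapping differentiable in a real framework;
    here we hard-threshold for sparsity (an assumption, not the paper's
    scheme) and keep only the lower-triangular part to enforce causal
    direction (token i may only attend to tokens j <= i)."""
    probs = 1.0 / (1.0 + np.exp(-logits))           # differentiable gate
    adj = np.where(probs > threshold, probs, 0.0)   # sparsify
    return np.tril(adj)                             # causal direction only

def causal_masked_attention(q: np.ndarray, k: np.ndarray,
                            v: np.ndarray, adj: np.ndarray) -> np.ndarray:
    """Scaled dot-product attention whose mask is the learned causal
    adjacency instead of a plain lower-triangular mask."""
    d = q.shape[-1]
    scores = q @ k.T / np.sqrt(d)
    scores = np.where(adj > 0, scores, -1e9)        # block non-causal edges
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v

# Toy usage: 4 tokens, 8-dimensional heads.
rng = np.random.default_rng(0)
T, d = 4, 8
q, k, v = rng.normal(size=(3, T, d))
adj = sparse_causal_adjacency(rng.normal(size=(T, T)))
out = causal_masked_attention(q, k, v, adj)
print(out.shape)  # (4, 8)
```

In the full architecture as described, this per-layer masking would be paired with a causal-MoE routing step and trained jointly with the intervention-consistency loss and self-evolution gate, which are beyond the scope of this sketch.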
Similar Papers
Chain-of-Model Learning for Language Model
Computation and Language
Makes computer models learn faster and be different sizes.
Dynamic Reasoning Chains through Depth-Specialized Mixture-of-Experts in Transformer Architectures
Computation and Language
Computers solve problems faster and smarter.
CoE: Chain-of-Explanation via Automatic Visual Concept Circuit Description and Polysemanticity Quantification
CV and Pattern Recognition
Makes AI understand *why* it makes choices.