Measuring Similarity in Causal Graphs: A Framework for Semantic and Structural Analysis
By: Ning-Yuan Georgia Liu, Flower Yang, Mohammad S. Jalali
Potential Business Impact:
Compares AI's ideas about how things work.
Causal graphs are commonly used to understand and model complex systems. Researchers often construct these graphs from different perspectives, leading to significant variations for the same problem. Comparing causal graphs is therefore essential for evaluating assumptions, integrating insights, and resolving disagreements. The rise of AI tools has further amplified this need: they are increasingly used to generate hypothesized causal graphs by synthesizing information from sources such as prior research and community inputs, which offers the potential to automate and scale causal modeling for complex systems. Like human modelers, these tools produce inconsistent results across platforms, versions, and iterations. Despite its importance, research on causal graph comparison remains scarce. Existing methods often focus solely on structural similarity, assume identical variable names, and fail to capture the nuanced semantic relationships that are essential for causal graph comparison. We address these gaps by investigating methods for comparing causal graphs from both semantic and structural perspectives. First, we reviewed over 40 existing metrics and, based on predefined criteria, selected nine for evaluation from two threads of machine learning: four semantic similarity metrics and five learning graph kernels. We discuss the usability of these metrics in simple examples to illustrate their strengths and limitations. We then generated a synthetic dataset of 2,000 causal graphs using generative AI based on a reference diagram. Our findings reveal that each metric captures a different aspect of similarity, highlighting the need to use multiple metrics.
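To illustrate why a purely structural comparison that assumes identical variable names can mislead, here is a minimal sketch. It is not one of the nine metrics the paper evaluates, which the abstract does not name. The edge-set Jaccard score, the word-overlap label similarity (a crude lexical stand-in for a real semantic metric), the 0.5 threshold, and the two toy graphs are all assumptions made for illustration.

```python
def edge_jaccard(edges_a, edges_b):
    """Structural similarity: Jaccard overlap of directed edge sets."""
    a, b = set(edges_a), set(edges_b)
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def label_similarity(x, y):
    """Word-overlap Jaccard between two labels (assumed stand-in for a semantic metric)."""
    tx, ty = set(x.lower().split()), set(y.lower().split())
    return len(tx & ty) / len(tx | ty)


def nodes_of(edges):
    """Sorted list of all variable names that appear in an edge list."""
    return sorted({n for edge in edges for n in edge})


def align_labels(edges_a, edges_b, threshold=0.5):
    """Map each variable in graph A to its most similar variable in graph B."""
    nodes_b = nodes_of(edges_b)
    mapping = {}
    for n in nodes_of(edges_a):
        best = max(nodes_b, key=lambda m: label_similarity(n, m))
        # Keep the original name when nothing in B is similar enough.
        mapping[n] = best if label_similarity(n, best) >= threshold else n
    return mapping


def aligned_edge_jaccard(edges_a, edges_b, threshold=0.5):
    """Rename A's variables to their matches in B, then compare edge sets."""
    mapping = align_labels(edges_a, edges_b, threshold)
    renamed = [(mapping[u], mapping[v]) for u, v in edges_a]
    return edge_jaccard(renamed, edges_b)


# Two hypothetical graphs with the same causal structure but different wording.
graph_a = [("stress level", "sleep quality"), ("sleep quality", "fatigue")]
graph_b = [("level of stress", "quality of sleep"), ("quality of sleep", "fatigue")]

raw_score = edge_jaccard(graph_a, graph_b)              # names differ, so no edge matches
aligned_score = aligned_edge_jaccard(graph_a, graph_b)  # names matched before comparing
print(raw_score, aligned_score)
```

In this toy case the raw structural score reports no overlap even though the two graphs encode the same causal chain, while the score after label alignment reports full overlap. Word overlap would miss true synonyms (for example "tiredness" versus "fatigue"), which is where a real semantic similarity metric would be needed.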
Similar Papers
Shaky Structures: The Wobbly World of Causal Graphs in Software Analytics
Software Engineering
Makes computer programs' "cause maps" unreliable.
Evaluating Knowledge Graph Complexity via Semantic, Spectral, and Structural Metrics for Link Prediction
Machine Learning (CS)
Finds better ways to measure how hard data is.
Causal DAG Summarization (Full Version)
Machine Learning (CS)
Simplifies complex cause-and-effect maps for easier study.