LLM-based Agents for Automated Confounder Discovery and Subgroup Analysis in Causal Inference

Published: August 10, 2025 | arXiv ID: 2508.07221v1

By: Po-Han Lee, Yu-Cheng Lin, Chan-Tung Ku and more

Potential Business Impact:

Helps doctors find best treatments by finding hidden causes.

Estimating individualized treatment effects from observational data is a persistent challenge because of unmeasured confounding and structural bias. Causal machine learning (causal ML) methods, such as causal trees and doubly robust estimators, provide tools for estimating conditional average treatment effects. However, these methods have limited effectiveness in complex real-world settings where confounders are latent or are described only in unstructured formats. Moreover, relying on domain experts to identify confounders and interpret rules introduces high annotation costs and scalability concerns. In this work, we propose Large Language Model (LLM)-based agents for automated confounder discovery and subgroup analysis, integrating the agents into the causal ML pipeline to simulate domain expertise. Our framework systematically performs subgroup identification and confounding structure discovery by leveraging the reasoning capabilities of LLM-based agents, which reduces human dependency while preserving interpretability. Experiments on real-world medical datasets show that the proposed approach makes treatment effect estimation more robust by narrowing confidence intervals and uncovering previously unrecognized confounding biases. Our findings suggest that LLM-based agents offer a promising path toward scalable, trustworthy, and semantically aware causal inference.
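To make the abstract's reference to doubly robust estimators concrete, here is a minimal sketch of the textbook augmented inverse probability weighting (AIPW) estimator of an average treatment effect. It is not the paper's agent framework: the function names `aipw_ate` and `naive_difference`, the single discrete confounder `x`, and the toy data are all assumptions made for illustration, and the nuisance models are simple per-stratum averages.

```python
from collections import defaultdict


def naive_difference(rows):
    """Unadjusted difference in mean outcome, treated minus control."""
    treated = [y for _, t, y in rows if t == 1]
    control = [y for _, t, y in rows if t == 0]
    return sum(treated) / len(treated) - sum(control) / len(control)


def aipw_ate(rows):
    """AIPW estimate of the average treatment effect.

    rows: list of (x, t, y) with a discrete confounder x, a binary
    treatment t (0 or 1), and a numeric outcome y. Assumes every
    stratum of x contains both treated and control units (overlap).
    """
    n_x = defaultdict(int)        # units per stratum
    n_xt = defaultdict(int)       # units per (stratum, treatment) cell
    sum_xt = defaultdict(float)   # outcome sum per cell
    for x, t, y in rows:
        n_x[x] += 1
        n_xt[(x, t)] += 1
        sum_xt[(x, t)] += y

    total = 0.0
    for x, t, y in rows:
        e = n_xt[(x, 1)] / n_x[x]                 # propensity estimate e(x)
        mu1 = sum_xt[(x, 1)] / n_xt[(x, 1)]       # outcome model, treated
        mu0 = sum_xt[(x, 0)] / n_xt[(x, 0)]       # outcome model, control
        # Outcome-model contrast plus inverse-probability-weighted residuals.
        total += (mu1 - mu0) + t * (y - mu1) / e - (1 - t) * (y - mu0) / (1 - e)
    return total / len(rows)


# Hypothetical toy data: x raises both the chance of treatment and the outcome.
rows = [
    (0, 1, 3.0), (0, 0, 1.0), (0, 0, 1.0), (0, 0, 1.0),
    (1, 1, 6.0), (1, 1, 6.0), (1, 1, 6.0), (1, 0, 3.0),
]
naive = naive_difference(rows)   # 3.75, inflated by confounding through x
adjusted = aipw_ate(rows)        # 2.5, the stratum-weighted effect
```

In this toy example the adjustment only works because `x` was recorded. When the relevant confounder is latent or buried in unstructured notes, it never enters the nuisance models, which is the gap the abstract says the agents are meant to address.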

Using LLMs to Directly Guess Conditional Expectations Can Improve Efficiency in Causal Estimation

Machine Learning (CS)

AI guesses help find what causes things better.

9 Oct 2025

90%

Technical Report: Facilitating the Adoption of Causal Inference Methods Through LLM-Empowered Co-Pilot

Machine Learning (CS)

Helps doctors find best treatments from patient data.

14 Aug 2025

90%

Latent Variable Modeling for Robust Causal Effect Estimation

Machine Learning (CS)

Find hidden causes of effects in data.

27 Aug 2025

Country of Origin

🇹🇼 Taiwan, Province of China

Page Count

6 pages
