Reinforcement Learning enhanced Online Adaptive Clinical Decision Support via Digital Twin powered Policy and Treatment Effect optimized Reward
By: Xinyu Qin, Ruiheng Yu, Lu Wang
Potential Business Impact:
Helps doctors choose the best medicine for patients.
Clinical decision support must adapt online under safety constraints. We present an online adaptive tool where reinforcement learning provides the policy, a patient digital twin provides the environment, and treatment effect defines the reward. The system initializes a batch-constrained policy from retrospective data and then runs a streaming loop that selects actions, checks safety, and queries experts only when uncertainty is high. Uncertainty comes from a compact ensemble of five Q-networks via the coefficient of variation of action values with a $\tanh$ compression. The digital twin updates the patient state with a bounded residual rule. The outcome model estimates immediate clinical effect, and the reward is the treatment effect relative to a conservative reference with a fixed z-score normalization from the training split. Online updates operate on recent data with short runs and exponential moving averages. A rule-based safety gate enforces vital ranges and contraindications before any action is applied. Experiments in a synthetic clinical simulator show low latency, stable throughput, a low expert query rate at fixed safety, and improved return against standard value-based baselines. The design turns an offline policy into a continuous, clinician-supervised system with clear controls and fast adaptation.
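As an illustration of the uncertainty-gated expert query and the normalized treatment-effect reward described above, here is a minimal sketch. The abstract does not give exact formulas, so everything below is an assumption: the function names, the `eps` stabilizer, the 0.5 query threshold, the choice to measure uncertainty on the greedy action of the ensemble mean, and the exact form of the z-score normalization are all hypothetical.

```python
import math
from statistics import mean, pstdev


def action_uncertainty(q_values, eps=1e-8):
    """Squash the coefficient of variation of ensemble Q-values into [0, 1).

    q_values: one Q estimate per ensemble member for a single action.
    eps is an assumed stabilizer to avoid division by zero.
    """
    mu = mean(q_values)
    sigma = pstdev(q_values)          # population std across ensemble members
    cv = sigma / (abs(mu) + eps)      # coefficient of variation
    return math.tanh(cv)              # tanh compression


def select_action(q_ensemble, threshold=0.5):
    """Pick the greedy action under the ensemble mean and flag an expert query.

    q_ensemble: list indexed [member][action], e.g. five members.
    threshold: assumed cutoff above which the expert is consulted.
    Returns (action_index, uncertainty, query_expert).
    """
    n_actions = len(q_ensemble[0])
    mean_q = [mean(member[a] for member in q_ensemble) for a in range(n_actions)]
    best = max(range(n_actions), key=lambda a: mean_q[a])
    u = action_uncertainty([member[best] for member in q_ensemble])
    return best, u, u > threshold


def normalized_reward(effect, reference_effect, train_mean, train_std):
    """Treatment effect relative to a reference, z-scored with fixed statistics.

    train_mean and train_std are assumed to be computed once on the training
    split and then held constant during online operation.
    """
    return ((effect - reference_effect) - train_mean) / train_std


if __name__ == "__main__":
    agree = [[1.0, 2.0] for _ in range(5)]
    disagree = [[0.0, 4.0], [0.0, -1.0], [0.0, 5.0], [0.0, 0.5], [0.0, 3.0]]
    print(select_action(agree))     # members agree: no expert query
    print(select_action(disagree))  # members disagree: expert query flagged
```

Because the $\tanh$ compression bounds the score below 1, a single fixed threshold can be applied regardless of the scale of the Q-values; in a real system the rule-based safety gate would still run on the selected action before it is applied.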
Similar Papers
Reinforcement Learning for Target Zone Blood Glucose Control
Machine Learning (CS)
Helps diabetes machines give better insulin doses.
Beyond Prediction: Reinforcement Learning as the Defining Leap in Healthcare AI
Machine Learning (CS)
AI learns to make the best medical choices.
Efficient Model-Based Reinforcement Learning for Robot Control via Online Learning
Robotics
Teaches robots to learn by doing, faster.