Generative Sequential Notification Optimization via Multi-Objective Decision Transformers
By: Borja Ocejo, Ruofan Wang, Ke Liu and more
Potential Business Impact:
Shows people better messages, not annoying ones.
Notifications are an important communication channel for delivering timely and relevant information. Optimizing their delivery involves addressing complex sequential decision-making challenges under constraints such as message utility and user fatigue. Offline reinforcement learning (RL) methods, such as Conservative Q-Learning (CQL), have been applied to this problem but face practical challenges at scale, including instability, sensitivity to distribution shifts, limited reproducibility, and difficulties with explainability in high-dimensional recommendation settings. We present a Decision Transformer (DT)-based framework that reframes policy learning as return-conditioned supervised learning, improving robustness, scalability, and modeling flexibility. Our contributions include a real-world comparison with CQL, a multi-reward design suitable for non-episodic tasks, a quantile regression approach to return-to-go conditioning, and a production-ready system with circular buffer-based sequence processing for near-real-time inference. Extensive offline and online experiments in a deployed notification system show that our approach improves notification utility and overall session activity while minimizing user fatigue. Compared to a multi-objective CQL-based agent, the DT-based approach achieved a 0.72% increase in sessions in notification decision-making at LinkedIn by making notification recommendations more relevant.
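To make three of the abstract's ingredients concrete, the following is a minimal sketch, not the authors' implementation. It assumes a windowed return-to-go (a finite look-ahead sum, one plausible way to handle a non-episodic stream), the standard pinball loss used in quantile regression, and a fixed-length buffer built on Python's `collections.deque`. The names `returns_to_go`, `pinball_loss`, `SequenceBuffer`, `horizon`, and `context_len` are hypothetical and chosen only for illustration.

```python
from collections import deque


def returns_to_go(rewards, horizon):
    """Windowed return-to-go: sum of rewards from step t over the next
    `horizon` steps (inclusive of t). The finite window is an assumption
    made here because a notification stream has no natural episode end."""
    return [sum(rewards[t:t + horizon]) for t in range(len(rewards))]


def pinball_loss(y_true, y_pred, q):
    """Standard quantile (pinball) loss for a single prediction.
    Under-prediction is weighted by q, over-prediction by (1 - q)."""
    diff = y_true - y_pred
    return max(q * diff, (q - 1.0) * diff)


class SequenceBuffer:
    """Fixed-length circular buffer of (state, action, return_to_go) steps.
    Once `context_len` steps are stored, appending drops the oldest step."""

    def __init__(self, context_len):
        self._steps = deque(maxlen=context_len)

    def append(self, state, action, rtg):
        self._steps.append((state, action, rtg))

    def context(self):
        """Return the buffered steps, oldest first, as model input."""
        return list(self._steps)
```

In a sketch like this, a high quantile of the predicted return-to-go could serve as the conditioning target at inference time, and the buffer keeps only the most recent steps per user so that the model input stays bounded. Whether the deployed system works this way is not stated in the abstract.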
Similar Papers
A Comparison Between Decision Transformers and Traditional Offline Reinforcement Learning Algorithms
Machine Learning (CS)
Lets computers learn from past actions better.
Online Finetuning Decision Transformers with Pure RL Gradients
Machine Learning (CS)
Teaches AI to learn from its own actions.
Robust Adversarial Reinforcement Learning in Stochastic Games via Sequence Modeling
Machine Learning (CS)
Makes smart robots safer from tricky challenges.