Markov Decision Processing Networks
By: Sanidhay Bhambay, Thirupathaiah Vasantam, Neil Walton
Potential Business Impact:
Helps systems decide how to serve customers faster.
We introduce Markov Decision Processing Networks (MDPNs), a multiclass queueing network model in which service is a controlled, finite-state Markov process. The model exhibits a decision-dependent service process: actions taken now influence future service availability. Viewed as a two-sided queueing model, the MDPN captures settings such as assemble-to-order systems, ride-hailing platforms, cross-skilled call centers, and quantum switches. We first characterize the capacity region of MDPNs. Unlike that of classical switched networks, the MDPN capacity region depends on the long-run mix of service states induced by the control of the underlying service process. We show, via a counterexample, that MaxWeight is not throughput-optimal for this class, demonstrating the distinction between MDPNs and classical queueing models. To bridge this gap, we design a weighted average-reward policy, formulated as a multiobjective MDP that exploits a two-timescale separation at the fluid scale. We prove throughput-optimality of the resulting policy. The techniques yield a clear description of the capacity region and apply to a broad family of two-sided matching systems.
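For readers unfamiliar with the baseline, here is a minimal Python sketch of the classical MaxWeight rule that the abstract contrasts with MDPNs. The function and variable names (maxweight_schedule, queues, feasible_schedules) are our own illustration, not from the paper.

```python
# Minimal sketch of the classical MaxWeight scheduling rule.
# All names here are illustrative, not taken from the paper.

def maxweight_schedule(queues, feasible_schedules):
    """Return the feasible service vector sigma maximizing sum_i q_i * sigma_i.

    queues: queue lengths q_i, one per customer class.
    feasible_schedules: service vectors the network can run this slot.

    In an MDPN, the set of available schedules itself depends on a
    controlled Markov service state, so this myopic rule can steer that
    state badly and lose throughput, per the paper's counterexample.
    """
    return max(feasible_schedules,
               key=lambda sigma: sum(q * s for q, s in zip(queues, sigma)))


# Tiny usage example: one server, two queues, serve at most one per slot.
print(maxweight_schedule([5, 3], [(1, 0), (0, 1), (0, 0)]))  # -> (1, 0)
```

In a classical switched network, this greedy queue-weighted choice is throughput-optimal; the paper's point is that once the schedule chosen today changes which schedules are available tomorrow, a purely myopic rule like this one no longer suffices.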
Similar Papers
A QoS Framework for Service Provision in Multi-Infrastructure-Sharing Networks
Networking and Internet Architecture
Makes the internet faster and more reliable.
Model-Based Reinforcement Learning in Discrete-Action Non-Markovian Reward Decision Processes
Machine Learning (CS)
Teaches computers to learn from past events.