Reinforcement learning with timed constraints for robotics motion planning
By: Zhaoan Wang, Junchao Li, Mahdi Mohammad and more
Potential Business Impact:
Robots learn to finish tasks on time, even when things change.
Robotic systems operating in dynamic and uncertain environments increasingly require planners that satisfy complex task sequences while adhering to strict temporal constraints. Metric Interval Temporal Logic (MITL) offers a formal and expressive framework for specifying such time-bounded requirements; however, integrating MITL with reinforcement learning (RL) remains challenging due to stochastic dynamics and partial observability. This paper presents a unified automata-based RL framework for synthesizing policies in both Markov Decision Processes (MDPs) and Partially Observable Markov Decision Processes (POMDPs) under MITL specifications. MITL formulas are translated into Timed Limit-Deterministic Generalized Büchi Automata (Timed-LDGBA) and synchronized with the underlying decision process to construct product timed models suitable for Q-learning. A simple yet expressive reward structure enforces temporal correctness while allowing additional performance objectives. The approach is validated in three simulation studies: a $5 \times 5$ grid-world formulated as an MDP, a $10 \times 10$ grid-world formulated as a POMDP, and an office-like service-robot scenario. Results demonstrate that the proposed framework consistently learns policies that satisfy strict time-bounded requirements under stochastic transitions, scales to larger state spaces, and remains effective in partially observable environments, highlighting its potential for reliable robotic planning in time-critical and uncertain settings.
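The pipeline described in the abstract (timed automaton, product with the decision process, Q-learning on the product) can be illustrated with a deliberately simplified sketch. It is not the authors' Timed-LDGBA construction. It assumes discrete unit time steps and a single hand-written bounded-reach requirement, "reach the goal within 12 steps", encoded as a three-location automaton with one clock. The grid size matches the first case study, but the start cell, goal cell, deadline, slip probability, reward values, hyperparameters, and all function names are hypothetical choices made for illustration.

```python
import random

# All constants below are illustrative assumptions, not values from the paper.
SIZE = 5                      # 5 x 5 grid-world
START, GOAL = (0, 0), (4, 4)
DEADLINE = 12                 # "eventually within [0, 12] steps, reach GOAL"
SLIP = 0.1                    # chance that the chosen action is replaced at random
ACTIONS = [(0, 1), (0, -1), (1, 0), (-1, 0)]


def env_step(pos, action, rng):
    """Stochastic grid transition; moves into a wall leave the robot in place."""
    if rng.random() < SLIP:
        action = rng.choice(ACTIONS)
    x = min(max(pos[0] + action[0], 0), SIZE - 1)
    y = min(max(pos[1] + action[1], 0), SIZE - 1)
    return (x, y)


def automaton_step(q, clock, pos):
    """Hand-written timed automaton with locations 'wait', 'acc', 'rej'."""
    if q != "wait":           # accepting and rejecting locations are absorbing
        return q, clock
    clock += 1                # one time unit elapses per environment step
    if pos == GOAL:
        return "acc", clock   # goal reached while clock <= DEADLINE
    if clock >= DEADLINE:
        return "rej", clock   # time bound violated
    return "wait", clock


REWARD = {"acc": 1.0, "rej": -1.0, "wait": 0.0}


def greedy(values):
    """Index of the highest-valued action (first index wins ties)."""
    return max(range(len(ACTIONS)), key=values.__getitem__)


def train(episodes=20000, alpha=0.1, gamma=0.99, eps=0.2, seed=0):
    """Tabular Q-learning over product states (position, location, clock)."""
    rng = random.Random(seed)
    Q = {}
    for _ in range(episodes):
        pos, q, clock = START, "wait", 0
        while q == "wait":
            values = Q.setdefault((pos, q, clock), [0.0] * len(ACTIONS))
            a = rng.randrange(len(ACTIONS)) if rng.random() < eps else greedy(values)
            pos = env_step(pos, ACTIONS[a], rng)
            q, clock = automaton_step(q, clock, pos)
            if q == "wait":
                nxt = max(Q.setdefault((pos, q, clock), [0.0] * len(ACTIONS)))
            else:
                nxt = 0.0     # terminal product state
            values[a] += alpha * (REWARD[q] + gamma * nxt - values[a])
    return Q


def success_rate(Q, trials=1000, seed=1):
    """Fraction of greedy rollouts that end in the accepting location."""
    rng = random.Random(seed)
    wins = 0
    for _ in range(trials):
        pos, q, clock = START, "wait", 0
        while q == "wait":
            values = Q.get((pos, q, clock), [0.0] * len(ACTIONS))
            pos = env_step(pos, ACTIONS[greedy(values)], rng)
            q, clock = automaton_step(q, clock, pos)
        wins += q == "acc"
    return wins / trials


if __name__ == "__main__":
    print("greedy success rate:", success_rate(train()))
```

Because the clock value is part of the product state, the learned policy can depend on how much time remains, which a policy over grid positions alone could not do. The terminal rewards of +1 and -1 stand in for the paper's reward structure, whose exact form is not given in the abstract.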