Reinforcement Learning Methods for Neighborhood Selection in Local Search
By: Yannick Molinghen, Augustin Delecluse, Renaud De Landtsheer, et al.
Reinforcement learning has recently gained traction as a means to improve combinatorial optimization methods, yet its effectiveness within local search metaheuristics specifically remains comparatively underexamined. In this study, we evaluate a range of reinforcement learning-based neighborhood selection strategies -- multi-armed bandits (upper confidence bound, $ε$-greedy) and deep reinforcement learning methods (proximal policy optimization, double deep $Q$-network) -- and compare them against multiple baselines across three different problems: the traveling salesman problem, the pickup and delivery problem with time windows, and the car sequencing problem. We show how search-specific characteristics, particularly large variations in cost due to constraint violation penalties, necessitate carefully designed reward functions to provide stable and informative learning signals. Our extensive experiments reveal that algorithm performance varies substantially across problems, although $ε$-greedy consistently ranks among the best performers. In contrast, the computational overhead of deep reinforcement learning approaches makes them competitive only when given substantially longer runtimes. These findings highlight both the promise and the practical limitations of deep reinforcement learning in local search.
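To make the bandit-based neighborhood selection concrete, the sketch below shows an $ε$-greedy selector embedded in a simple local search loop. This is a minimal illustration, not the paper's implementation: the abstract does not specify the neighborhoods, the acceptance rule, or the reward function, so the `neighborhoods` and `cost` callables and the 0/1 improvement reward used here are assumptions chosen only to show the mechanism. The binary reward also hints at why reward design matters: raw cost deltas can swing by orders of magnitude when constraint violation penalties kick in.

```python
import random

def epsilon_greedy_local_search(initial_solution, neighborhoods, cost,
                                epsilon=0.1, max_iters=10_000):
    """Local search in which the neighborhood applied at each step is chosen
    by an epsilon-greedy bandit over running average rewards.

    `neighborhoods` is a list of functions mapping a solution to a candidate
    neighbor; `cost` evaluates a solution (lower is better). Both are
    problem-specific and supplied by the caller.
    """
    solution, best_cost = initial_solution, cost(initial_solution)
    counts = [0] * len(neighborhoods)    # times each neighborhood was selected
    values = [0.0] * len(neighborhoods)  # running average reward per neighborhood

    for _ in range(max_iters):
        # Epsilon-greedy selection: explore a random neighborhood with
        # probability epsilon, otherwise exploit the highest average reward.
        if random.random() < epsilon:
            arm = random.randrange(len(neighborhoods))
        else:
            arm = max(range(len(neighborhoods)), key=lambda i: values[i])

        candidate = neighborhoods[arm](solution)
        candidate_cost = cost(candidate)

        # Illustrative reward: 1 if the move improved the incumbent, else 0.
        # Using the raw cost delta instead would expose the learner to huge
        # spikes whenever constraint violation penalties enter the cost.
        reward = 1.0 if candidate_cost < best_cost else 0.0

        counts[arm] += 1
        values[arm] += (reward - values[arm]) / counts[arm]  # incremental mean

        if candidate_cost < best_cost:  # accept improving moves only
            solution, best_cost = candidate, candidate_cost

    return solution, best_cost
```

Swapping the selection rule (e.g. for an upper-confidence-bound score) changes only the `arm` computation; the surrounding search loop stays the same, which is what makes bandit selectors cheap to integrate compared with the deep reinforcement learning policies discussed above.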
Similar Papers
An exploration for higher efficiency in multi objective optimisation with reinforcement learning
Artificial Intelligence
Teaches computers to solve hard problems better.
Heuristics for Combinatorial Optimization via Value-based Reinforcement Learning: A Unified Framework and Analysis
Machine Learning (Stat)
Helps computers solve hard puzzles faster and better.
Random is Faster than Systematic in Multi-Objective Local Search
Neural and Evolutionary Computing
Randomly checking options finds better solutions faster.