VAGPO: Vision-augmented Asymmetric Group Preference Optimization for the Routing Problems
By: Shiyan Liu, Bohan Tan, Yan Jin
Potential Business Impact:
Solves huge delivery routes fast without retraining
Routing problems such as the Traveling Salesman Problem (TSP) and the Capacitated Vehicle Routing Problem (CVRP) are well-known combinatorial optimization challenges with broad practical relevance. Recent data-driven optimization methods have made significant progress, yet they often face limitations in training efficiency and generalization to large-scale instances. In this paper, we propose a novel Vision-Augmented Asymmetric Group Preference Optimization (VAGPO) approach for solving routing problems. By leveraging ResNet-based visual encoding and Transformer-based sequential modeling, VAGPO captures both spatial structure and temporal dependencies. Furthermore, we introduce an asymmetric group preference optimization strategy that significantly accelerates convergence compared to commonly used policy gradient methods. Experimental results on TSP and CVRP benchmarks show that VAGPO not only achieves highly competitive solution quality but also generalizes well to larger instances (up to 1000 nodes) without retraining, highlighting its effectiveness in both learning efficiency and scalability.
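The abstract does not spell out the visual encoding or the asymmetric preference objective, so the following is only a minimal sketch of the two generic ingredients it names: rendering a routing instance as an image-like grid, and scoring a group of sampled tours relative to each other. The function names (`tour_length`, `group_relative_advantages`, `rasterize`), the grid size, and the symmetric standardization are all assumptions made for illustration. This is not the authors' method and does not reproduce VAGPO's asymmetric weighting.

```python
import math
import random
from statistics import mean, pstdev


def tour_length(coords, tour):
    """Total Euclidean length of a closed tour that visits every node once."""
    n = len(tour)
    return sum(
        math.dist(coords[tour[i]], coords[tour[(i + 1) % n]])
        for i in range(n)
    )


def group_relative_advantages(lengths, eps=1e-8):
    """Standardize tour lengths within a group (assumed, symmetric scheme).

    Shorter tours receive positive scores, longer tours negative ones.
    """
    mu = mean(lengths)
    sigma = pstdev(lengths)
    return [(mu - length) / (sigma + eps) for length in lengths]


def rasterize(coords, size=8):
    """Render unit-square coordinates as a size x size occupancy grid.

    Hypothetical stand-in for the image a visual encoder might consume.
    """
    grid = [[0] * size for _ in range(size)]
    for x, y in coords:
        col = min(int(x * size), size - 1)
        row = min(int(y * size), size - 1)
        grid[row][col] = 1
    return grid


# Toy TSP instance: 10 random nodes and a group of 4 random tours.
rng = random.Random(0)
coords = [(rng.random(), rng.random()) for _ in range(10)]
group = [rng.sample(range(10), 10) for _ in range(4)]

lengths = [tour_length(coords, t) for t in group]
advantages = group_relative_advantages(lengths)
image = rasterize(coords)
```

In a preference-style training loop, scores like `advantages` would weight the log-probabilities of the sampled tours; how VAGPO treats better and worse tours asymmetrically is described in the paper itself, not here.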