Environment Agnostic Goal-Conditioning, A Study of Reward-Free Autonomous Learning
By: Hampus Åström, Elin Anna Topp, Jacek Malec
Potential Business Impact:
Lets robots learn to reach arbitrary goals without hand-designed reward functions.
In this paper we study how transforming regular reinforcement learning environments into goal-conditioned environments can let agents learn to solve tasks autonomously, without external rewards. We show that an agent can learn to solve tasks by selecting its own goals in an environment-agnostic way, with training times comparable to externally guided reinforcement learning. Our method is independent of the underlying off-policy learning algorithm. Because the method is environment-agnostic, the agent does not value any goal above the others, which leads to instability in performance on individual goals. However, our experiments show that the average goal success rate improves and stabilizes. An agent trained with this method can be instructed to seek any observation made in the environment, enabling generic training of agents prior to specific use cases.
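The core idea above, wrapping a regular environment so that observations seen so far become candidate goals and reward comes from reaching the commanded goal, can be sketched as follows. This is a minimal illustrative sketch, not the authors' implementation: the toy `GridEnv`, the `GoalConditionedWrapper` class, and the uniform goal-sampling rule are all assumptions made for the example.

```python
import random

class GridEnv:
    """Toy 1-D grid: the observation is an integer position; actions move left/right.
    The base environment emits no reward, matching the reward-free setting."""
    def __init__(self, size=10):
        self.size = size
        self.pos = 0

    def reset(self):
        self.pos = 0
        return self.pos

    def step(self, action):  # action in {-1, +1}
        self.pos = max(0, min(self.size - 1, self.pos + action))
        return self.pos, 0.0, False

class GoalConditionedWrapper:
    """Environment-agnostic goal-conditioning (hypothetical sketch):
    every observation ever seen becomes a candidate goal, and the
    self-supervised reward is 1 exactly when the current observation
    matches the commanded goal."""
    def __init__(self, env):
        self.env = env
        self.seen = set()   # all observations encountered so far
        self.goal = None

    def reset(self, goal=None):
        obs = self.env.reset()
        self.seen.add(obs)
        # If no goal is commanded, the agent samples one of its own
        # past observations uniformly (one simple self-selection rule).
        self.goal = goal if goal is not None else random.choice(sorted(self.seen))
        return (obs, self.goal)

    def step(self, action):
        obs, _, done = self.env.step(action)
        self.seen.add(obs)
        reward = 1.0 if obs == self.goal else 0.0  # sparse goal-reaching reward
        return (obs, self.goal), reward, done or reward > 0.0

# Usage: command the goal "position 3" and walk right until it is reached.
env = GoalConditionedWrapper(GridEnv())
(obs, goal) = env.reset(goal=3)
done = False
while not done:
    (obs, goal), reward, done = env.step(+1)
# obs == 3, reward == 1.0
```

Because the wrapper only compares observations and never inspects environment internals, any off-policy learner can be trained on the augmented `(observation, goal)` input, which is what makes the construction environment-agnostic.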
Similar Papers
Why Goal-Conditioned Reinforcement Learning Works: Relation to Dual Control
Machine Learning (CS)
Teaches robots to reach any goal.
Autonomous Learning From Success and Failure: Goal-Conditioned Supervised Learning with Negative Feedback
Machine Learning (CS)
Helps robots learn from mistakes, not just wins.
Self-Supervised Goal-Reaching Results in Multi-Agent Cooperation and Exploration
Machine Learning (CS)
Robots learn to work together to reach goals.