JEPA for RL: Investigating Joint-Embedding Predictive Architectures for Reinforcement Learning

Published: April 23, 2025 | arXiv ID: 2504.16591v1

By: Tristan Kenneweg, Philip Kenneweg, Barbara Hammer

Potential Business Impact:

Teaches robots to learn from watching.

Business Areas:

Image Recognition, Data and Analytics, Software

Joint-Embedding Predictive Architectures (JEPA) have recently become popular as promising architectures for self-supervised learning. Vision transformers have been trained using JEPA to produce embeddings from images and videos, which have been shown to be highly suitable for downstream tasks like classification and segmentation. In this paper, we show how to adapt the JEPA architecture to reinforcement learning from images. We discuss model collapse, show how to prevent it, and provide exemplary data on the classical Cart Pole task.
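
To make the collapse problem mentioned in the abstract concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes a linear encoder, an identity predictor, and random stand-in observations; the names `encode`, `prediction_loss`, `mean_embedding_std`, and `ema_update` are hypothetical. It shows that an encoder mapping every observation to the same vector reaches zero prediction loss while carrying no information, which is why the per-dimension spread of the embeddings is worth monitoring. The exponential-moving-average target update is a common choice in JEPA-style training and is included only as an illustration; the abstract does not say which collapse-prevention method the paper uses.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-ins for flattened image observations at steps t and t+1.
obs_t = rng.normal(size=(32, 16))
obs_next = rng.normal(size=(32, 16))


def encode(obs, weights):
    """Assumed linear encoder: observations -> embeddings."""
    return obs @ weights


def prediction_loss(predicted, target):
    """Mean squared error between predicted and target embeddings."""
    return float(np.mean((predicted - target) ** 2))


def mean_embedding_std(embeddings):
    """Average per-dimension standard deviation over the batch.

    A value near zero means all inputs map to (almost) the same vector,
    which is the signature of representation collapse.
    """
    return float(np.mean(np.std(embeddings, axis=0)))


def ema_update(target_weights, online_weights, tau=0.99):
    """Move the target encoder slowly toward the online encoder."""
    return tau * target_weights + (1.0 - tau) * online_weights


# A collapsed encoder: every observation maps to the zero vector.
w_collapsed = np.zeros((16, 4))
collapsed_loss = prediction_loss(
    encode(obs_t, w_collapsed),      # identity predictor on z_t
    encode(obs_next, w_collapsed),   # target embedding z_{t+1}
)
collapsed_std = mean_embedding_std(encode(obs_next, w_collapsed))

# A non-collapsed encoder keeps spread across the batch.
w_online = rng.normal(size=(16, 4))
healthy_std = mean_embedding_std(encode(obs_next, w_online))

# One EMA step for a target encoder that starts at zero.
w_target = ema_update(np.zeros((16, 4)), w_online, tau=0.99)

print(collapsed_loss, collapsed_std, healthy_std)
```

The collapsed encoder scores a perfect loss, so the loss alone cannot reveal the failure; tracking a statistic such as `mean_embedding_std` alongside it is one simple way to notice collapse during training.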

SparseJEPA: Sparse Representation Learning of Joint Embedding Predictive Architectures

Machine Learning (CS)

Makes AI understand pictures better and more clearly.

22 Apr 2025

Self-Supervised Representation Learning with Joint Embedding Predictive Architecture for Automotive LiDAR Object Detection

Robotics

Helps self-driving cars see better with less power.

9 Jan 2025

VL-JEPA: Joint Embedding Predictive Architecture for Vision-language

CV and Pattern Recognition

Helps computers understand pictures and words better.

11 Dec 2025

Page Count

6 pages