Score: 2

Learning to Generate Pointing Gestures in Situated Embodied Conversational Agents

Published: September 15, 2025 | arXiv ID: 2509.12507v1

By: Anna Deichler, Siyang Wang, Simon Alexanderson, and more

Potential Business Impact:

Robots learn to produce natural, accurate pointing gestures during communication.

Business Areas:
Motion Capture, Media and Entertainment, Video

One of the main goals of robotics and intelligent agent research is to enable natural communication with humans in physically situated settings. While recent work has focused on verbal modes such as language and speech, non-verbal communication is crucial for flexible interaction. We present a framework for generating pointing gestures in embodied agents by combining imitation and reinforcement learning. Using a small motion capture dataset, our method learns a motor control policy that produces physically valid, naturalistic gestures with high referential accuracy. We evaluate the approach against supervised learning and retrieval baselines in both objective metrics and a virtual reality referential game with human users. Results show that our system achieves higher naturalness and accuracy than state-of-the-art supervised models, highlighting the promise of imitation-RL for communicative gesture generation and its potential application to robots.
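The abstract describes combining imitation learning with reinforcement learning so that a motor control policy stays close to motion-capture data while still being rewarded for referential accuracy. A common way to realize this combination (in DeepMimic-style work, not necessarily the authors' exact formulation) is a shaped reward that mixes an imitation term with a task term. The sketch below is a hypothetical illustration; the weights, pose representation, and `combined_reward` function are assumptions, not taken from the paper.

```python
import numpy as np

def combined_reward(joint_angles, ref_angles, fingertip_dir, target_dir,
                    w_imitate=0.5, w_task=0.5):
    """Hypothetical reward shaping for imitation-RL gesture generation.

    - Imitation term: exponential of negative pose error w.r.t. a
      motion-capture reference frame (keeps gestures naturalistic).
    - Task term: cosine alignment between the fingertip pointing ray
      and the direction to the referent (keeps gestures accurate).
    """
    # Pose-tracking term in [0, 1]; 1 when the policy matches the mocap pose.
    imitation = np.exp(-np.sum((joint_angles - ref_angles) ** 2))
    # Directional alignment in [-1, 1]; 1 when pointing exactly at the target.
    task = float(np.dot(fingertip_dir, target_dir)
                 / (np.linalg.norm(fingertip_dir) * np.linalg.norm(target_dir)))
    return w_imitate * imitation + w_task * task

# Perfect pose match and perfect alignment yield the maximum reward of 1.0.
r = combined_reward(np.zeros(3), np.zeros(3),
                    np.array([1.0, 0.0, 0.0]), np.array([1.0, 0.0, 0.0]))
```

In such a scheme the imitation term can be annealed or reweighted during training, trading naturalness against referential accuracy; the paper's evaluation against supervised and retrieval baselines targets exactly this trade-off.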

Country of Origin
πŸ‡ΈπŸ‡ͺ Sweden

Repos / Data Links

Page Count
25 pages

Category
Computer Science:
Robotics