Few-Shot Inference of Human Perceptions of Robot Performance in Social Navigation Scenarios
By: Qiping Zhang, Nathan Tsoi, Mofeed Nagib, and more
Understanding how humans evaluate robot behavior during human-robot interactions is crucial for developing socially aware robots that behave according to human expectations. While the traditional approach to capturing these evaluations is to conduct a user study, recent work has proposed utilizing machine learning instead. However, existing data-driven methods require large amounts of labeled data, which limits their use in practice. To address this gap, we propose leveraging the few-shot learning capabilities of Large Language Models (LLMs) to improve how well a robot can predict a user's perception of its performance, and we study this idea experimentally in social navigation tasks. To this end, we extend the SEAN TOGETHER dataset with additional real-world human-robot navigation episodes and participant feedback. Using this augmented dataset, we evaluate the ability of several LLMs to predict human perceptions of robot performance from a small number of in-context examples, based on observed spatio-temporal cues of the robot and surrounding human motion. Our results demonstrate that LLMs can match or exceed the performance of traditional supervised learning models while requiring an order of magnitude fewer labeled instances. We further show that prediction performance can improve with additional in-context examples, confirming that our approach scales with the amount of available feedback. Additionally, we investigate what kind of sensor-based information an LLM relies on to make these inferences by conducting an ablation study on the input features considered for performance prediction. Finally, we explore the novel application of personalized in-context examples, i.e., examples drawn from the same user whose perception is being predicted, and find that they further enhance prediction accuracy. This work paves the way toward improving robot behavior in a scalable manner through user-centered feedback.
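To make the in-context learning setup described above more concrete, the sketch below shows one way a few labeled navigation episodes could be assembled into a few-shot prompt that asks an LLM to rate a new episode. This is a minimal illustration only: the feature names (robot speed, closest pedestrian distance, crowd size), the 1-to-5 rating scale, and the episode format are assumptions, not the authors' actual features, prompt, or model interface, and the call to an LLM endpoint is left out.

```python
# Minimal sketch (not the authors' code): build a few-shot prompt that asks an
# LLM to predict a human rating of robot navigation from coarse spatio-temporal
# features. Feature names and the 1-5 rating scale are illustrative assumptions.

from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Episode:
    """One navigation episode summarized as coarse spatio-temporal features."""
    robot_speed: float          # mean robot speed in m/s (assumed feature)
    min_person_distance: float  # closest distance to any pedestrian in m (assumed feature)
    num_people: int             # number of nearby pedestrians (assumed feature)
    rating: Optional[int] = None  # human rating on a 1-5 scale; None for the query episode


def format_episode(ep: Episode) -> str:
    """Render an episode as a compact text line for the prompt."""
    line = (f"robot_speed={ep.robot_speed:.2f} m/s, "
            f"min_person_distance={ep.min_person_distance:.2f} m, "
            f"num_people={ep.num_people}")
    if ep.rating is not None:
        line += f" -> rating: {ep.rating}"
    return line


def build_few_shot_prompt(examples: List[Episode], query: Episode) -> str:
    """Concatenate labeled in-context examples followed by the unlabeled query."""
    header = ("You rate how well a robot navigated around people, "
              "on a scale from 1 (very poor) to 5 (very good).\n"
              "Examples:\n")
    shots = "\n".join(format_episode(ep) for ep in examples)
    question = ("\nNow rate this episode. Answer with a single integer.\n"
                + format_episode(query) + " -> rating:")
    return header + shots + question


if __name__ == "__main__":
    # A handful of labeled episodes serve as in-context examples; in the
    # personalized variant mentioned in the abstract, these would be drawn
    # from the same user whose perception is being predicted.
    shots = [
        Episode(0.45, 1.8, 2, rating=4),
        Episode(0.90, 0.4, 3, rating=2),
        Episode(0.30, 2.5, 1, rating=5),
    ]
    query = Episode(0.70, 0.6, 4)
    print(build_few_shot_prompt(shots, query))
    # The resulting prompt would then be sent to an LLM, and the returned
    # integer taken as the predicted human rating for the query episode.
```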