The Wilhelm Tell Dataset of Affordance Demonstrations
By: Rachel Ringe, Mihai Pomarlan, Nikolaos Tsiogkas, and more
Potential Business Impact:
Robots learn to do chores by watching videos.
Affordances, i.e., possibilities for action that an environment or the objects in it provide, are important for robots operating in human environments to perceive. Existing approaches train such capabilities on annotated static images or shapes. This work presents a novel dataset for affordance learning of common household tasks. Unlike previous approaches, our dataset consists of video sequences demonstrating the tasks from first- and third-person perspectives, along with metadata about the affordances manifested in each task, and is aimed at training perception systems to recognize affordance manifestations. The demonstrations were collected from several participants and in total record about seven hours of human activity. The variety of task performances also allows studying preparatory maneuvers that people may perform for a task, such as how they arrange their task space, which is also relevant for collaborative service robots.
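As a rough illustration of how such paired video demonstrations and affordance metadata might be consumed downstream, the sketch below assumes a hypothetical directory layout (one folder per demonstration containing an egocentric clip, a third-person clip, and a JSON metadata file); the file names and metadata fields are placeholders, not the dataset's actual release format.

```python
# Minimal sketch, assuming a hypothetical layout: one folder per demonstration
# with two video files and a metadata.json. None of these names come from the
# published dataset; they are illustrative only.
import json
from dataclasses import dataclass
from pathlib import Path


@dataclass
class Demonstration:
    task: str                 # e.g. "set the table"
    first_person_video: Path  # egocentric recording
    third_person_video: Path  # external camera recording
    affordances: list[str]    # affordance labels manifested in the task


def load_demonstrations(root: Path) -> list[Demonstration]:
    """Collect demonstrations from a (hypothetical) dataset root directory."""
    demos = []
    for meta_file in sorted(root.glob("*/metadata.json")):
        folder = meta_file.parent
        meta = json.loads(meta_file.read_text())
        demos.append(
            Demonstration(
                task=meta["task"],
                first_person_video=folder / "ego.mp4",
                third_person_video=folder / "third_person.mp4",
                affordances=meta["affordances"],
            )
        )
    return demos


if __name__ == "__main__":
    for demo in load_demonstrations(Path("wilhelm_tell_dataset")):
        print(demo.task, demo.affordances)
```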
Similar Papers
Visual Affordances: Enabling Robots to Understand Object Functionality
CV and Pattern Recognition
Helps robots understand how to use objects.
RoboAfford++: A Generative AI-Enhanced Dataset for Multimodal Affordance Learning in Robotic Manipulation and Navigation
Robotics
Helps robots understand how to grab and move things.
Egocentric Instruction-oriented Affordance Prediction via Large Multimodal Model
Robotics
Lets robots handle objects based on instructions.