CRAFT-E: A Neuro-Symbolic Framework for Embodied Affordance Grounding

Published: December 3, 2025 | arXiv ID: 2512.04231v1

By: Zhou Chen, Joe Lin, Carson Bulgin and more

Potential Business Impact:

Robot learns to pick up objects for tasks.

Business Areas:

Robotics Hardware, Science and Engineering, Software

Assistive robots operating in unstructured environments must understand not only what objects are, but what they can be used for. This requires grounding language-based action queries to objects that both afford the requested function and can be physically retrieved. Existing approaches often rely on black-box models or fixed affordance labels, limiting transparency, controllability, and reliability for human-facing applications. We introduce CRAFT-E, a modular neuro-symbolic framework that composes a structured verb-property-object knowledge graph with visual-language alignment and energy-based grasp reasoning. The system generates interpretable grounding paths that expose the factors influencing object selection and incorporates grasp feasibility as an integral part of affordance inference. We further construct a benchmark dataset with unified annotations for verb-object compatibility, segmentation, and grasp candidates, and deploy the full pipeline on a physical robot. CRAFT-E achieves competitive performance in static scenes, ImageNet-based functional retrieval, and real-world trials involving 20 verbs and 39 objects. The framework remains robust under perceptual noise and provides transparent, component-level diagnostics. By coupling symbolic reasoning with embodied perception, CRAFT-E offers an interpretable and customizable alternative to end-to-end models for affordance-grounded object selection, supporting trustworthy decision-making in assistive robotic systems.
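As a rough illustration of how a verb-property-object graph might be combined with a visual alignment score and a grasp energy to produce an interpretable grounding path, here is a minimal Python sketch. It is not the authors' implementation: the graph contents, the `Detection` fields, the `ground` function, the `energy_weight` parameter, and the linear scoring rule are all hypothetical stand-ins chosen for illustration.

```python
from dataclasses import dataclass

# Hypothetical verb -> property -> object edges; not taken from the paper.
VERB_TO_PROPERTIES = {"cut": ["sharp"], "pour": ["hollow"]}
PROPERTY_TO_OBJECTS = {"sharp": ["knife", "scissors"], "hollow": ["mug"]}


@dataclass
class Detection:
    label: str
    alignment: float     # assumed vision-language match score in [0, 1]
    grasp_energy: float  # assumed grasp energy; lower means easier to grasp


def ground(verb, detections, energy_weight=0.5):
    """Return (score, path) for the best-scoring detection, or None.

    The path (verb, property, object) records why the object was chosen.
    The scoring rule is an assumption: alignment minus weighted grasp energy.
    """
    best = None
    for prop in VERB_TO_PROPERTIES.get(verb, []):
        for det in detections:
            # Skip detections the graph does not link to this property.
            if det.label not in PROPERTY_TO_OBJECTS.get(prop, []):
                continue
            score = det.alignment - energy_weight * det.grasp_energy
            path = (verb, prop, det.label)
            if best is None or score > best[0]:
                best = (score, path)
    return best


scene = [
    Detection("knife", alignment=0.9, grasp_energy=1.0),
    Detection("scissors", alignment=0.8, grasp_energy=0.2),
    Detection("mug", alignment=0.95, grasp_energy=0.1),
]
result = ground("cut", scene)
```

In this toy scene the knife matches the query better visually, but the scissors win because their assumed grasp energy is much lower. This mirrors the abstract's point that grasp feasibility is part of the affordance decision rather than a later filter.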

AffordBot: 3D Fine-grained Embodied Reasoning via Multimodal Large Language Models

CV and Pattern Recognition

Helps robots understand how to use objects.

13 Nov 2025 1

88%

Affordance-R1: Reinforcement Learning for Generalizable Affordance Reasoning in Multimodal Large Language Model

Robotics

Teaches robots how to use different objects.

8 Aug 2025 1

87%

Affordance-R1: Reinforcement Learning for Generalizable Affordance Reasoning in Multimodal Large Language Model

Robotics

Robots learn to use many different objects better.

8 Aug 2025 1

Country of Origin

🇺🇸 United States

Page Count

20 pages
