Affordance-Based Disambiguation of Surgical Instructions for Collaborative Robot-Assisted Surgery
By: Ana Davila, Jacinto Colan, Yasuhisa Hasegawa
Potential Business Impact:
A surgical robot assists surgeons by understanding their spoken instructions and flagging ambiguous commands.
Effective human-robot collaboration in surgery is hindered by the inherent ambiguity of verbal communication. This paper presents a framework for a robotic surgical assistant that interprets and disambiguates a surgeon's verbal instructions by grounding them in the visual context of the operating field. The system employs a two-level affordance-based reasoning process: it first analyzes the surgical scene using a multimodal vision-language model, then reasons about the instruction using a knowledge base of tool capabilities. To ensure patient safety, a dual-set conformal prediction method provides a statistically rigorous confidence measure for robot decisions, allowing the system to identify and flag ambiguous commands. We evaluated the framework on a curated dataset of ambiguous surgical requests from cholecystectomy videos, demonstrating an overall disambiguation rate of 60% and presenting a method for safer human-robot interaction in the operating room.
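The abstract does not spell out the dual-set conformal method, so the following is only a minimal sketch of standard split conformal classification, illustrating the general mechanism of flagging commands whose prediction set contains more (or fewer) than one candidate action. All names here (`actions`, `cal_probs`, the Dirichlet toy data) are hypothetical and not taken from the paper; the authors' dual-set variant would differ in its details.

```python
import numpy as np

def calibrate_threshold(cal_probs, cal_labels, alpha=0.1):
    """Split conformal calibration.

    Nonconformity score = 1 - p(true label). Returns the finite-sample
    corrected (1 - alpha)-quantile of the calibration scores, used as
    the inclusion threshold at test time.
    """
    n = len(cal_labels)
    scores = 1.0 - cal_probs[np.arange(n), cal_labels]
    q_level = min(np.ceil((n + 1) * (1 - alpha)) / n, 1.0)
    return np.quantile(scores, q_level, method="higher")

def prediction_set(probs, threshold):
    """Indices of all candidate actions whose score is within the threshold."""
    return np.flatnonzero(1.0 - probs <= threshold)

def interpret_command(probs, threshold, actions):
    """Execute only when the conformal set is a single action;
    otherwise flag the instruction as ambiguous and defer to the surgeon."""
    pset = prediction_set(probs, threshold)
    if len(pset) == 1:
        return ("execute", actions[pset[0]])
    return ("ask_surgeon", [actions[i] for i in pset])

if __name__ == "__main__":
    # Hypothetical toy data: 200 calibration commands, 3 candidate actions.
    rng = np.random.default_rng(0)
    actions = ["grasp_gallbladder", "retract_liver", "apply_clip"]
    cal_probs = rng.dirichlet(np.ones(3) * 2.0, size=200)
    cal_labels = cal_probs.argmax(axis=1)  # stand-in ground-truth labels
    threshold = calibrate_threshold(cal_probs, cal_labels, alpha=0.05)
    # A near-tie between two actions typically yields a multi-element
    # set, so the command is flagged rather than executed.
    print(interpret_command(np.array([0.48, 0.45, 0.07]), threshold, actions))
```

The appeal of this construction for safety is that, under exchangeability of calibration and test data, the prediction set covers the correct action with probability at least 1 - alpha, so "execute" is only issued when the model's uncertainty is low enough to single out one action.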
Similar Papers
Enhancing Speech Instruction Understanding and Disambiguation in Robotics via Speech Prosody
Robotics
Robots understand your voice commands better.
Egocentric Instruction-oriented Affordance Prediction via Large Multimodal Model
Robotics
Lets robots handle objects based on instructions.