Decoding the Surgical Scene: A Scoping Review of Scene Graphs in Surgery
By: Angelo Henriques, Korab Hoxha, Daniel Zapp and more
Potential Business Impact:
Helps robots understand surgery for safer operations.
Scene graphs (SGs) provide structured relational representations crucial for decoding complex, dynamic surgical environments. This PRISMA-ScR-guided scoping review systematically maps the evolving landscape of SG research in surgery, charting its applications, methodological advancements, and future directions. Our analysis reveals rapid growth, yet uncovers a critical 'data divide': internal-view research (e.g., triplet recognition) almost exclusively uses real-world 2D video, while external-view 4D modeling relies heavily on simulated data, exposing a key translational research gap. Methodologically, the field has advanced from foundational graph neural networks to specialized foundation models that now significantly outperform generalist large vision-language models in surgical contexts. This progress has established SGs as a cornerstone technology for both analytical tasks, such as workflow recognition and automated safety monitoring, and generative tasks, such as controllable surgical simulation. Although challenges in data annotation and real-time implementation persist, they are actively being addressed through emerging techniques. Surgical SGs are maturing into an essential semantic bridge, enabling a new generation of intelligent systems to improve surgical safety, efficiency, and training.
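To make the "structured relational representation" concrete, here is a minimal Python sketch of a scene graph stored as (subject, predicate, object) triplets. It is an illustration only, not the data model of any paper covered by the review: the `SceneGraph` class, its methods, and the entity and relation names (`grasper`, `retract`, `gallbladder`, and so on) are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class SceneGraph:
    """Toy scene graph for one video frame (hypothetical structure)."""
    nodes: set = field(default_factory=set)    # entities in the scene
    edges: list = field(default_factory=list)  # (subject, predicate, object)

    def add_relation(self, subject, predicate, obj):
        # Register both endpoints as nodes and record the directed relation.
        self.nodes.update((subject, obj))
        self.edges.append((subject, predicate, obj))

    def relations_of(self, subject):
        # All (predicate, object) pairs whose subject matches, in insertion order.
        return [(p, o) for s, p, o in self.edges if s == subject]


# Illustrative labels only; real datasets define their own vocabularies.
frame = SceneGraph()
frame.add_relation("grasper", "retract", "gallbladder")
frame.add_relation("hook", "dissect", "cystic_duct")
frame.add_relation("grasper", "held_by", "surgeon")

print(frame.relations_of("grasper"))
```

In this sketch, the first two relations resemble the instrument-verb-target triplets of internal-view research, while a relation such as `held_by` hints at the staff-and-equipment relations an external-view graph might hold. A 4D model would additionally need spatial and temporal attributes, which are omitted here.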
Similar Papers
SG2VID: Scene Graphs Enable Fine-Grained Control for Video Synthesis
CV and Pattern Recognition
Makes surgery training videos more realistic and controllable.
KeySG: Hierarchical Keyframe-Based 3D Scene Graphs
CV and Pattern Recognition
Helps robots understand and navigate complex places.
Universal Scene Graph Generation
CV and Pattern Recognition
Lets computers understand pictures using different kinds of clues.