Mapping User Trust in Vision Language Models: Research Landscape, Challenges, and Prospects
By: Agnese Chiatti, Sara Bernardini, Lara Shibelski Godoy Piccolo, and more
Potential Business Impact:
Helps people know when to trust AI that sees and talks.
The rapid adoption of Vision Language Models (VLMs), pre-trained on large image-text and video-text datasets, calls for protecting users and informing them about when to trust these systems. This survey reviews studies on trust dynamics in user-VLM interactions through a multi-disciplinary taxonomy encompassing different cognitive science capabilities, collaboration modes, and agent behaviours. Literature insights and findings from a workshop with prospective VLM users inform preliminary requirements for future VLM trust studies.
Similar Papers
Trust in Vision-Language Models: Insights from a Participatory User Workshop
Human-Computer Interaction
Helps people know when to trust AI image and video descriptions.
Through Their Eyes: User Perceptions on Sensitive Attribute Inference of Social Media Videos by Visual Language Models
Human-Computer Interaction
AI can guess private things about you from your videos.
Zero-shot image privacy classification with Vision-Language Models
Computer Vision and Pattern Recognition
Makes computers better at telling which pictures are private.