Interaction-Augmented Instruction: Modeling the Synergy of Prompts and Interactions in Human-GenAI Collaboration
By: Leixian Shen, Yifang Wang, Huamin Qu and more
Potential Business Impact:
Helps AI understand instructions better with clicks.
Text prompts are the most common way for humans to communicate with generative AI (GenAI). Though convenient, text prompts make it hard to convey fine-grained and referential intent. One promising solution is to combine text prompts with precise GUI interactions, such as brushing and clicking. However, there is no formal model of synergistic designs between prompts and interactions, which hinders their comparison and innovation. To fill this gap, via an iterative and deductive process, we develop the Interaction-Augmented Instruction (IAI) model, a compact entity-relation graph that formalizes how combining interactions with text prompts enhances human-GenAI communication. With the model, we distill twelve recurring and composable atomic interaction paradigms from prior tools, verifying the model's capability to facilitate systematic design characterization and comparison. Case studies further demonstrate the model's utility in applying, refining, and extending these paradigms. These results illustrate the IAI model's descriptive, discriminative, and generative power for shaping future GenAI systems.
Similar Papers
Prompting Generative AI with Interaction-Augmented Instructions
Human-Computer Interaction
Makes AI understand your instructions better.
Expanding the Generative AI Design Space through Structured Prompting and Multimodal Interfaces
Human-Computer Interaction
Helps small businesses make ads easily.
I Prompt, it Generates, we Negotiate. Exploring Text-Image Intertextuality in Human-AI Co-Creation of Visual Narratives with VLMs
Human-Computer Interaction
Helps people tell stories with AI pictures.