ImageTalk: Designing a Multimodal AAC Text Generation System Driven by Image Recognition and Natural Language Generation

Published: December 10, 2025 | arXiv ID: 2512.09610v1

By: Boyin Yang, Puming Jiang, Per Ola Kristensson

Potential Business Impact:

Helps people with speech problems talk faster.

Business Areas:
Natural Language Processing, Artificial Intelligence, Data and Analytics, Software

People living with Motor Neuron Disease (plwMND) frequently encounter speech and motor impairments that make them reliant on augmentative and alternative communication (AAC) systems. This paper addresses a central challenge: traditional symbol-based AAC systems offer a limited vocabulary, while text entry solutions tend to have low communication rates. To help plwMND articulate their needs about the system efficiently and effectively, we iteratively design and develop a novel multimodal text generation system called ImageTalk through two tailored design phases, one proxy-user-based and one end-user-based. The system demonstrates keystroke savings of 95.6%, coupled with consistent performance and high user satisfaction. We distill three guidelines for the design of AI-assisted text generation systems and outline four user requirement levels tailored for AAC purposes, guiding future research in this field.
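
For readers unfamiliar with the keystroke savings metric, the sketch below uses a definition common in the text-entry literature: the percentage of keystrokes avoided relative to typing every character of the final text. The abstract does not state the exact formula the authors used, so the function name, its parameters, and the sample numbers are all assumptions made for illustration.

```python
def keystroke_savings(keystrokes_used: int, chars_in_text: int) -> float:
    """Return the percentage of keystrokes saved versus typing each character.

    Assumed definition (not confirmed by the abstract):
        KS = (1 - keystrokes_used / chars_in_text) * 100
    """
    if chars_in_text <= 0:
        raise ValueError("chars_in_text must be positive")
    return (1 - keystrokes_used / chars_in_text) * 100


# Illustrative numbers only, not data from the paper:
# a 250-character message produced with 11 user keystrokes.
savings = keystroke_savings(keystrokes_used=11, chars_in_text=250)
print(f"{savings:.1f}%")
```

Under this definition, a figure near 95.6% would mean the user presses roughly one key for every 23 characters of generated text.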

Country of Origin
🇬🇧 United Kingdom

Page Count
24 pages

Category
Computer Science:
Human-Computer Interaction