Score: 2

What Do Prosody and Text Convey? Characterizing How Meaningful Information is Distributed Across Multiple Channels

Published: December 18, 2025 | arXiv ID: 2512.16832v1

By: Aditya Yadavalli, Tiago Pimentel, Tamar I Regev and more

BigTech Affiliations: Massachusetts Institute of Technology

Potential Business Impact:

Measures how much of a speaker's emotion, sarcasm, and question intent is carried by speech melody rather than by the words.

Business Areas:
Text Analytics, Data and Analytics, Software

Prosody -- the melody of speech -- conveys critical information often not captured by the words or text of a message. In this paper, we propose an information-theoretic approach to quantify how much information is expressed by prosody alone and not by text, and crucially, what that information is about. Our approach applies large speech and language models to estimate the mutual information between a particular dimension of an utterance's meaning (e.g., its emotion) and any of its communication channels (e.g., audio or text). We then use this approach to quantify how much information is conveyed by audio and text about sarcasm, emotion, and questionhood, using speech from television and podcasts. We find that for sarcasm and emotion the audio channel -- and by implication the prosodic channel -- transmits over an order of magnitude more information about these features than the text channel alone, at least when long-term context beyond the current sentence is unavailable. For questionhood, prosody provides comparatively less additional information. We conclude by outlining a program applying our approach to more dimensions of meaning, communication channels, and languages.
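One common way to turn model predictions into a mutual information estimate, sketched here as an illustration and not necessarily the authors' exact estimator, is to subtract a model's held-out cross-entropy from the entropy of the labels: I(Y; X) ≈ H(Y) - H_model(Y | X). Because cross-entropy upper-bounds the true conditional entropy, a weaker model tends to underestimate the information in its channel. The function names, the labels, and the probabilities below are hypothetical toy values chosen only to show the arithmetic; they are not results from the paper.

```python
import math
from collections import Counter


def entropy_bits(labels):
    """Plug-in entropy H(Y) of the empirical label distribution, in bits."""
    n = len(labels)
    counts = Counter(labels)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def cross_entropy_bits(labels, predicted):
    """Mean negative log2-probability a model assigns to the true label.

    `predicted` holds one dict per example, mapping label -> probability.
    """
    total = sum(math.log2(p[y]) for y, p in zip(labels, predicted))
    return -total / len(labels)


def mi_estimate_bits(labels, predicted):
    """Estimate I(Y; X) as H(Y) minus the model's cross-entropy."""
    return entropy_bits(labels) - cross_entropy_bits(labels, predicted)


# Hypothetical toy data: four utterances with balanced sarcasm labels.
labels = ["sarcastic", "sincere", "sarcastic", "sincere"]

# Assumed text-only model that cannot tell the classes apart.
text_probs = [{"sarcastic": 0.5, "sincere": 0.5} for _ in labels]

# Assumed audio model that puts probability 0.9 on the correct label.
audio_probs = [
    {"sarcastic": 0.9, "sincere": 0.1},
    {"sarcastic": 0.1, "sincere": 0.9},
    {"sarcastic": 0.9, "sincere": 0.1},
    {"sarcastic": 0.1, "sincere": 0.9},
]

mi_text = mi_estimate_bits(labels, text_probs)
mi_audio = mi_estimate_bits(labels, audio_probs)
print(f"text:  {mi_text:.3f} bits")
print(f"audio: {mi_audio:.3f} bits")
```

In this toy setup the text channel yields an estimate of zero bits, while the audio channel yields most of the one bit available in a balanced binary label. Comparing the two estimates is the kind of channel-by-channel contrast the abstract describes.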

Country of Origin
🇺🇸 🇨🇭 Switzerland, United States

Page Count
16 pages

Category
Computer Science:
Computation and Language