Multilingual Sentiment Analysis of Summarized Texts: A Cross-Language Study of Text Shortening Effects

Published: March 31, 2025 | arXiv ID: 2504.00265v1

By: Mikhail Krasitskii, Grigori Sidorov, Olga Kolesnikova and more

Potential Business Impact:

Extractive summaries largely keep opinions intact, while languages with rich morphology lose more sentiment accuracy when texts are shortened.

Business Areas:
Text Analytics, Data and Analytics, Software

Summarization significantly impacts sentiment analysis across languages with diverse morphologies. This study examines how extractive and abstractive summarization affect sentiment classification in English, German, French, Spanish, Italian, Finnish, Hungarian, and Arabic. We assess sentiment shifts after summarization using multilingual transformers (mBERT, XLM-RoBERTa, T5, and BART) and language-specific models (FinBERT, AraBERT). Results show that extractive summarization better preserves sentiment, especially in morphologically complex languages, while abstractive summarization improves readability but distorts sentiment and so lowers classification accuracy. Languages with rich inflectional morphology, such as Finnish, Hungarian, and Arabic, experience greater accuracy drops than English or German. The findings emphasize the need for language-specific adaptations in sentiment analysis, and we propose a hybrid summarization approach that balances readability and sentiment preservation. These insights benefit multilingual sentiment applications, including social media monitoring, market analysis, and cross-lingual opinion mining.
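
To make the idea of a "sentiment shift after summarization" concrete, the following is a minimal sketch, not the authors' pipeline. It assumes a toy word-list classifier (`toy_sentiment`) and a lead-sentence extractive baseline (`lead_summary`); both are hypothetical stand-ins for the transformer models and summarizers named in the abstract. The function `preservation_rate` is likewise an illustrative name: it reports the fraction of texts whose predicted label is unchanged after shortening.

```python
from typing import Callable, List

# Hypothetical toy lexicon, used only for illustration.
# A real study would use a trained classifier instead.
POSITIVE = {"good", "great", "excellent", "love"}
NEGATIVE = {"bad", "poor", "terrible", "hate"}


def toy_sentiment(text: str) -> str:
    """Label a text by counting positive and negative lexicon words."""
    words = [w.strip(".,!?").lower() for w in text.split()]
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


def lead_summary(text: str, n_sentences: int = 1) -> str:
    """Extractive baseline (assumption): keep the first n sentences verbatim."""
    sentences = [s.strip() for s in text.split(".") if s.strip()]
    return ". ".join(sentences[:n_sentences]) + "."


def preservation_rate(
    texts: List[str],
    summarize: Callable[[str], str],
    classify: Callable[[str], str],
) -> float:
    """Fraction of texts whose label is the same before and after summarizing."""
    if not texts:
        raise ValueError("texts must not be empty")
    kept = sum(classify(t) == classify(summarize(t)) for t in texts)
    return kept / len(texts)


if __name__ == "__main__":
    reviews = [
        "The food was great. The service was terrible and the room was bad.",
        "I love this phone. The battery is excellent.",
    ]
    print(preservation_rate(reviews, lead_summary, toy_sentiment))
```

In this toy run the first review flips from negative to positive once only its opening sentence is kept, so half of the labels survive. Swapping in a different `summarize` or `classify` callable is how one would compare summarizers or languages under the same measure.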

Page Count
22 pages

Category
Computer Science:
Computation and Language