Large-Scale Aspect-Based Sentiment Analysis with Reasoning-Infused LLMs
By: Paweł Liskowski, Krzysztof Jankowski
Potential Business Impact:
Helps computers understand how customers feel about specific aspects of products and services in reviews.
We introduce Arctic-ABSA, a collection of powerful models for real-world aspect-based sentiment analysis (ABSA). Our models are tailored to commercial needs and trained on a large corpus of public data alongside carefully generated synthetic data, resulting in a dataset 20 times larger than SemEval14. We extend the typical ABSA formulation from the standard three sentiment classes (positive, negative, neutral) to five by adding mixed and unknown, while also jointly predicting overall text sentiment and supporting multiple languages. We experiment with reasoning injection by fine-tuning on Chain-of-Thought (CoT) examples and introduce a novel reasoning pretraining technique for encoder-only models that significantly improves downstream fine-tuning and generalization. Our 395M-parameter encoder and 8B-parameter decoder achieve up to 10 percentage points higher accuracy than GPT-4o and Claude 3.5 Sonnet, while setting new state-of-the-art results on the SemEval14 benchmark. A single multilingual model maintains 87-91% accuracy across six languages without degrading English performance. We release ABSA-mix, a large-scale benchmark aggregating 17 public ABSA datasets across 92 domains.
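To make the expanded task formulation concrete, the sketch below illustrates the five-class, jointly predicted aspect and overall sentiment output described in the abstract, along with how a Chain-of-Thought fine-tuning example might be formatted. This is only an illustration under assumptions: the class names follow the abstract, but the Python identifiers (AspectSentiment, AbsaPrediction, build_cot_example) are hypothetical and are not part of the Arctic-ABSA release.

```python
# Minimal sketch (not the authors' released code) of the expanded ABSA task:
# five aspect-level sentiment classes plus an overall text sentiment, predicted
# jointly, and a CoT-style training target with a short rationale.
from dataclasses import dataclass, field
from typing import List

# Standard three ABSA classes extended with "mixed" and "unknown".
SENTIMENT_CLASSES = ["positive", "negative", "neutral", "mixed", "unknown"]

@dataclass
class AspectSentiment:
    aspect: str     # e.g. "battery life"
    sentiment: str  # one of SENTIMENT_CLASSES

@dataclass
class AbsaPrediction:
    overall_sentiment: str                     # sentiment of the whole text
    aspects: List[AspectSentiment] = field(default_factory=list)

def build_cot_example(text: str, prediction: AbsaPrediction, rationale: str) -> dict:
    """Format one hypothetical CoT fine-tuning example: the model is trained to
    emit a brief rationale before the structured aspect and overall labels."""
    target = (
        rationale + "\n"
        + "; ".join(f"{a.aspect}: {a.sentiment}" for a in prediction.aspects)
        + f"\noverall: {prediction.overall_sentiment}"
    )
    return {"input": text, "target": target}

if __name__ == "__main__":
    review = "The screen is gorgeous, but the battery barely lasts half a day."
    pred = AbsaPrediction(
        overall_sentiment="mixed",
        aspects=[
            AspectSentiment("screen", "positive"),
            AspectSentiment("battery", "negative"),
        ],
    )
    example = build_cot_example(
        review,
        pred,
        "The reviewer praises the screen but complains about battery life, "
        "so the aspect sentiments diverge and the overall sentiment is mixed.",
    )
    print(example["target"])
```

The "mixed" class captures texts whose aspect-level sentiments disagree, while "unknown" covers aspects mentioned without a recoverable polarity; the joint overall-sentiment field lets a single model serve both document-level and aspect-level use cases.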
Similar Papers
Large Language Models for Czech Aspect-Based Sentiment Analysis
Computation and Language
Helps computers understand feelings about specific things.
Advancing Cross-lingual Aspect-Based Sentiment Analysis with LLMs and Constrained Decoding for Sequence-to-Sequence Models
Computation and Language
Helps computers understand opinions in any language.
Learning to Extract Cross-Domain Aspects and Understanding Sentiments Using Large Language Models
Computation and Language
Finds what people like or dislike about products.