Pooling Attention: Evaluating Pretrained Transformer Embeddings for Deception Classification
By: Sumit Mamtani, Abhijeet Bhure
Potential Business Impact:
Finds fake news using smart computer language models.
This paper investigates fake news detection as a downstream evaluation of Transformer representations, benchmarking encoder-only and decoder-only pre-trained models (BERT, GPT-2, Transformer-XL) as frozen embedders paired with lightweight classifiers. Through controlled experiments comparing pooling versus padding preprocessing and neural versus linear classification heads, the results demonstrate that contextual self-attention encodings transfer effectively across settings. BERT embeddings combined with logistic regression outperform neural baselines on the LIAR dataset splits, while analyses of sequence length and aggregation reveal robustness to truncation and advantages from simple max or average pooling. This work positions attention-based token encoders as robust, architecture-centric foundations for veracity tasks, isolating the Transformer's contribution from classifier complexity.
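A minimal sketch of the pipeline the abstract describes: frozen BERT token encodings, aggregated by mean or max pooling into fixed-size vectors, then classified with logistic regression. The model name, pooling details, and toy LIAR-style statements below are illustrative assumptions, not the authors' exact configuration.

```python
# Sketch: frozen Transformer embedder + lightweight linear classifier.
import torch
from transformers import AutoModel, AutoTokenizer
from sklearn.linear_model import LogisticRegression

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
encoder = AutoModel.from_pretrained("bert-base-uncased")
encoder.eval()  # frozen embedder: the Transformer weights are never fine-tuned


def embed(texts, pooling="mean", max_length=128):
    """Encode statements into fixed-size pooled BERT vectors."""
    batch = tokenizer(texts, padding=True, truncation=True,
                      max_length=max_length, return_tensors="pt")
    with torch.no_grad():
        hidden = encoder(**batch).last_hidden_state   # (batch, seq_len, 768)
    mask = batch["attention_mask"].unsqueeze(-1)       # exclude padding tokens
    if pooling == "mean":
        pooled = (hidden * mask).sum(dim=1) / mask.sum(dim=1)
    else:  # max pooling over non-padded positions
        pooled = hidden.masked_fill(mask == 0, float("-inf")).max(dim=1).values
    return pooled.numpy()


# Hypothetical stand-ins for LIAR statements and binary veracity labels.
train_texts = ["The senator voted against the bill.",
               "Vaccines contain tracking microchips."]
train_labels = [1, 0]

clf = LogisticRegression(max_iter=1000)
clf.fit(embed(train_texts), train_labels)
print(clf.predict(embed(["The governor raised taxes twice."])))
```

Swapping `pooling="mean"` for `"max"`, or shrinking `max_length`, mirrors the paper's aggregation and truncation ablations while keeping the classifier unchanged.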
Similar Papers
Advancing Text Classification with Large Language Models and Neural Attention Mechanisms
Computation and Language
Helps computers understand and sort text better.
Generalization Gaps in Political Fake News Detection: An Empirical Study on the LIAR Dataset
Computation and Language
Helps computers spot fake news better.
Attention Needs to Focus: A Unified Perspective on Attention Allocation
Machine Learning (CS)
Makes AI focus better, avoiding mistakes.