Differential Robustness in Transformer Language Models: Empirical Evaluation Under Adversarial Text Attacks
By: Taniya Gidatkar, Oluwaseun Ajao, Matthew Shardlow
Potential Business Impact:
Makes AI smarter and harder to trick.
This study evaluates the resilience of large language models (LLMs) against adversarial attacks, focusing on Flan-T5, BERT-Base, and RoBERTa-Base. Using systematically designed adversarial tests generated with TextFooler and BERTAttack, we found significant variation in model robustness. RoBERTa-Base and Flan-T5 demonstrated remarkable resilience, maintaining accuracy even under sophisticated attacks, with attack success rates of 0%. In contrast, BERT-Base showed considerable vulnerability, with TextFooler achieving a 93.75% attack success rate and reducing model accuracy from 48% to just 3%. Our research reveals that while certain LLMs have developed effective defensive mechanisms, these safeguards often require substantial computational resources. This study contributes to the understanding of LLM security by identifying strengths and weaknesses in current safeguarding approaches and proposing practical recommendations for developing more efficient and effective defensive strategies.
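For readers who want to run a comparable robustness evaluation, the sketch below shows how a TextFooler-style attack can be launched against a fine-tuned classifier with the open-source TextAttack library, which implements both TextFooler (TextFoolerJin2019) and BERTAttack (BERTAttackLi2020) as attack recipes. The checkpoint and dataset names are illustrative placeholders, not the exact models or data evaluated in this paper.

# Minimal sketch of a TextFooler robustness evaluation using TextAttack.
# The model checkpoint and dataset are placeholders, not this paper's setup.
import transformers
from textattack import Attacker, AttackArgs
from textattack.attack_recipes import TextFoolerJin2019  # BERTAttackLi2020 is used analogously
from textattack.datasets import HuggingFaceDataset
from textattack.models.wrappers import HuggingFaceModelWrapper

# Victim model: any sequence-classification checkpoint (placeholder name).
checkpoint = "textattack/bert-base-uncased-imdb"
model = transformers.AutoModelForSequenceClassification.from_pretrained(checkpoint)
tokenizer = transformers.AutoTokenizer.from_pretrained(checkpoint)
victim = HuggingFaceModelWrapper(model, tokenizer)

# Evaluation data (placeholder): test split of a sentiment classification task.
dataset = HuggingFaceDataset("imdb", split="test")

# Build the TextFooler recipe and attack a fixed number of examples; the printed
# summary reports attack success rate and accuracy under attack.
attack = TextFoolerJin2019.build(victim)
args = AttackArgs(num_examples=100, log_to_csv="textfooler_bert.csv", disable_stdout=False)
Attacker(attack, dataset, args).attack_dataset()

Swapping TextFoolerJin2019 for BERTAttackLi2020, or pointing the wrapper at a different checkpoint, is enough to reproduce the kind of cross-model, cross-attack comparison described in the abstract.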
Similar Papers
Adversarial Attack Classification and Robustness Testing for Large Language Models for Code
Software Engineering
Makes computer code safer from tricky words.
An Investigation of Large Language Models and Their Vulnerabilities in Spam Detection
Cryptography and Security
New AI can't always spot tricky spam emails.
Adversarial Question Answering Robustness: A Multi-Level Error Analysis and Mitigation Study
Computation and Language
Makes AI better at answering questions, even tricky ones.