Adversarial Cooperative Rationalization: The Risk of Spurious Correlations in Even Clean Datasets

Published: May 4, 2025 | arXiv ID: 2505.02118v5

By: Wei Liu, Zhongyu Niu, Lang Gao and more

Potential Business Impact:

Teaches computers to pick important facts, not fake ones.

Business Areas:

A/B Testing, Data and Analytics

This study investigates the self-rationalization framework constructed with a cooperative game, where a generator initially extracts the most informative segment from raw input, and a subsequent predictor utilizes the selected subset for its input. The generator and predictor are trained collaboratively to maximize prediction accuracy. In this paper, we first uncover a potential caveat: such a cooperative game could unintentionally introduce a sampling bias during rationale extraction. Specifically, the generator might inadvertently create an incorrect correlation between the selected rationale candidate and the label, even when they are semantically unrelated in the original dataset. Subsequently, we elucidate the origins of this bias using both detailed theoretical analysis and empirical evidence. Our findings suggest a direction for inspecting these correlations through attacks, based on which we further introduce an instruction to prevent the predictor from learning the correlations. Through experiments on six text classification datasets and two graph classification datasets using three network architectures (GRUs, BERT, and GCN), we show that our method not only significantly outperforms recent rationalization methods, but also achieves comparable or even better results than a representative LLM (llama3.1-8b-instruct).
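The sampling bias described in the abstract can be illustrated with a toy example. The sketch below is not the authors' method or code: the dataset, the `degenerate_generator` function, and the token choices are all hypothetical, chosen only to show how a generator that reads the full input can make a label-independent token look predictive once rationales are extracted.

```python
import random

def make_dataset(n, seed=0):
    """Toy corpus (an assumption for illustration): every review contains
    both 'the' and '.', so neither token says anything about the label."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        label = rng.randint(0, 1)
        sentiment = "good" if label == 1 else "bad"
        data.append((["the", "beer", "is", sentiment, "."], label))
    return data

def degenerate_generator(tokens):
    """Hypothetical generator that sees the whole input and encodes the
    sentiment in WHICH uninformative token it selects as the rationale."""
    return ["."] if "good" in tokens else ["the"]

def positive_rate_given(pairs, token):
    """Fraction of positive labels among examples whose token list contains `token`."""
    labels = [y for tokens, y in pairs if token in tokens]
    return sum(labels) / len(labels) if labels else None

raw = make_dataset(1000)
rationales = [(degenerate_generator(tokens), y) for tokens, y in raw]

# In the raw data '.' appears in every example, so this is just the base rate.
raw_rate = positive_rate_given(raw, ".")
# After extraction, '.' appears only in rationales of positive examples.
rationale_rate = positive_rate_given(rationales, ".")

print("P(y=1 | '.' in input)     =", raw_rate)
print("P(y=1 | '.' in rationale) =", rationale_rate)
```

In this toy setting a predictor trained only on the extracted rationales could reach perfect accuracy by mapping "." to the positive class, even though "." is semantically unrelated to sentiment in the original data.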

Learnable Game-theoretic Policy Optimization for Data-centric Self-explanation Rationalization

Artificial Intelligence

Helps AI explain its answers better.

15 Oct 2025

86%

CORA: Coalitional Rational Advantage Decomposition for Multi-Agent Policy Gradients

Multiagent Systems

Helps robot teams work together better.

3 Jun 2025

85%

Mitigating Spurious Correlations Between Question and Answer via Chain-of-Thought Correctness Perception Distillation

Computation and Language

Teaches small AI to think better by fixing its mistakes.

6 Sep 2025

Country of Origin

🇨🇳 China

Page Count

20 pages
