Causal Negative Sampling via Diffusion Model for Out-of-Distribution Recommendation
By: Chu Zhao, Eneng Yang, Yizhou Dang and more
Potential Business Impact:
Makes online suggestions more accurate and fair.
Heuristic negative sampling enhances recommendation performance by selecting negative samples of varying hardness levels from predefined candidate pools to guide the model toward learning more accurate decision boundaries. However, our empirical and theoretical analyses reveal that unobserved environmental confounders (e.g., exposure or popularity biases) in candidate pools may cause heuristic sampling methods to introduce false hard negatives (FHNS). These misleading samples can encourage the model to learn spurious correlations induced by such confounders, ultimately compromising its generalization ability under distribution shifts. To address this issue, we propose a novel method named Causal Negative Sampling via Diffusion (CNSDiff). By synthesizing negative samples in the latent space via a conditional diffusion process, CNSDiff avoids the bias introduced by predefined candidate pools and thus reduces the likelihood of generating FHNS. Moreover, it incorporates a causal regularization term to explicitly mitigate the influence of environmental confounders during the negative sampling process, leading to robust negatives that promote out-of-distribution (OOD) generalization. Comprehensive experiments under four representative distribution shift scenarios demonstrate that CNSDiff achieves an average improvement of 13.96% across all evaluation metrics compared to state-of-the-art baselines, verifying its effectiveness and robustness in OOD recommendation tasks.
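The contrast the abstract draws, between picking negatives from a predefined pool and synthesizing them in the latent space, can be illustrated with a minimal sketch. This is not the CNSDiff implementation: it shows only a generic pool-based hard-negative picker and the standard DDPM forward noising step applied to a positive item embedding, and it omits the conditional reverse (denoising) process and the causal regularization term. All function names, the linear beta schedule, and its default values are assumptions made for illustration.

```python
import math
import random


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def heuristic_hard_negative(user_vec, item_vecs, observed, pool):
    """Pool-based heuristic (hypothetical helper): return the unobserved
    item in `pool` that the current embeddings score highest for this user.
    If the item was simply never exposed, it may be a false hard negative."""
    candidates = [i for i in pool if i not in observed]
    return max(candidates, key=lambda i: dot(user_vec, item_vecs[i]))


def alpha_bar(t, num_steps, beta_start=1e-4, beta_end=0.02):
    """Cumulative product of (1 - beta_s) for s = 1..t under a linear
    beta schedule (assumed values, not taken from the paper)."""
    prod = 1.0
    for s in range(1, t + 1):
        frac = (s - 1) / max(num_steps - 1, 1)
        beta = beta_start + (beta_end - beta_start) * frac
        prod *= 1.0 - beta
    return prod


def diffuse_negative(pos_vec, t, num_steps, rng):
    """Standard DDPM forward step on a positive item embedding:
    x_t = sqrt(alpha_bar_t) * x_0 + sqrt(1 - alpha_bar_t) * eps.
    The result is a latent vector, not an item drawn from any pool."""
    a = alpha_bar(t, num_steps)
    return [
        math.sqrt(a) * x + math.sqrt(1.0 - a) * rng.gauss(0.0, 1.0)
        for x in pos_vec
    ]
```

In this sketch a small step index t keeps the synthesized vector close to the positive embedding (a harder negative), while a large t pushes it toward pure noise (an easier one). For example, diffuse_negative(pos_vec, 50, 1000, random.Random(0)) would yield a harder negative than the same call with t = 900.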
Similar Papers
Diverse Negative Sampling for Implicit Collaborative Filtering
Information Retrieval
Recommends better by learning from more different "no" answers.
Diffusion-NPO: Negative Preference Optimization for Better Preference Aligned Generation of Diffusion Models
CV and Pattern Recognition
Makes AI art avoid bad ideas better.
MDNS: Masked Diffusion Neural Sampler via Stochastic Optimal Control
Machine Learning (CS)
Creates computer programs that guess better from many choices.