From Newborn to Impact: Bias-Aware Citation Prediction
By: Mingfei Lu, Mengjia Wu, Jiawei Xu and more
Potential Business Impact:
Helps predict which new science papers will be important.
As a key to assessing research impact, citation dynamics underpin research evaluation, scholarly recommendation, and the study of knowledge diffusion. Citation prediction is particularly critical for newborn papers, where early assessment must be performed without citation signals and under highly long-tailed distributions. We identify two key research gaps: (i) insufficient modeling of implicit factors of scientific impact, leading to reliance on coarse proxies; and (ii) a lack of bias-aware learning that can deliver stable predictions on low-cited papers. We address these gaps by proposing a Bias-Aware Citation Prediction Framework, which combines multi-agent feature extraction with robust graph representation learning. First, a multi-agent × graph co-learning module derives fine-grained, interpretable signals, such as reproducibility, collaboration network, and text quality, from metadata and external resources, and fuses them with heterogeneous-network embeddings to provide rich supervision even in the absence of early citation signals. Second, we incorporate a set of robustness mechanisms: a two-stage forward process that routes explicit factors through an intermediate exposure estimate, GroupDRO to optimize worst-case group risk across environments, and a regularization head that performs what-if analyses on controllable factors under monotonicity and smoothness constraints. Comprehensive experiments on two real-world datasets demonstrate the effectiveness of the proposed model. Specifically, it achieves around a 13% reduction in the error metrics (MALE and RMSLE) and a notable 5.5% improvement in the ranking metric (NDCG) over the baseline methods.
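To make the GroupDRO idea in the abstract more concrete, here is a minimal sketch of one reweighting step in Python. It is not the authors' implementation. It assumes the standard exponentiated-gradient form of GroupDRO, takes MALE to mean the mean absolute error between log(1 + citations) values (the abstract does not define it), and uses two made-up groups, `low_cited` and `high_cited`, with invented citation counts. The function names and the step size `eta` are likewise hypothetical.

```python
import math


def abs_log_error(y_true, y_pred):
    """Mean absolute error on log(1 + citations); an assumed reading of MALE."""
    diffs = [abs(math.log1p(t) - math.log1p(p)) for t, p in zip(y_true, y_pred)]
    return sum(diffs) / len(diffs)


def group_dro_step(group_losses, weights, eta=1.0):
    """One exponentiated-gradient update of the group weights.

    Groups with a higher loss receive a larger weight, so the weighted
    loss leans toward the worst-performing group.
    """
    scaled = [w * math.exp(eta * loss) for w, loss in zip(weights, group_losses)]
    total = sum(scaled)
    new_weights = [s / total for s in scaled]
    robust_loss = sum(w * loss for w, loss in zip(new_weights, group_losses))
    return new_weights, robust_loss


# Hypothetical groups: (true citation counts, predicted citation counts).
groups = {
    "low_cited": ([0, 1, 2], [3, 4, 6]),
    "high_cited": ([50, 120], [55, 110]),
}

losses = [abs_log_error(t, p) for t, p in groups.values()]
weights = [1.0 / len(losses)] * len(losses)  # start from uniform weights
weights, robust_loss = group_dro_step(losses, weights)

for name, loss, weight in zip(groups, losses, weights):
    print(f"{name}: loss={loss:.3f} weight={weight:.3f}")
print(f"robust loss: {robust_loss:.3f}")
```

In this toy setup the low-cited group has the larger log-scale error, so its weight grows after the update and the reweighted loss exceeds the plain average. In a training loop, the reweighted loss would be the quantity minimized, which is how this kind of objective could stabilize predictions on the long tail of rarely cited papers.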
Similar Papers
Academic Literature Recommendation in Large-scale Citation Networks Enhanced by Large Language Models
Applications
Finds the best science papers for researchers.
In-depth Research Impact Summarization through Fine-Grained Temporal Citation Analysis
Digital Libraries
Summarizes how science papers change ideas.
An Agent-based Model of Citation Behavior
Social and Information Networks
Helps scientists get more credit for their work.