Hard negative sampling in hyperedge prediction
By: Zhenyu Deng, Tao Zhou, Yilin Bi
Potential Business Impact:
Finds hidden connections by making smart guesses.
A hypergraph, which allows each hyperedge to encompass an arbitrary number of nodes, is a powerful tool for modeling multi-entity interactions. Hyperedge prediction is a fundamental task that aims to predict future hyperedges or to identify existing but unobserved hyperedges based on the observed ones. In link prediction for simple graphs, most observed links are treated as positive samples, while all unobserved links are considered negative samples. However, this full-sampling strategy is impractical for hyperedge prediction, because the number of unobserved hyperedges in a hypergraph significantly exceeds the number of observed ones. Therefore, one has to use a negative sampling method to generate negative samples, keeping their quantity comparable to that of the positive samples. In current hyperedge prediction, randomly selecting negative samples is routine practice. Through experimental analysis, however, we discover a critical limitation of random selection: the generated negative samples are too easily distinguished from positive samples. This leads to premature convergence of the model and reduces prediction accuracy. To overcome this issue, we propose a novel method for generating negative samples, named hard negative sampling (HNS). Unlike traditional methods that construct negative hyperedges by selecting node sets from the original hypergraph, HNS directly synthesizes negative samples in the hyperedge embedding space, thereby generating more challenging and informative negative samples. Our results demonstrate that HNS significantly enhances both the accuracy and the robustness of prediction. Moreover, as a plug-and-play technique, HNS can be easily applied in the training of various hyperedge prediction models based on representation learning.
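The abstract does not spell out how negatives are synthesized in the embedding space, so the following is only a minimal sketch of one plausible approach, not the authors' HNS procedure. It assumes a mixup-style interpolation: each randomly sampled negative hyperedge embedding is pulled part of the way toward a positive hyperedge embedding, which makes it harder to separate from the positives. The function name `synthesize_hard_negatives`, the parameter `max_alpha`, and the toy Gaussian embeddings are all hypothetical.

```python
import numpy as np

def synthesize_hard_negatives(pos_emb, neg_emb, max_alpha=0.5, rng=None):
    """Interpolate each negative embedding toward a randomly paired positive.

    Assumption (not from the paper): hard = alpha * pos + (1 - alpha) * neg,
    with alpha drawn uniformly from [0, max_alpha).
    """
    rng = np.random.default_rng() if rng is None else rng
    pos_emb = np.asarray(pos_emb, dtype=float)
    neg_emb = np.asarray(neg_emb, dtype=float)

    # Pair every negative with one positive chosen uniformly at random.
    idx = rng.integers(0, len(pos_emb), size=len(neg_emb))

    # One mixing coefficient per negative; keeping max_alpha <= 0.5 leaves
    # the synthesized point at least as close to the negative as to the positive.
    alpha = rng.uniform(0.0, max_alpha, size=(len(neg_emb), 1))

    hard = alpha * pos_emb[idx] + (1.0 - alpha) * neg_emb
    return hard, idx

# Toy data standing in for hyperedge embeddings from some encoder.
rng = np.random.default_rng(0)
pos = rng.normal(loc=1.0, size=(8, 16))   # embeddings of observed hyperedges
neg = rng.normal(loc=-1.0, size=(8, 16))  # embeddings of random negatives
hard, idx = synthesize_hard_negatives(pos, neg, max_alpha=0.5, rng=rng)
```

Because the synthesis operates only on embeddings, a step like this could in principle sit between any hyperedge encoder and its classifier, which is consistent with the plug-and-play claim above. How the mixing strength should be chosen or scheduled is left open here.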
Similar Papers
HyperSearch: Prediction of New Hyperedges through Unconstrained yet Efficient Search
Social and Information Networks
Finds hidden connections in groups of things.
Sampling nodes and hyperedges via random walks on large hypergraphs
Social and Information Networks
Helps study big groups by looking at small parts.
Causal Negative Sampling via Diffusion Model for Out-of-Distribution Recommendation
Machine Learning (CS)
Makes online suggestions more accurate and fair.