Long-Tailed Learning for Generalized Category Discovery

Published: June 8, 2025 | arXiv ID: 2506.06965v1

By: Cuong Manh Hoang

Potential Business Impact:

Finds new categories in data even when only a few examples of them exist.

Business Areas:
Machine Learning, Artificial Intelligence, Data and Analytics, Software

Generalized Category Discovery (GCD) uses labeled samples of known classes to discover novel classes among unlabeled samples. Existing methods perform well on artificial datasets with balanced distributions. However, real-world datasets are typically imbalanced, which significantly reduces the effectiveness of these methods. To address this problem, we propose a novel framework that performs generalized category discovery under long-tailed distributions. We first present a self-guided labeling technique that uses a learnable distribution to generate pseudo-labels, resulting in less biased classifiers. We then introduce a representation balancing process to derive discriminative representations. By mining sample neighborhoods, this process encourages the model to focus more on tail classes. We conduct experiments on public datasets to demonstrate the effectiveness of the proposed framework. The results show that our model outperforms previous state-of-the-art methods.
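
To make the idea of distribution-aware pseudo-labeling concrete, here is a minimal sketch. The abstract does not say how the learnable distribution is parameterized or trained, so this example swaps in a simpler, well-known stand-in: logit adjustment with a class prior that is tracked as an exponential moving average of the model's predictions. The function names (`pseudo_labels`, `update_prior`), the temperature `tau`, and the `momentum` value are all hypothetical and chosen only for illustration; this is not the paper's method.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def pseudo_labels(logits_batch, class_prior, tau=1.0):
    """Assign pseudo-labels after subtracting tau * log(prior) from each logit.

    Assumption: this is plain logit adjustment, used here as a stand-in for
    the learnable distribution described in the abstract.
    """
    labels = []
    for logits in logits_batch:
        adjusted = [z - tau * math.log(p) for z, p in zip(logits, class_prior)]
        labels.append(max(range(len(adjusted)), key=adjusted.__getitem__))
    return labels


def update_prior(prior, probs_batch, momentum=0.9):
    """Move the class-prior estimate toward the batch-mean prediction (EMA)."""
    num_classes = len(prior)
    batch_mean = [
        sum(p[c] for p in probs_batch) / len(probs_batch)
        for c in range(num_classes)
    ]
    mixed = [momentum * q + (1 - momentum) * b for q, b in zip(prior, batch_mean)]
    total = sum(mixed)
    return [x / total for x in mixed]


# Toy batch: three samples, three classes; class 0 is the head class.
logits_batch = [[2.0, 1.5, 0.2], [1.0, 0.9, 0.8], [0.1, 0.2, 3.0]]
uniform_prior = [1 / 3, 1 / 3, 1 / 3]
skewed_prior = [0.7, 0.2, 0.1]

print(pseudo_labels(logits_batch, uniform_prior))  # [0, 0, 2]
print(pseudo_labels(logits_batch, skewed_prior))   # [1, 2, 2]

# One EMA step of the prior from the current softmax predictions.
probs_batch = [softmax(row) for row in logits_batch]
print(update_prior(skewed_prior, probs_batch))
```

With a uniform prior the adjustment is a constant shift and the labels match a plain argmax. With a prior skewed toward the head class, borderline samples are reassigned to rarer classes, which is the general mechanism by which a class-distribution estimate can reduce head-class bias in pseudo-labels.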

Country of Origin
🇰🇷 Korea, Republic of

Page Count
10 pages

Category
Computer Science:
Artificial Intelligence