Score: 2

TaxoAdapt: Aligning LLM-Based Multidimensional Taxonomy Construction to Evolving Research Corpora

Published: June 12, 2025 | arXiv ID: 2506.10737v1

By: Priyanka Kargupta, Nan Zhang, Yunyi Zhang and more

Potential Business Impact:

Organizes science papers by how they change.

Business Areas:

Semantic Search, Internet Services

The rapid evolution of scientific fields introduces challenges in organizing and retrieving scientific literature. While expert-curated taxonomies have traditionally addressed this need, the process is time-consuming and expensive. Furthermore, recent automatic taxonomy construction methods either (1) over-rely on a specific corpus, sacrificing generalizability, or (2) depend heavily on the general knowledge of large language models (LLMs) contained within their pre-training datasets, often overlooking the dynamic nature of evolving scientific domains. Additionally, these approaches fail to account for the multi-faceted nature of scientific literature, where a single research paper may contribute to multiple dimensions (e.g., methodology, new tasks, evaluation metrics, benchmarks). To address these gaps, we propose TaxoAdapt, a framework that dynamically adapts an LLM-generated taxonomy to a given corpus across multiple dimensions. TaxoAdapt performs iterative hierarchical classification, expanding both the width and the depth of the taxonomy based on the corpus's topical distribution. We demonstrate its state-of-the-art performance across a diverse set of computer science conferences over the years to showcase its ability to structure and capture the evolution of scientific fields. As a multidimensional method, TaxoAdapt generates taxonomies that are 26.51% more granularity-preserving and 50.41% more coherent than the most competitive baselines, as judged by LLMs.
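The iterative classify-then-expand loop described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the `Node` class, the `adapt` function, the `min_papers` and `max_depth` thresholds, and the exact expansion triggers are all assumptions made for illustration, and the keyword-based `classify` and `propose` functions are toy stand-ins for what would be LLM calls in the real framework. Only one dimension is shown; a multidimensional run would presumably repeat this per dimension.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Node:
    """One taxonomy node holding the papers assigned to it (hypothetical structure)."""
    label: str
    children: List["Node"] = field(default_factory=list)
    papers: List[str] = field(default_factory=list)


# Hypothetical stand-ins for LLM calls.
Classifier = Callable[[str, List[str]], List[str]]  # (paper, labels) -> matching labels
Proposer = Callable[[str, List[str]], List[str]]    # (parent label, papers) -> new labels


def adapt(node: Node, classify: Classifier, propose: Proposer,
          max_depth: int = 2, min_papers: int = 2, depth: int = 0) -> None:
    """Assumed expansion rule: grow a node only when enough papers land on it."""
    if depth >= max_depth or len(node.papers) < min_papers:
        return
    # Depth expansion: a well-populated leaf receives a first layer of children.
    if not node.children:
        node.children = [Node(l) for l in propose(node.label, node.papers)]
    if not node.children:
        return
    by_label = {c.label: c for c in node.children}
    for child in node.children:
        child.papers = []
    # Hierarchical classification: a paper may map to several children or to none.
    unmapped = []
    for paper in node.papers:
        hits = classify(paper, list(by_label))
        if not hits:
            unmapped.append(paper)
        for hit in hits:
            by_label[hit].papers.append(paper)
    # Width expansion: many unmapped papers suggest missing sibling topics.
    if len(unmapped) >= min_papers:
        new_labels = [l for l in propose(node.label, unmapped) if l not in by_label]
        for label in new_labels:
            by_label[label] = Node(label)
            node.children.append(by_label[label])
        for paper in unmapped:
            for hit in classify(paper, new_labels):
                by_label[hit].papers.append(paper)
    for child in node.children:
        adapt(child, classify, propose, max_depth, min_papers, depth + 1)


# Toy stand-ins: substring matching over a fixed vocabulary instead of an LLM.
VOCAB = ["dense", "parsing", "retrieval", "sparse"]


def toy_classify(paper: str, labels: List[str]) -> List[str]:
    return [l for l in labels if l in paper]


def toy_propose(parent: str, papers: List[str]) -> List[str]:
    return [t for t in VOCAB if t != parent and any(t in p for p in papers)]


root = Node("nlp", children=[Node("retrieval")], papers=[
    "dense retrieval with transformers",
    "sparse retrieval baselines",
    "neural parsing",
    "parsing with grammars",
])
adapt(root, toy_classify, toy_propose)
```

In this toy run the initial taxonomy knows only "retrieval", so the two parsing papers go unmapped and trigger a new sibling, while the populated "retrieval" node is split further. The thresholds controlling when to expand are the main design choice here, and the values used above are arbitrary.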

Context-Aware Hierarchical Taxonomy Generation for Scientific Papers via LLM-Guided Multi-Aspect Clustering

Computation and Language

Organizes science papers into helpful, detailed lists.

23 Sep 2025 1

89%

Transforming Expert Knowledge into Scalable Ontology via Large Language Models

Artificial Intelligence

Helps computers understand and connect different ideas.

10 Jun 2025 1

88%

LLMTaxo: Leveraging Large Language Models for Constructing Taxonomy of Factual Claims from Social Media

Computation and Language

Organizes social media facts into easy-to-understand topics.

11 Apr 2025 0


Country of Origin

🇺🇸 United States

Repos / Data Links

github.com

Page Count

17 pages
