Context-Aware Hierarchical Taxonomy Generation for Scientific Papers via LLM-Guided Multi-Aspect Clustering

Published: September 23, 2025 | arXiv ID: 2509.19125v1

By: Kun Zhu, Lizi Liao, Yuxuan Gu and more

Potential Business Impact:

Organizes science papers into helpful, detailed lists.

Business Areas:

Semantic Search Internet Services

The rapid growth of scientific literature demands efficient methods to organize and synthesize research findings. Existing taxonomy construction methods, leveraging unsupervised clustering or direct prompting of large language models (LLMs), often lack coherence and granularity. We propose a novel context-aware hierarchical taxonomy generation framework that integrates LLM-guided multi-aspect encoding with dynamic clustering. Our method leverages LLMs to identify key aspects of each paper (e.g., methodology, dataset, evaluation) and generates aspect-specific paper summaries, which are then encoded and clustered along each aspect to form a coherent hierarchy. In addition, we introduce a new evaluation benchmark of 156 expert-crafted taxonomies encompassing 11.6k papers, providing the first naturally annotated dataset for this task. Experimental results demonstrate that our method significantly outperforms prior approaches, achieving state-of-the-art performance in taxonomy coherence, granularity, and interpretability.
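The per-aspect grouping described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the paper IDs and summaries are made-up placeholders for LLM-generated aspect summaries, the bag-of-words `embed` function stands in for a real text encoder, and the greedy threshold rule in `cluster_by_aspect` is a simple substitute for the paper's dynamic clustering. It only shows how the same papers can fall into different groups depending on the aspect used.

```python
import math
from collections import Counter

# Hypothetical aspect-specific summaries; in the paper these are produced by an LLM.
PAPERS = {
    "p1": {"methodology": "graph neural network",
           "dataset": "citation network benchmark"},
    "p2": {"methodology": "graph neural network with attention",
           "dataset": "product review corpus"},
    "p3": {"methodology": "prompting large language model",
           "dataset": "citation network benchmark"},
    "p4": {"methodology": "prompting large language model few-shot",
           "dataset": "product review corpus"},
}


def embed(text):
    """Toy bag-of-words vector (assumption: stands in for a real encoder)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a)  # missing keys count as 0
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def cluster_by_aspect(summaries, aspect, threshold=0.5):
    """Greedy clustering on one aspect.

    Each paper joins the first cluster whose seed summary is similar
    enough; otherwise it starts a new cluster. The threshold is an
    arbitrary illustrative value.
    """
    clusters = []  # list of (seed_vector, [paper_ids])
    for paper_id in sorted(summaries):
        vec = embed(summaries[paper_id][aspect])
        for seed, members in clusters:
            if cosine(vec, seed) >= threshold:
                members.append(paper_id)
                break
        else:
            clusters.append((vec, [paper_id]))
    return [members for _, members in clusters]


by_method = cluster_by_aspect(PAPERS, "methodology")
by_dataset = cluster_by_aspect(PAPERS, "dataset")
print(by_method)   # papers grouped by methodology summary
print(by_dataset)  # the same papers grouped by dataset summary
```

In this toy data, clustering on methodology pairs the two graph-based papers, while clustering on dataset pairs papers that share a corpus. Applying such aspect-wise groupings level by level is one way a hierarchy could be assembled.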

TaxoAdapt: Aligning LLM-Based Multidimensional Taxonomy Construction to Evolving Research Corpora

Computation and Language

Organizes science papers by how they change.

12 Jun 2025 2

89%

Transforming Expert Knowledge into Scalable Ontology via Large Language Models

Artificial Intelligence

Helps computers understand and connect different ideas.

10 Jun 2025 1

88%

A Scalable Unsupervised Framework for multi-aspect labeling of Multilingual and Multi-Domain Review Data

Computation and Language

Helps computers understand reviews in any language.

14 May 2025 1

Country of Origin

🇸🇬 Singapore

Page Count

19 pages
