Score: 2

Leveraging Language Semantics for Collaborative Filtering with TextGCN and TextGCN-MLP: Zero-Shot vs In-Domain Performance

Published: October 14, 2025 | arXiv ID: 2510.12461v1

By: Andrei Chernov, Haroon Wahab, Oleg Novitskij

Potential Business Impact:

Helps computers suggest movies you'll like.

Business Areas:
Text Analytics Data and Analytics, Software

In recent years, various approaches have been proposed to leverage large language models (LLMs) for incorporating textual information about items into recommender systems. Existing methods primarily focus on either fine-tuning LLMs to generate recommendations or integrating LLM-based embeddings into downstream models. In this work, we follow the latter direction and propose \textbf{TextGCN}, which applies parameter-free graph convolution layers directly over LLM-based item-title embeddings, instead of learning ID-based embeddings as in traditional methods. By combining language semantics with graph message passing, this architecture achieves state-of-the-art zero-shot performance, significantly outperforming prior approaches. Furthermore, we introduce \textbf{TextGCN-MLP}, which extends TextGCN with a trainable multilayer perceptron trained using a contrastive loss, achieving state-of-the-art in-domain performance on recommendation benchmarks. However, the zero-shot performance of TextGCN-MLP remains lower than that of TextGCN, highlighting the trade-off between in-domain specialization and zero-shot generalization. We release our code on github at \href{https://github.com/ChernovAndrey/TFCE}{github.com/ChernovAndrey/TFCE}.

Repos / Data Links

Page Count
20 pages

Category
Computer Science:
Information Retrieval