
PCA-RAG: Principal Component Analysis for Efficient Retrieval-Augmented Generation

Published: April 11, 2025 | arXiv ID: 2504.08386v1

By: Arman Khaledian, Amirreza Ghadiridehkordi, Nariman Khaledian

Potential Business Impact:

Makes smart computer answers faster and smaller.

Business Areas:

Natural Language Processing, Artificial Intelligence, Data and Analytics, Software

Retrieval-Augmented Generation (RAG) has emerged as a powerful paradigm for grounding large language models in external knowledge sources, improving the precision of agents' responses. However, high-dimensional language model embeddings, often in the range of hundreds to thousands of dimensions, can present scalability challenges in terms of storage and latency, especially when processing massive financial text corpora. This paper investigates the use of Principal Component Analysis (PCA) to reduce embedding dimensionality, thereby mitigating computational bottlenecks without incurring large accuracy losses. We experiment with a real-world dataset and compare different similarity and distance metrics under both full-dimensional and PCA-compressed embeddings. Our results show that reducing vectors from 3,072 to 110 dimensions provides a sizeable (up to $60\times$) speedup in retrieval operations and a $\sim 28.6\times$ reduction in index size, with only moderate declines in correlation metrics relative to human-annotated similarity scores. These findings demonstrate that PCA-based compression offers a viable balance between retrieval fidelity and resource efficiency, essential for real-time systems such as Zanista AI's Newswitch platform. Ultimately, our study underscores the practicality of leveraging classical dimensionality reduction techniques to scale RAG architectures for knowledge-intensive applications in finance and trading, where speed, memory efficiency, and accuracy must be jointly optimized.
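
The idea in the abstract, fitting PCA on document embeddings and then retrieving in the compressed space, can be illustrated with a minimal sketch. This is not the authors' pipeline: the data is synthetic, the sizes (256 to 32 dimensions) are chosen only so the example runs quickly rather than matching the paper's 3,072 to 110, and the function names `fit_pca`, `project`, and `cosine_top_k` are hypothetical helpers introduced here for illustration.

```python
import numpy as np


def fit_pca(embeddings, n_components):
    """Return the mean vector and the top principal directions (as rows)."""
    mean = embeddings.mean(axis=0)
    centered = embeddings - mean
    # SVD of the centered data: the rows of vt are the principal directions,
    # ordered by decreasing explained variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return mean, vt[:n_components]


def project(vectors, mean, components):
    """Map full-dimensional vectors into the PCA-compressed space."""
    return (vectors - mean) @ components.T


def cosine_top_k(query, index, k):
    """Indices of the k rows of `index` most cosine-similar to `query`."""
    q = query / np.linalg.norm(query)
    normed = index / np.linalg.norm(index, axis=1, keepdims=True)
    scores = normed @ q
    return np.argsort(-scores)[:k]


# Synthetic stand-in for document embeddings (assumption: real embeddings
# would come from a language model, not a random generator).
rng = np.random.default_rng(0)
docs = rng.normal(size=(500, 256))

mean, components = fit_pca(docs, n_components=32)
compressed = project(docs, mean, components)

# A query that is a slightly perturbed copy of document 7.
query = docs[7] + 0.01 * rng.normal(size=256)
hits = cosine_top_k(project(query, mean, components), compressed, k=5)

# Ratio of raw array sizes before and after compression.
size_ratio = docs.nbytes / compressed.nbytes
```

The query must be projected with the same `mean` and `components` that were fitted on the corpus; otherwise the query and the index live in different coordinate systems. In this sketch the size ratio is simply the ratio of dimensions, whereas a real index would also carry metadata and structural overhead.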

Simple Methods Defend RAG Systems Well Against Real-World Attacks

Computation and Language

Keeps chatbots from answering unknown questions.

4 Aug 2025 2

88%

Rethinking Retrieval: From Traditional Retrieval Augmented Generation to Agentic and Non-Vector Reasoning Systems in the Financial Domain for Large Language Models

Computation and Language

Answers money questions using company reports.

22 Nov 2025 0

88%

Optimization of embeddings storage for RAG systems using quantization and dimensionality reduction techniques

Information Retrieval

Shrinks AI's memory needs, keeping it smart.

30 Apr 2025 0


Repos / Data Links

huggingface.co

Page Count

19 pages
