CSR-RAG: An Efficient Retrieval System for Text-to-SQL on the Enterprise Scale
By: Rajpreet Singh, Novak Boškov, Lawrence Drabeck and more
Potential Business Impact:
Finds the right data in huge computer lists.
Natural language to SQL translation (Text-to-SQL) is a long-standing problem that has recently benefited from advances in Large Language Models (LLMs). While most academic Text-to-SQL benchmarks require a schema description as part of the natural language input, enterprise-scale applications often need to retrieve the relevant tables before SQL query generation. To address this need, we propose a novel hybrid Retrieval-Augmented Generation (RAG) system consisting of contextual, structural, and relational retrieval (CSR-RAG) to achieve computationally efficient yet sufficiently accurate retrieval for enterprise-scale databases. Through extensive enterprise benchmarks, we demonstrate that CSR-RAG achieves up to 40% precision and over 80% recall while incurring a negligible average query generation latency of only 30 ms on commodity data center hardware, which makes it appropriate for modern LLM-based enterprise-scale systems.
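To make the reported metrics concrete, the following is a minimal sketch of how precision and recall are commonly computed for a table-retrieval step. It is not the authors' evaluation code: the function name `retrieval_precision_recall` and the table names are hypothetical, and the set-based definition used here is an assumption about how the abstract's figures are measured.

```python
def retrieval_precision_recall(retrieved, relevant):
    """Set-based precision and recall of retrieved tables against the gold tables."""
    retrieved_set, relevant_set = set(retrieved), set(relevant)
    hits = len(retrieved_set & relevant_set)
    # Precision: fraction of retrieved tables that the gold SQL query actually uses.
    precision = hits / len(retrieved_set) if retrieved_set else 0.0
    # Recall: fraction of the gold query's tables that were retrieved.
    recall = hits / len(relevant_set) if relevant_set else 0.0
    return precision, recall


# Hypothetical example: five tables retrieved, two of them needed by the gold query.
retrieved = ["orders", "customers", "invoices", "products", "regions"]
relevant = ["orders", "customers"]
precision, recall = retrieval_precision_recall(retrieved, relevant)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

Under this definition, a retriever can trade precision for recall by returning more candidate tables: recall determines whether the downstream LLM sees every table it needs, while precision determines how much irrelevant schema ends up in the prompt.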
Similar Papers
Structured RAG for Answering Aggregative Questions
Computation and Language
Helps computers answer questions using many documents.
Efficient Knowledge Graph Construction and Retrieval from Unstructured Text for Large-Scale RAG Systems
Artificial Intelligence
Helps computers understand complex company information faster.
CG-RAG: Research Question Answering by Citation Graph Retrieval-Augmented LLMs
Information Retrieval
Helps computers find answers in science papers.