Securing RAG: A Risk Assessment and Mitigation Framework
By: Lukas Ammann , Sara Ott , Christoph R. Landolt and more
Potential Business Impact:
Protects private information when computers answer questions.
Retrieval Augmented Generation (RAG) has emerged as the de facto industry standard for user-facing NLP applications, offering the ability to integrate data without re-training or fine-tuning Large Language Models (LLMs). This capability enhances the quality and accuracy of responses but also introduces novel security and privacy challenges, particularly when sensitive data is integrated. With the rapid adoption of RAG, securing data and services has become a critical priority. This paper first reviews the vulnerabilities of RAG pipelines, and outlines the attack surface from data pre-processing and data storage management to integration with LLMs. The identified risks are then paired with corresponding mitigations in a structured overview. In a second step, the paper develops a framework that combines RAG-specific security considerations, with existing general security guidelines, industry standards, and best practices. The proposed framework aims to guide the implementation of robust, compliant, secure, and trustworthy RAG systems.
Similar Papers
RAG Security and Privacy: Formalizing the Threat Model and Attack Surface
Cryptography and Security
Protects secret information from AI that reads documents.
Privacy-Aware RAG: Secure and Isolated Knowledge Retrieval
Cryptography and Security
Keeps secret AI information safe from hackers.
Privacy Challenges and Solutions in Retrieval-Augmented Generation-Enhanced LLMs for Healthcare Chatbots: A Review of Applications, Risks, and Future Directions
Cryptography and Security
Keeps patient information safe in AI.