Adapting Large Language Models to Emerging Cybersecurity using Retrieval Augmented Generation
By: Arnabh Borah, Md Tanvirul Alam, Nidhi Rastogi
Potential Business Impact:
Helps computers spot new cyber threats faster.
Security applications increasingly rely on large language models (LLMs) for cyber threat detection; however, their opaque reasoning often limits trust, particularly in decisions that require domain-specific cybersecurity knowledge. Because security threats evolve rapidly, LLMs must not only recall historical incidents but also adapt to emerging vulnerabilities and attack patterns. Retrieval-Augmented Generation (RAG) has proven effective in general LLM applications, but its potential for cybersecurity remains underexplored. In this work, we introduce a RAG-based framework designed to contextualize cybersecurity data and improve LLM accuracy in knowledge retention and temporal reasoning. Using external datasets and the Llama-3-8B-Instruct model, we evaluate a baseline RAG pipeline and an optimized hybrid retrieval approach, and compare them across multiple performance metrics. Our findings highlight the promise of hybrid retrieval in strengthening the adaptability and reliability of LLMs for cybersecurity tasks.
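The abstract does not say how the hybrid retrieval approach is built. As a rough illustration only, the sketch below assumes one common reading of "hybrid": rank documents once with a sparse keyword signal and once with a dense similarity signal, then merge the two rankings with reciprocal rank fusion. The character-trigram vectors stand in for a real embedding model, the advisory texts are invented placeholders, and every function name here is hypothetical rather than taken from the paper.

```python
import math
import re
from collections import Counter

# Hypothetical placeholder corpus; not data from the paper.
DOCS = [
    "Advisory A: buffer overflow in the image parser allows remote code execution",
    "Advisory B: phishing campaign uses fake invoice attachments",
    "Advisory C: SQL injection in login form leaks user records",
]

def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

def keyword_score(query, doc):
    """Sparse signal: how often the query's distinct tokens occur in the doc."""
    doc_tokens = Counter(tokenize(doc))
    return sum(doc_tokens[t] for t in set(tokenize(query)))

def trigram_vector(text):
    """Stand-in for an embedding: counts of character trigrams (assumption)."""
    s = text.lower()
    return Counter(s[i:i + 3] for i in range(len(s) - 2))

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * \
           math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def rank(scores):
    """Return document indices ordered from highest to lowest score."""
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)

def hybrid_retrieve(query, docs, top_k=2, k=60):
    """Fuse sparse and dense rankings with reciprocal rank fusion."""
    sparse = rank([keyword_score(query, d) for d in docs])
    query_vec = trigram_vector(query)
    dense = rank([cosine(query_vec, trigram_vector(d)) for d in docs])
    fused = Counter()
    for ranking in (sparse, dense):
        for position, idx in enumerate(ranking):
            fused[idx] += 1.0 / (k + position + 1)
    best = sorted(fused, key=lambda i: fused[i], reverse=True)[:top_k]
    return [docs[i] for i in best]

def build_prompt(query, docs, top_k=2):
    """Prepend the retrieved passages to the question for the LLM."""
    context = "\n".join(hybrid_retrieve(query, docs, top_k=top_k))
    return f"Context:\n{context}\n\nQuestion: {query}"
```

Calling `build_prompt("SQL injection in login form", DOCS)` would produce a prompt whose context begins with the matching advisory; that prompt could then be passed to a model such as Llama-3-8B-Instruct. The fusion constant `k=60` is a conventional default for reciprocal rank fusion, not a value reported by the authors.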