Log Anomaly Detection with Large Language Models via Knowledge-Enriched Fusion
By: Anfeng Peng, Ajesh Koyatan Chathoth, Stephen Lee
System logs are a critical resource for monitoring and managing distributed systems, providing insights into failures and anomalous behavior. Traditional log analysis techniques, including template-based and sequence-driven approaches, often lose important semantic information or struggle with ambiguous log patterns. To address these limitations, we present EnrichLog, a training-free, entry-based anomaly detection framework that enriches raw log entries with both corpus-specific and sample-specific knowledge. EnrichLog incorporates contextual information, including historical examples and reasoning derived from the corpus, to enable more accurate and interpretable anomaly detection. The framework uses retrieval-augmented generation to integrate relevant contextual knowledge without retraining. We evaluate EnrichLog on four large-scale system log benchmark datasets and compare it against five baseline methods. Our results show that EnrichLog consistently improves anomaly detection performance, handles ambiguous log entries effectively, and maintains efficient inference. Furthermore, incorporating both corpus-specific and sample-specific knowledge enhances model confidence and detection accuracy, making EnrichLog well suited to practical deployments.
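The retrieval-augmented enrichment described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the `HistoricalEntry` type, the `jaccard`, `retrieve`, and `build_prompt` helpers, the token-overlap retriever, and the prompt wording are all assumptions made for illustration. A real system would likely use an embedding-based retriever and pass the resulting prompt to an LLM, a step omitted here.

```python
from dataclasses import dataclass


@dataclass
class HistoricalEntry:
    """Hypothetical record of a previously seen log entry."""
    text: str    # raw log line
    label: str   # "normal" or "anomalous"
    reason: str  # short explanation attached to the label


def jaccard(a: str, b: str) -> float:
    """Token-overlap similarity; a stand-in for an embedding-based retriever."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    if not sa or not sb:
        return 0.0
    return len(sa & sb) / len(sa | sb)


def retrieve(query: str, corpus: list[HistoricalEntry], k: int = 2) -> list[HistoricalEntry]:
    """Return the k historical entries most similar to the query (sample-specific knowledge)."""
    return sorted(corpus, key=lambda e: jaccard(query, e.text), reverse=True)[:k]


def build_prompt(query: str, corpus_notes: str, examples: list[HistoricalEntry]) -> str:
    """Combine corpus-level notes and retrieved examples with the raw entry."""
    lines = ["Corpus notes:", corpus_notes, "", "Similar historical entries:"]
    for e in examples:
        lines.append(f"- {e.text} => {e.label} ({e.reason})")
    lines += ["", f"Log entry: {query}", "Is this entry anomalous? Explain briefly."]
    return "\n".join(lines)


# Illustrative data only; these log lines are invented for the example.
corpus = [
    HistoricalEntry("connection refused by node 7", "anomalous", "peer unreachable"),
    HistoricalEntry("heartbeat received from node 7", "normal", "routine liveness check"),
    HistoricalEntry("disk usage at 41 percent", "normal", "within expected range"),
]
query = "Connection refused by node 12"
examples = retrieve(query, corpus, k=2)
prompt = build_prompt(query, "Nodes exchange heartbeats periodically.", examples)
```

Because the knowledge is supplied at inference time through the prompt, no model weights are updated, which is the sense in which such an approach is training-free.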
Similar Papers
FusionLog: Cross-System Log-based Anomaly Detection via Fusion of General and Proprietary Knowledge
Machine Learning (CS)
Finds computer problems without needing examples.
LogPurge: Log Data Purification for Anomaly Detection via Rule-Enhanced Filtering
Software Engineering
Cleans computer logs to find problems faster.
AnomalyGen: An Automated Semantic Log Sequence Generation Framework with LLM for Anomaly Detection
Software Engineering
Creates realistic computer error logs for better detection.