A Word is Worth 4-bit: Efficient Log Parsing with Binary Coded Decimal Recognition
By: Prerak Srivastava, Giulio Corallo, Sergey Rybalko
Potential Business Impact:
Finds hidden computer problems by reading code details.
System-generated logs are typically converted into categorical log templates through parsing. These templates are crucial for generating actionable insights in various downstream tasks. However, existing parsers often fail to capture fine-grained template details, leading to suboptimal accuracy and reduced utility in downstream tasks that require precise pattern identification. We propose a character-level log parser that uses a novel neural architecture to aggregate character embeddings. Our approach estimates a sequence of binary-coded decimals to achieve highly granular log template extraction. Our low-resource character-level parser, tested on a revised Loghub-2k and a manually annotated industrial dataset, matches LLM-based parsers in accuracy while outperforming semantic parsers in efficiency.
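For readers unfamiliar with the term, binary-coded decimal (BCD) stores each decimal digit as its own 4-bit group. The sketch below illustrates only that generic encoding. The abstract does not say how the parser maps its predicted BCD sequence onto template tokens, so the function names `to_bcd` and `from_bcd` are hypothetical and nothing here should be read as the authors' implementation.

```python
def to_bcd(n: int) -> list[str]:
    """Encode a non-negative integer as BCD: one 4-bit string per decimal digit."""
    if n < 0:
        raise ValueError("BCD sketch handles non-negative integers only")
    return [format(int(digit), "04b") for digit in str(n)]


def from_bcd(nibbles: list[str]) -> int:
    """Decode a list of 4-bit strings back into the integer they represent."""
    digits = []
    for nibble in nibbles:
        value = int(nibble, 2)
        if value > 9:
            # 1010 through 1111 are not valid BCD digits.
            raise ValueError(f"invalid BCD nibble: {nibble}")
        digits.append(str(value))
    return int("".join(digits))


if __name__ == "__main__":
    encoded = to_bcd(42)
    print(encoded)            # ['0100', '0010']
    print(from_bcd(encoded))  # 42
```

Because each digit occupies exactly four bits, a model that outputs BCD only has to predict four binary values per digit, however large the encoded number is.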
Similar Papers
Adaptive and Efficient Log Parsing as a Cloud Service
Software Engineering
Cleans up computer messages 840% faster.
System Log Parsing with Large Language Models: A Review
Machine Learning (CS)
Helps computers understand computer error messages better.
Logics-Parsing Technical Report
CV and Pattern Recognition
Reads messy documents, like newspapers, perfectly.