Optimized Log Parsing with Syntactic Modifications

Published: October 30, 2025 | arXiv ID: 2510.26793v1

By: Nafid Enan, Gias Uddin

Potential Business Impact:

Makes computer logs easier to understand.

Business Areas:
Natural Language Processing, Artificial Intelligence, Data and Analytics, Software

Logs provide valuable insights into system runtime and assist in software development and maintenance. Log parsing, which converts semi-structured log data into structured log data, is often the first step in automated log analysis. Given the wide range of log parsers utilizing diverse techniques, it is essential to evaluate them to understand their characteristics and performance. In this paper, we conduct a comprehensive empirical study comparing syntax-based and semantic-based log parsers, as well as single-phase and two-phase parsing architectures. Our experiments reveal that semantic-based methods are better at identifying the correct templates, whereas syntax-based log parsers are 10 to 1,000 times more efficient and provide better grouping accuracy, although they fall short in accurate template identification. Moreover, the two-phase architecture consistently improves accuracy compared to the single-phase architecture. Based on the findings of this study, we propose SynLog+, a template identification module that acts as the second phase in a two-phase log parsing architecture. SynLog+ improves the parsing accuracy of syntax-based and semantic-based log parsers by 236% and 20% on average, respectively, with virtually no additional runtime cost.
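
To make the two-phase idea concrete, here is a minimal Python sketch of a generic two-phase parser. It is not SynLog+ or any parser evaluated in the paper: the grouping key (token count plus first token), the `<*>` wildcard, and the function names `group_logs`, `identify_template`, and `parse` are all assumptions chosen for illustration. Phase one groups messages by a cheap syntactic key; phase two derives one template per group by keeping tokens that are constant across the group and replacing the rest with a wildcard.

```python
from collections import defaultdict

WILDCARD = "<*>"  # assumed placeholder for variable parts of a message


def group_logs(lines):
    """Phase 1 (hypothetical syntactic grouping): bucket messages
    by token count and first token."""
    groups = defaultdict(list)
    for line in lines:
        tokens = line.split()
        if not tokens:
            continue  # skip blank lines
        groups[(len(tokens), tokens[0])].append(tokens)
    return groups


def identify_template(token_lists):
    """Phase 2 (hypothetical template identification): keep a token if it
    is identical in every message of the group, otherwise use a wildcard."""
    template = []
    for column in zip(*token_lists):
        template.append(column[0] if len(set(column)) == 1 else WILDCARD)
    return " ".join(template)


def parse(lines):
    """Run both phases and return the sorted list of templates."""
    return sorted(identify_template(g) for g in group_logs(lines).values())


logs = [
    "Connected to 10.0.0.1 port 22",
    "Connected to 10.0.0.7 port 8080",
    "Session 41 closed",
    "Session 97 closed",
]
templates = parse(logs)
print(templates)
```

Separating the phases means the template step can be swapped out without touching the grouping step, which is the property the abstract relies on when it describes SynLog+ as a second-phase module. A real parser would need far more careful handling of delimiters, variable-length messages, and groups containing a single message.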

Country of Origin
🇨🇦 Canada

Page Count
22 pages

Category
Computer Science: Software Engineering