Interleaved Learning and Exploration: A Self-Adaptive Fuzz Testing Framework for MLIR
By: Zeyu Sun, Jingjing Liang, Weiyi Wang and more
Potential Business Impact:
Finds hidden computer code mistakes faster.
MLIR (Multi-Level Intermediate Representation) has rapidly become a foundational technology for modern compiler frameworks, enabling extensibility across diverse domains. However, ensuring the correctness and robustness of MLIR itself remains challenging. Existing fuzzing approaches, which rely on manually crafted templates or rule-based mutations, struggle to generate sufficiently diverse and semantically valid test cases, making it difficult to expose subtle or deep-seated bugs within MLIR's complex and evolving code space. In this paper, we present FLEX, a novel self-adaptive fuzzing framework for MLIR. FLEX leverages neural networks for program generation, a perturbed sampling strategy to encourage diversity, and a feedback-driven augmentation loop that iteratively improves its model using both crashing and non-crashing test cases. Starting from a limited seed corpus, FLEX progressively learns valid syntax and semantics and autonomously produces high-quality test inputs. We evaluate FLEX on the upstream MLIR compiler against four state-of-the-art fuzzers. In a 30-day campaign, FLEX discovers 80 previously unknown bugs, including multiple new root causes and parser bugs. In 24-hour fixed-revision comparisons, it detects 53 bugs (over 3.5x as many as the best baseline) and achieves 28.2% code coverage, outperforming the next-best tool by 42%. Ablation studies further confirm the critical role of both perturbed generation and diversity augmentation in FLEX's effectiveness.
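The abstract does not say how the perturbed sampling strategy is implemented. As a rough illustration of the general idea only, the sketch below adds random noise to a model's next-token scores before sampling, so that repeated generations drift away from the single most likely program. The function name `perturbed_sample`, the Gaussian noise, and the `noise_scale` and `temperature` parameters are all assumptions made for this example and are not taken from FLEX.

```python
import math
import random


def perturbed_sample(logits, noise_scale=0.5, temperature=1.0, rng=None):
    """Pick one token index from noise-perturbed logits.

    Hypothetical sketch: Gaussian noise on the logits is an assumption,
    not the perturbation scheme described by the FLEX authors.
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    rng = rng or random.Random()

    # Perturb each score, then apply temperature scaling.
    noisy = [(score + rng.gauss(0.0, noise_scale)) / temperature
             for score in logits]

    # Numerically stable softmax weights (subtract the maximum first).
    peak = max(noisy)
    weights = [math.exp(value - peak) for value in noisy]

    # Draw one index in proportion to the perturbed weights.
    return rng.choices(range(len(logits)), weights=weights, k=1)[0]
```

With `noise_scale=0.0` this reduces to ordinary temperature sampling. Larger values spread probability mass onto tokens the model ranks lower, which is one plausible way to trade validity for diversity.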
Similar Papers
Hybrid Fuzzing with LLM-Guided Input Mutation and Semantic Feedback
Cryptography and Security
Finds computer bugs faster using smart guessing.
Semantic-Aware Fuzzing: An Empirical Framework for LLM-Guided, Reasoning-Driven Input Mutation
Software Engineering
Finds hidden bugs in smart devices.
Targeted Testing of Compiler Optimizations via Grammar-Level Composition Styles
Software Engineering
Finds hidden computer code errors by testing one part at a time.