ArchXBench: A Complex Digital Systems Benchmark Suite for LLM Driven RTL Synthesis
By: Suresh Purini, Siddhant Garg, Mudit Gaur and more
Potential Business Impact:
AI designs complex computer chips automatically.
Modern SoC datapaths include deeply pipelined, domain-specific accelerators, but their RTL implementation and verification are still mostly done by hand. While large language models (LLMs) exhibit advanced code-generation abilities for programming languages like Python, their application to Verilog-like RTL remains in its nascent stage. This is reflected in the simple arithmetic and control circuits currently used to evaluate generative capabilities in existing benchmarks. In this paper, we introduce ArchXBench, a six-level benchmark suite that encompasses complex arithmetic circuits and other advanced digital subsystems drawn from domains such as cryptography, image processing, machine learning, and signal processing. Architecturally, some of these designs are purely combinational, others are multi-cycle or pipelined, and many require hierarchical composition of modules. For each benchmark, we provide a problem description, design specification, and testbench, enabling rapid research in the area of LLM-driven agentic approaches for complex digital systems design. Using zero-shot prompting with Claude Sonnet 4, GPT 4.1, o4-mini-high, and DeepSeek R1 under a pass@5 criterion, we observed that o4-mini-high successfully solves the largest number of benchmarks, 16 out of 30, spanning Levels 1, 2, and 3. From Level 4 onward, however, all models consistently fail, highlighting a clear gap in the capabilities of current state-of-the-art LLMs and prompting/agentic approaches.
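The abstract does not spell out how the pass@5 criterion is computed. A minimal sketch of one common reading follows, assuming a benchmark counts as solved when at least one of five generated RTL samples passes its testbench. The function names and the sample data are hypothetical and not taken from the paper. The second function is the widely used unbiased pass@k estimator from code-generation evaluation, included for comparison only.

```python
from math import comb


def solved_at_k(sample_results, k=5):
    """Return True if any of the first k samples passed the testbench.

    sample_results: list of booleans, one per generated sample
    (assumed layout, not specified in the abstract).
    """
    return any(sample_results[:k])


def pass_at_k(n, c, k):
    """Unbiased pass@k estimate from n samples, c of which passed.

    Computes 1 - C(n - c, k) / C(n, k): the probability that a random
    size-k subset of the n samples contains at least one passing sample.
    """
    if n - c < k:
        # Fewer than k failures exist, so every size-k subset has a pass.
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)


def count_solved(results_by_benchmark, k=5):
    """Count benchmarks with at least one passing sample among the first k."""
    return sum(solved_at_k(r, k) for r in results_by_benchmark.values())


if __name__ == "__main__":
    # Hypothetical outcomes for three benchmarks, five samples each.
    demo = {
        "adder": [False, True, False, False, False],
        "fir_filter": [False, False, False, False, False],
        "aes_round": [True, True, False, True, False],
    }
    print(count_solved(demo), "of", len(demo), "benchmarks solved")
```

With exactly five samples per benchmark, the two functions agree on whether a benchmark is solved: `pass_at_k(5, c, 5)` is 1.0 when `c >= 1` and 0.0 when `c == 0`.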
Similar Papers
Automating Hardware Design and Verification from Architectural Papers via a Neural-Symbolic Graph Framework
Computation and Language
Builds computer chips from research papers.
QuArch: A Benchmark for Evaluating LLM Reasoning in Computer Architecture
Hardware Architecture
Tests AI's smarts about how computers work.
Platform-Agnostic Modular Architecture for Quantum Benchmarking
Quantum Physics
Makes different quantum computers work together better.