ForgeHLS: A Large-Scale, Open-Source Dataset for High-Level Synthesis
By: Zedong Peng, Zeju Li, Mingzhe Gao, and more
Potential Business Impact:
Creates better computer chips from code, faster.
High-Level Synthesis (HLS) plays a crucial role in modern hardware design by transforming high-level code into optimized hardware implementations. However, progress in applying machine learning (ML) to HLS optimization has been hindered by a shortage of sufficiently large and diverse datasets. To bridge this gap, we introduce ForgeHLS, a large-scale, open-source dataset explicitly designed for ML-driven HLS research. ForgeHLS comprises over 400k diverse designs generated from 846 kernels covering a broad range of application domains, consuming over 200k CPU hours during dataset construction. Each kernel includes systematically automated pragma insertions (loop unrolling, pipelining, array partitioning), combined with extensive design space exploration using Bayesian optimization. Compared to existing datasets, ForgeHLS significantly enhances scale, diversity, and design coverage. We further define and evaluate representative downstream tasks in Quality of Result (QoR) prediction and automated pragma exploration, clearly demonstrating ForgeHLS's utility for developing and improving ML-based HLS optimization methodologies. The dataset and code are publicly available at https://github.com/zedong-peng/ForgeHLS.
Similar Papers
HLStrans: Dataset for LLM-Driven C-to-HLS Hardware Code Synthesis
Hardware Architecture
Helps computers design faster, better chips from code.
ForgeBench: A Machine Learning Benchmark Suite and Auto-Generation Framework for Next-Generation HLS Tools
Hardware Architecture
Makes computer chips for AI design faster.