MHRC-Bench: A Multilingual Hardware Repository-Level Code Completion Benchmark
By: Qingyun Zou, Jiahao Cui, Nuo Chen, and more
Potential Business Impact:
Helps computers write code for computer chips.
Large language models (LLMs) have achieved strong performance on code completion tasks in general-purpose programming languages. However, existing repository-level code completion benchmarks focus almost exclusively on software code and largely overlook hardware description languages. In this work, we present MHRC-Bench, consisting of MHRC-Bench-Train and MHRC-Bench-Eval, the first benchmark designed for multilingual hardware code completion at the repository level. Our benchmark targets completion tasks and covers three major hardware design coding styles. Each completion target is annotated with code-structure-level and hardware-oriented semantic labels derived from concrete syntax tree analysis. We conduct a comprehensive evaluation of models on MHRC-Bench-Eval; the results and analysis demonstrate the effectiveness of MHRC-Bench.
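The abstract states that completion targets are labeled via concrete syntax tree (CST) analysis but does not describe the implementation. As a rough illustration of what such labeling could look like, the sketch below walks a toy CST for a small Verilog always block and maps node types to hardware-oriented labels. The CST shape, node types, and label names are all hypothetical and are not taken from MHRC-Bench.

```python
# Minimal sketch: attach hardware-oriented semantic labels to CST nodes.
# The CST layout, node types, and labels below are hypothetical; they only
# illustrate the general idea of CST-based annotation, not the paper's scheme.

# Toy CST for a small Verilog fragment:
#   always @(posedge clk) begin
#       q <= d;
#   end
TOY_CST = {
    "type": "always_construct",
    "children": [
        {"type": "event_control", "text": "@(posedge clk)", "children": []},
        {
            "type": "seq_block",
            "children": [
                {"type": "nonblocking_assignment", "text": "q <= d;", "children": []},
            ],
        },
    ],
}

# Hypothetical mapping from CST node types to hardware-oriented labels.
NODE_TYPE_TO_LABEL = {
    "always_construct": "sequential_logic",
    "event_control": "clock_sensitivity",
    "nonblocking_assignment": "register_update",
}


def label_nodes(node, path=()):
    """Walk the CST and yield (path, node_type, semantic_label) triples."""
    node_type = node["type"]
    label = NODE_TYPE_TO_LABEL.get(node_type, "structural")
    yield path, node_type, label
    for i, child in enumerate(node.get("children", [])):
        yield from label_nodes(child, path + (i,))


if __name__ == "__main__":
    for path, node_type, label in label_nodes(TOY_CST):
        print(f"{'/'.join(map(str, path)) or '<root>':<8} {node_type:<25} {label}")
```

In practice, the CST would come from a real parser for the hardware language in question rather than a hand-built dictionary; the point is only that node types in the tree can be mapped to semantic categories such as sequential versus combinational logic.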
Similar Papers
MRG-Bench: Evaluating and Exploring the Requirements of Context for Repository-Level Code Generation
Software Engineering
Tests if AI can write code for different languages.
AutoCodeBench: Large Language Models are Automatic Code Benchmark Generators
Computation and Language
Makes computers write code in many languages.
MRCEval: A Comprehensive, Challenging and Accessible Machine Reading Comprehension Benchmark
Computation and Language
Tests how well computers understand what they read.