MARVEL: An End-to-End Framework for Generating Model-Class Aware Custom RISC-V Extensions for Lightweight AI
By: Ajay Kumar M, Cian O'Mahoney, Pedro Kreutz Werle, and more
Potential Business Impact:
Makes smart devices run AI faster, using less power.
Deploying deep neural networks (DNNs) on resource-constrained IoT devices remains a challenging problem, often requiring hardware modifications tailored to individual AI models. Existing accelerator-generation tools, such as AMD's FINN, do not adequately address the extreme resource limitations faced by IoT endpoints operating in bare-metal environments without an operating system (OS). To overcome these constraints, we propose MARVEL, an automated, end-to-end framework that generates custom RISC-V ISA extensions tailored to specific DNN model classes, with a primary focus on convolutional neural networks (CNNs). The proposed method profiles high-level DNN representations in Python and generates an ISA-extended RISC-V core with associated compiler tools for efficient deployment. The flow leverages (1) Apache TVM for translating high-level Python-based DNN models into optimized C code, (2) Synopsys ASIP Designer for identifying compute-intensive kernels and for modeling and generating a custom RISC-V core, and (3) Xilinx Vivado for FPGA implementation. Beyond a model-class-specific RISC-V core, our approach produces an optimized bare-metal C implementation, eliminating the need for an OS or extensive software dependencies. Unlike conventional deployment pipelines that rely on TensorFlow/PyTorch runtimes, our solution enables seamless execution in highly resource-constrained environments. We evaluated the flow on popular DNN models such as LeNet-5, MobileNetV1, ResNet50, VGG16, MobileNetV2, and DenseNet121, using the Synopsys trv32p3 RISC-V core as a baseline. Results show a 2x speedup in inference and up to a 2x reduction in energy per inference at a 28.23% area overhead when implemented on an AMD Zynq UltraScale+ ZCU104 FPGA platform.
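To make the flow concrete, below is a minimal bare-metal C sketch, not the authors' generated code, of how a compute-intensive CNN inner loop of the kind TVM emits and ASIP Designer flags could invoke a hypothetical multiply-accumulate instruction placed in the RISC-V custom-0 opcode space. The opcode fields, the assumed semantics, and the macc()/dot_int8() names are illustrative assumptions; the paper's actual extension encoding is not given here.

```c
/*
 * Minimal bare-metal sketch (illustrative, not the authors' generated
 * code): a CNN dot-product loop mapped onto a hypothetical fused
 * multiply-accumulate instruction in the RISC-V custom-0 opcode space.
 */
#include <stdint.h>

/* Emit an R-type custom instruction with the GNU assembler's .insn
 * directive: major opcode 0x0B (custom-0), funct3 = 0, funct7 = 0.
 * Assumed semantics: rd += rs1 * rs2 (32-bit multiply-accumulate). */
static inline int32_t macc(int32_t acc, int32_t a, int32_t b)
{
    __asm__ volatile(".insn r 0x0B, 0x0, 0x0, %0, %1, %2"
                     : "+r"(acc)
                     : "r"(a), "r"(b));
    return acc;
}

/* Inner loop of a convolution: the kind of compute-intensive kernel
 * that profiling flags in CNN workloads. Runs without an OS or any
 * framework runtime. */
int32_t dot_int8(const int8_t *x, const int8_t *w, int n)
{
    int32_t acc = 0;
    for (int i = 0; i < n; i++)
        acc = macc(acc, (int32_t)x[i], (int32_t)w[i]);
    return acc;
}
```

In the flow described above, such loops would not be hand-written: ASIP Designer generates both the extended core and the compiler support, so the TVM-emitted C is mapped onto the new instructions by the generated toolchain.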
Similar Papers
MaRVIn: A Cross-Layer Mixed-Precision RISC-V Framework for DNN Inference, from ISA Extension to Hardware Acceleration
Machine Learning (CS)
Makes smart computers run faster and use less power.
Bare-Metal RISC-V + NVDLA SoC for Efficient Deep Learning Inference
Hardware Architecture
Makes smart devices run AI much faster.
FPGA-Accelerated RISC-V ISA Extensions for Efficient Neural Network Inference on Edge Devices
Hardware Architecture
Makes smart devices run faster and use less power.