Unlimited Vector Processing for Wireless Baseband Based on RISC-V Extension
By: Limin Jiang, Yi Shi, Yihao Shen and more
Potential Business Impact:
Speeds up the signal-processing math inside phone and wireless chips.
Wireless baseband processing (WBP) is a natural fit for vector processing, whose parallel structure excels at data-parallel operations. However, conventional vector architectures are constrained by limited vector register sizes, reliance on power-of-two vector length multipliers, and vector permutation capabilities tied to specific architectures. To address these constraints, we introduce unlimited vector processing (UVP), an instruction set extension (ISE) based on RISC-V that improves both the flexibility and efficiency of vector computations. UVP employs a novel programming model that supports non-power-of-two register groupings and hardware strip-mining, allowing vectors of varying lengths to be handled directly while reducing the software strip-mining burden. Vector instructions are categorized into symmetric and asymmetric classes, complemented by specialized load/store strategies to optimize execution. We also present a hardware implementation of UVP featuring hazard detection mechanisms, optimized pipelines for symmetric operations such as fixed-point multiplication and division, and a permutation engine for asymmetric operations. Evaluations show that UVP achieves up to 3.0$\times$ and 2.1$\times$ speedups on matrix multiplication and fast Fourier transform (FFT) tasks, respectively, compared with lane-based vector architectures. Our synthesized RTL for a 16-lane configuration in SMIC 40nm technology occupies 0.94 mm$^2$ and achieves an area efficiency of 21.2 GOPS/mm$^2$.
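For readers unfamiliar with the term, the software strip-mining burden mentioned in the abstract can be pictured with a minimal sketch. It models the conventional loop in which software splits a long vector into chunks no larger than the hardware's maximum vector length; it does not model UVP itself, whose instruction encoding the abstract does not describe. The function name `strip_mined_add` and the parameter `vlmax` are hypothetical, chosen only for illustration.

```python
def strip_mined_add(a, b, vlmax=16):
    """Element-wise add of two equal-length lists using software strip-mining.

    `vlmax` is an assumed hardware limit on elements per vector operation.
    Returns the result and the list of per-strip lengths that were issued.
    """
    if len(a) != len(b):
        raise ValueError("operands must have the same length")

    n = len(a)
    out = [0] * n
    strips = []
    i = 0
    while i < n:
        # Software picks the strip length each iteration; this bookkeeping
        # is the overhead that hardware strip-mining aims to remove.
        vl = min(vlmax, n - i)
        # One "vector instruction" worth of work on elements i .. i+vl-1.
        out[i:i + vl] = [x + y for x, y in zip(a[i:i + vl], b[i:i + vl])]
        strips.append(vl)
        i += vl
    return out, strips


if __name__ == "__main__":
    data = list(range(100))  # a non-power-of-two vector length
    result, strips = strip_mined_add(data, data, vlmax=16)
    print(strips)
```

With a 100-element vector and an assumed limit of 16 elements, the loop issues six full strips and one short strip of four elements. Under hardware strip-mining as the abstract describes it, this loop control would move out of software, though the exact mechanism is not specified here.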
Similar Papers
Efficient Implementation of RISC-V Vector Permutation Instructions
Hardware Architecture
Speeds up computer math and secret codes.
Retrofitting Control Flow Graphs in LLVM IR for Auto Vectorization
Programming Languages
Makes computer programs run much faster.
Flexing RISC-V Instruction Subset Processors (RISPs) to Extreme Edge
Hardware Architecture
Makes tiny computer chips for new gadgets.