A Magnified View into Heterogeneous-ISA Thread Migration Performance without State Transformation
By: Nikolaos Mavrogeorgis, Christos Vasiladiotis, Pei Mu and more
Heterogeneous-ISA processor designs have attracted considerable research interest. However, unlike their homogeneous-ISA counterparts, they require explicit software support for bridging ISA heterogeneity. The lack of a compilation toolchain ready to support heterogeneous-ISA targets has been a major factor hindering research in this exciting emerging area. For any such compiler, "getting right" the mechanics involved in state transformation upon migration, and doing so efficiently, is of critical importance. In particular, any runtime conversion of the current program stack from one architecture to another would be prohibitively expensive. In this paper, we design and develop Unifico, a new multi-ISA compiler that generates binaries that maintain the same stack layout during their execution on either architecture. Unifico avoids the need for runtime stack transformation, thus eliminating overheads associated with ISA migration. Additional responsibilities of the Unifico compiler backend include maintenance of a uniform ABI and virtual address space across ISAs. Unifico is implemented using the LLVM compiler infrastructure, and we currently target the x86-64 and ARMv8 ISAs. We have evaluated Unifico across a range of compute-intensive NAS benchmarks and show that its impact on overall execution time is minimal: on average, it introduces less than 6% overhead on high-end processors and less than 10% on low-end processors. We also analyze the performance impact of Unifico's key design features and demonstrate that they can be further optimized to mitigate this impact. When compared against the state-of-the-art Popcorn compiler, Unifico reduces binary size overhead from ~200% to ~10%, whilst eliminating the stack transformation overhead during ISA migration.
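To make the central idea concrete, here is a toy Python model of why a shared stack layout removes the need for runtime stack transformation. It is only an illustrative sketch, not Unifico's implementation: the frame contents, the slot offsets, and the names `LAYOUT_X86`, `LAYOUT_ARM`, `LAYOUT_UNIFORM`, `write_frame`, `read_frame` and `transform_frame` are all hypothetical.

```python
import struct

SLOT = "<q"        # assumption: every variable is one 8-byte little-endian slot
FRAME_SIZE = 24    # assumption: a frame holding three such slots

def write_frame(layout, values):
    """Store each variable at the byte offset its layout assigns to it."""
    frame = bytearray(FRAME_SIZE)
    for name, value in values.items():
        struct.pack_into(SLOT, frame, layout[name], value)
    return frame

def read_frame(layout, frame):
    """Interpret raw frame bytes according to a layout."""
    return {name: struct.unpack_from(SLOT, frame, offset)[0]
            for name, offset in layout.items()}

def transform_frame(src_layout, dst_layout, frame):
    """Per-slot rewrite that is needed when the two layouts differ."""
    return write_frame(dst_layout, read_frame(src_layout, frame))

# Hypothetical per-ISA layouts that disagree on where each variable lives.
LAYOUT_X86 = {"i": 0, "total": 8, "ptr": 16}
LAYOUT_ARM = {"ptr": 0, "i": 8, "total": 16}
# Hypothetical uniform layout used by the binaries for both ISAs.
LAYOUT_UNIFORM = {"i": 0, "total": 8, "ptr": 16}

values = {"i": 3, "total": 42, "ptr": 0x1000}

# Differing layouts: the raw bytes are misread on the other ISA,
# so every frame must be rewritten slot by slot at migration time.
frame = write_frame(LAYOUT_X86, values)
misread = read_frame(LAYOUT_ARM, frame)
converted = transform_frame(LAYOUT_X86, LAYOUT_ARM, frame)

# Uniform layout: the same bytes are valid on both sides, so no rewrite is needed.
uniform_frame = write_frame(LAYOUT_UNIFORM, values)
after_migration = read_frame(LAYOUT_UNIFORM, uniform_frame)
```

In this model the cost of `transform_frame` grows with the number of live slots across all frames, whereas the uniform case does no per-slot work at migration time. The model also leaves pointer values unchanged, which is only sound under the uniform virtual address space that the abstract describes.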