Improving a Parallel C++ Intel AVX-512 SIMD Linear Genetic Programming Interpreter
By: William B. Langdon
Potential Business Impact:
Makes computer programs run much faster.
We extend recent 128-bit SSE vector work to 512-bit AVX-512, giving a four-fold speedup. We use MAGPIE (Machine Automated General Performance Improvement via Evolution of software) to speed up a C++ linear genetic programming interpreter. Local search is provided with three alternative hand-optimised codes, the revision history, and the Intel 512-bit AVX-512VL documentation as C++ XML. MAGPIE is applied to the new Single Instruction Multiple Data (SIMD) parallel interpreter for Peter Nordin's linear genetic programming system GPengine. Linux mprotect provides sandboxing, whilst performance is measured by the perf instruction count. In both cases, within a matter of hours, local search reliably sped up the 114 or 310 lines of manually written parallel SIMD code for the Intel Advanced Vector Extensions (AVX-512) by 2 percent.
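To make the setting concrete, the C++ sketch below illustrates the kind of hand-written AVX-512 SIMD kernel that such local search would tune: one linear GP instruction is evaluated across 16 single-precision fitness cases at once using Intel intrinsics. This is an assumed, minimal example, not the paper's GPengine or MAGPIE code; the instruction format, register layout and protected-division threshold are inventions for illustration. It compiles with, for example, g++ -O3 -mavx512f.

// Illustrative sketch only (not the paper's code): one linear GP instruction
// evaluated over 16 parallel fitness cases with AVX-512 intrinsics.
#include <immintrin.h>

enum Op { ADD, SUB, MUL, DIV };

struct Instruction { Op op; int dst, src1, src2; };   // assumed encoding

// regs[r] holds 16 copies of virtual register r, one lane per fitness case.
void eval(const Instruction* prog, int len, __m512 regs[]) {
    for (int i = 0; i < len; ++i) {
        const Instruction& ins = prog[i];
        __m512 a = regs[ins.src1];
        __m512 b = regs[ins.src2];
        __m512 r = a;
        switch (ins.op) {
            case ADD: r = _mm512_add_ps(a, b); break;
            case SUB: r = _mm512_sub_ps(a, b); break;
            case MUL: r = _mm512_mul_ps(a, b); break;
            case DIV: {
                // Protected division: lanes where |b| is (near) zero keep a.
                __mmask16 ok = _mm512_cmp_ps_mask(_mm512_abs_ps(b),
                                                  _mm512_set1_ps(1e-30f),
                                                  _CMP_GT_OQ);
                r = _mm512_mask_div_ps(a, ok, a, b);
                break;
            }
        }
        regs[ins.dst] = r;
    }
}

A kernel like this has many small, semantically equivalent variants (masked versus blended operations, different lane layouts, loop structure), which is what makes it a plausible target for MAGPIE-style local search guided by an instruction-count measure such as perf.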
Similar Papers
Retrofitting Control Flow Graphs in LLVM IR for Auto Vectorization
Programming Languages
Makes computer programs run much faster.
Hardware-Aware Neural Network Compilation with Learned Optimization: A RISC-V Accelerator Approach
Hardware Architecture
Makes computer chips run faster and use less power.
High-Performance and Power-Efficient Emulation of Matrix Multiplication using INT8 Matrix Engines
Distributed, Parallel, and Cluster Computing
Makes AI learn faster and use less power.