Variable-sized Blocks for Locality-aware SpMV [conference paper]

Maroon and gold lines with circles mimicking a circuit board

Conference

International Symposium on Code Generation and Optimization (CGO) - February 27 - March 3, 2021

Authors

Naveen Namashivayam, Sanyam Mehta, Pen-Chung Yew (professor)

Abstract

Blocking is an important optimization option available to mitigate the data movement overhead and improve the temporal locality in SpMV, a sparse BLAS kernel with irregular memory reference pattern. In this work, we propose an analytical model to determine the effective block size for highly irregular sparse matrices by factoring the distribution of non-zeros in the sparse dataset. As a result, the blocks generated by our scheme are variable-sized as opposed to constant-sized in most existing SpMV algorithms. We demonstrate our blocking scheme using Compressed Vector Blocks (CVB), a new column-based blocked data format, on Intel Xeon Skylake-X multicore processor. We evaluated the performance of CVB-based SpMV with variable-sized blocks using extensive set of matrices from Stanford Network Analysis Platform (SNAP). Our evaluation shows a speedup of up to 2.62X (with an average of 1.73X) and 2.02X (with an average of 1.18X) over the highly vendor tuned SpMV implementation in Intel's Math Kernel Library (MKL) on single and multiple Intel Xeon cores respectively.

Link to full paper

Variable-Sized Blocks for Locality-Aware SpMV

Keywords

compiler optimization