Scalable and efficient tools for multi-level tiling


In the era of many-core systems, application performance will come from parallelism and data locality. Exploiting both effectively requires explicit (re)structuring of applications. Multi-level (or hierarchical) tiling is one such structuring technique, used in almost all high-performance implementations. Lack of tool support has limited its use to program optimization experts. We present solutions to two fundamental problems in multi-level tiling, viz., optimal tile size selection and parameterized tiled loop generation. Our solutions provide scalable and efficient tools for multi-level tiling.

Parameterized tiled code refers to tiled loops in which the tile sizes are not fixed compile-time constants but are left as symbolic parameters. This enables tile sizes to be selected and adapted at any stage from compilation through run time. We introduce two polyhedral sets, viz., the inset and the outset, and use them to develop a variety of scalable and efficient multi-level tiled loop generation algorithms. Generation efficiency and code quality are demonstrated on a variety of benchmarks, such as stencil computations and matrix subroutines from BLAS. Our technique can generate tiled loop nests with parameterized, fixed, or mixed tile sizes, thereby providing a one-size-fits-all solution ideal for inclusion in production compilers.

Optimal tile size selection (TSS) refers to the selection of tile sizes that optimize a cost model (e.g., a model of execution time). We show that these cost models share a fundamental mathematical property, viz., positivity, that allows us to reduce optimal TSS to convex optimization problems. Almost all TSS models proposed in the literature for parallelism, caches, and registers lend themselves to this reduction. We present the reduction of five TSS models proposed by different authors in a variety of tiling contexts. Our convex-optimization-based TSS framework is the first to provide a solution that is both efficient and scalable to multiple levels of tiling.
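
To make the notion of parameterized tiled code concrete, the following is a minimal sketch for a single level of tiling over a rectangular n x m iteration space. It is an illustration only, not the generation algorithm developed in this work: the function name tiled_visit_order and the parameters ti and tj are hypothetical, and the rectangular space is an assumption chosen for simplicity. The tile sizes are ordinary run-time arguments rather than constants baked into the loop bounds.

```python
def tiled_visit_order(n, m, ti, tj):
    """Return the points of an n x m iteration space in tiled order.

    ti and tj are tile sizes passed at run time (hypothetical names),
    so the same loop nest serves any choice of tile sizes.
    """
    order = []
    # Tile loops: step through tile origins using the symbolic tile sizes.
    for it in range(0, n, ti):
        for jt in range(0, m, tj):
            # Point loops: min() clamps partial tiles at the boundary.
            for i in range(it, min(it + ti, n)):
                for j in range(jt, min(jt + tj, m)):
                    order.append((i, j))
    return order


if __name__ == "__main__":
    # The same code runs with different tile sizes; no regeneration needed.
    print(tiled_visit_order(4, 4, 2, 2)[:4])
    print(len(tiled_visit_order(5, 7, 2, 3)))
```

Every point is visited exactly once regardless of the tile sizes chosen; only the visiting order changes, which is what affects locality. Non-rectangular iteration spaces and multiple tiling levels need more careful bounds than the min() clamps used here.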


code generation
compiler optimization
convex optimization
loop tiling
computer science


