Automatic parallelization of "inherently" sequential nested loop programs
Most automatic parallelizers are based on detection of independent operations, and most of them cannot do anything if there is a true dependence between operations. However, there exists a class of programs for which this can be surmounted based on the nature of the operations. The standard and obvious cases are reductions and scans, which normally occur within loops. Existing work that deals with complicated reductions and scans normally focuses on the formalism, not the implementation. To help eliminate the gap between the formalism and implementation, we present a method for automatically ...
(For more, see "View full record.")