Contents Next

1   Introduction

1.1   Motivation for Scan Detection

Most automatic parallelizers, whether industrial or academic, are based on the detection of independent operations. Operations are independent if they do not share modified data, or, in other words, if they have no memory conflicts. Detection of memory conflicts is mainly a syntactical process, in which the actual data transformations are ignored. There is another source of parallelism, which relies on semantical information, namely the detection of reductions. A reduction is the application of an associative binary operator to a vector of values, the result being one scalar. Reductions are quite frequent in scientific codes and are implemented efficiently on most parallel computers. Scans are similar to reductions, but the intermediate results are kept, thus giving a vector. Since there is no single rule for detecting associative operators, scan detection must use pattern-matching, and pattern-matching must be preceded by a normalization phase. Our method is based on a representation of the program as a system of recurrence equations. This allows the detection of reductions in loops nests of arbitrary depth and on multi-dimensional arrays. Scans and reductions are represented in symbolic form with the help of the Scan operator. When scans and reductions have been detected, one may use the results to construct more efficient parallel programs. While partial solutions are known [18], the design of a universal method is beyond the scope of this paper.

1.2   The Dataflow Graph

In the area of automatic parallelization a very useful structure is the DataFlow Graph (DFG). This structure is more accurate than the classical Dependence Graph (DG) which describes the pairs of statements having memory conflicts. Some memory conflicts are due to memory reuse. They can be satisfied either by enforcing sequential execution, or by expanding the data space of the program. In the DFG, all dependences which can be satisfied by data expansion have been removed. Hence, the DFG has the potential to expose more parallelism than the ordinary Dependence Graph.

The DFG gives, for each reference to a scalar or array element in an operation, the source operation, i.e. the operation that defined the value of the scalar or array element we are interested in.
Consider, for example, the following code:
DO i=1,n
 s(i)=a(i)*b(i)     (i1)
END DO
s(0)=0              (i2)
DO i=1,n
 s(i)=s(i-1)+s(i)   (i3)
END DO

 
Its Dataflow Graph as given by our automatic analysis tool, PAF, is:
Source of s(i-1) in i3 ;
   * if [ i-2>=0 ] then (i3,i-1)
   * if [ i-1=0 ] then (i2)
Source of s(i) in i3 ;
   * if [ true ] then (i1,i)

 
Note that the source of s(i-1) in i3 is a conditional expression. Such expressions occur frequently in Dataflow Analysis and are called quasts (Quasi-Affine Selection Trees) in what follows.
The results of Array Dataflow Analysis can be presented in many ways, all equivalent up to syntactical embellishments: source functions, a single assignment code, or a system of affine recurrence equations (SARE). The later presentation is the one that fits best with our aims in this paper; we will use the Alpha notation to represent SAREs [13]. The important point here is not notation, but the fact that an imperative program can be mechanically converted to a SARE provided it meets the following constraints:
The DFG of our example program can be expressed as the following SARE:
    i1[i] =  1<=i<=n  : a[i] * b[i] ;

    i3[i] = case
               i=0  : 0 ;
               1<=i<=n  : i3[i-1] + i1[i] ;
            esac

 
In this notation, conditions such as { 1<=i<=n } define iterations domains, while the subscripts expressions like i-1 are shorthand notations for dependence functions, in this case l i. (i-1).
The reason for using SAREs as the basis of our work is that SAREs are referentially transparent, which means that a variable can be freely replaced by its value as given by the right hand side of the equation that defines it. This is the main tool in our reduction detection method.

Let us emphasize the fact that the DFG and SARE are equivalent representations. SAREs are more self-contained and are easier to handle in an automatic system, while DFGs are susceptible of an intuitive graphical representation. In this paper, we will freely shift from one representation to the other according to the needs of the exposition.


Contents Next