Previous Contents Next

3   Description of the Problem

3.1   General Remarks

The problem we study in this paper is the evaluation of the communication volume in a HPF program before compilation. The goal is to help the programmer understand the communications generated by his program and find a good data distribution (or a more efficient way to code his algorithm).

The communications we consider are at the PROCESSORS level: as in HPF, we use an abstract target machine. We stay at the language level, thus allowing to find a data distribution that is well suited to the problem, and thus retaining portability. Our aim is not to find the best data distribution for a given machine, but a good one for any machine (and compiler).

In the future, if some compiler optimization techniques such as overlap areas to vectorize communications are used by most compilers, we could adapt our evaluation techniques to these optimizations. In a first step, though, we remain at the language level to validate our approach. We believe indeed that a good data distribution at the language level should not be a bad one for any compiler, regardless of the compilation techniques used.

3.2   Modelization

We consider that the only communication generation statements are storage statements, we do not take into account I/O statements. For each of these storage statements, the surrounding loops define an iteration domain that ``shapes'' the communication pattern.

Given the mathematical tools we use in the following, we have to restrict ourselves to loop bounds that define a polyhedron, that is loop bounds defined as extrema of affine functions of the surrounding loop indices and parameters. For the same reason, we restrict the array access functions to affine functions. This class of loops contains most of the linear algebra routines.

To simplify the following discussion, we make the hypotheses below without loss of generality: Given the previous restrictions, a storage statement S is represented by:
W
 
S
(f
 
W
 
S
(I))= f
 
S
(R
 
S
(f
 
R
 
S
(I)))     (1)
where the iteration vector I has value in the iteration domain D S defined by the surrounding loops. W S and R S are arrays and fW S and fR S are affine access functions.

We call operation an instance S(I) of a statement S for a given value I of the iteration vector.

The number of communications generated by a given statement is the number of array elements that need to be communicated for some operation generated by this statement. There is a communication when the array elements being read are not on the same virtual processor than the one the written element is. As array elements can be replicated, we will focus on the template elements.


Previous Contents Next