Extended Description
Created an attachment (id=969)
source code to reproduce fault
When I compile the AWLF unit test matmulAKD.sac (matmul.sac from
the SAC demos library, with AKD array shapes), we end up with
5 WLs instead of 3.
The problem has to do with the presence of the guard primitive
_prod_matches_prod_shape_VxA_ around a reshape() operation, which
is used to ensure that the reshape is a "conforming reshape",
in the sense that it preserves element count.
AWLF depends heavily on being able to get at array shape elements through
scalarization and N_array nodes, and the above guard inhibits this.
The original (matmulAKD.sac) source code generates the array
arguments X and Y this way:
X = reshape( [ rows, cols], iota( rows * cols));
Y = transpose( X);
Compilation with -doawlf results in use of -ecc, thereby producing
these guarded expressions:
_uprf_3739, _uprf_3740 = _non_neg_val_S_( _isaa_4461_size1);
_idc_476 = [ _uprf_3739, _uprf_3739 ];
_idc_477, _icc_474_pred = _prod_matches_prod_shape_VxA_( _idc_476, xx);
_pinl_2634__icc_1518 = _idx_sel_( [1], _idc_477);
_uprf_3728, _uprf_3729 = _non_neg_val_S_( _pinl_2634__icc_1518);
_uprf_9774 = _sub_SxS_( _uprf_3739, _uprf_3728);
Note that the two arguments to the subtract are really the same
value.
We must not, in general, look past the guards to see the N_array _idc_476,
or other things will break, so the _idx_sel_ does not CF out
of existence, as it would if no guard were present.
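To make the "really the same value" observation concrete, here is a small C model of the IR fragment above. It is only a sketch: the function names are hypothetical, the _non_neg_val_S_ guards are omitted, and it assumes (my reading of the dump, not a statement about sac2c internals) that the guard passes its shape-vector argument through unchanged and merely yields a predicate.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Product of the first n entries of v (1 when n == 0). */
static long prod(const int *v, size_t n)
{
    long p = 1;
    for (size_t i = 0; i < n; i++) {
        p *= v[i];
    }
    return p;
}

/* Hypothetical model of _prod_matches_prod_shape_VxA_( shp, arr):
 * copies shp to out unchanged (ASSUMPTION: the guard is an identity
 * on its first argument) and returns the predicate
 * prod(shp) == element count of arr. */
static bool guard_prod_matches(const int *shp, size_t rank,
                               const int *arr_shape, size_t arr_rank,
                               int *out)
{
    assert(rank > 0);
    for (size_t i = 0; i < rank; i++) {
        out[i] = shp[i];
    }
    return prod(shp, rank) == prod(arr_shape, arr_rank);
}

/* Model of the fragment ending in _uprf_9774 = _sub_SxS_( ...). */
static int model_sub(int size1, const int *xx_shape, size_t xx_rank,
                     bool *pred)
{
    int idc_476[2] = { size1, size1 };  /* _idc_476 = [ _uprf_3739, _uprf_3739 ] */
    int idc_477[2];
    *pred = guard_prod_matches(idc_476, 2, xx_shape, xx_rank, idc_477);
    int icc_1518 = idc_477[1];          /* _idx_sel_( [1], _idc_477) */
    return size1 - icc_1518;            /* _sub_SxS_( _uprf_3739, _uprf_3728) */
}
```

Under that assumption the subtraction is zero for every input, whether or not the predicate holds. That is the simplification CF cannot make while the guard hides the N_array.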
All(?) the other guards that appear in commonly generated code are
rank-0, and are scalarized by PRFUNR, but the above guard is
not amenable to such treatment.
I do not have, pending more coffee, any bright ideas on how to resolve this,
but did perform two experiments that make me dislike the SAC reshape()
even more:
1. I eliminated the SAC stdlib reshape() from the picture entirely,
generating (different) X and Y this way:
X = genarray ( [ rows, cols], 0.1);
Y = 1.5 + transpose( X);
That folded very nicely, as desired, although it did generate
Y as: genarray( [cols, rows], 1.6);
This example generated 4 WLs: two for generating X and Y,
and a nested pair of WLs for the actual inner product.
Oddly enough, it also generated an _reshape_VxA_() call on
the array shape for X, but it's not readily apparent where
it came from.
2. I replaced the stdlib reshape() with the APEX reshape() code
shown below. That resulted in 3 WLs, which is as good as
we can expect.
So, what does this tell us, and what can we do?
1. SAC reshape() operations, even though constrained to conforming
reshapes, are evil in two ways: First, as we know, they
inhibit WLF and AWLF on the reshape() result arrays.
Second, which I did not know, they also inhibit other AWLF
operations, because sac2c cannot simplify shape expressions
when the above guard sits in the shape vector data flow.
2. I plan to eliminate _reshape_VxA_() from the AWLF unit tests,
wherever it is used, as above, to generate synthetic array
arguments.
3. I hereby campaign, again, for SAC source code access to WLIDX.
We might adopt a syntax such as this, extending the WL generator:
(. <= IV=[I,J,K]=[,RI] <= .)
Where RI denotes Ravel Index.
The index_generator_utilities.c code already contains functions
to convert among these forms.
Such access would let us write the above synthetic array generator as
a trivial WL, with RI as the result element. It would also
simplify the APEX reshape() function, by eliminating the need for
the mixed-radix base value operation performed by V2O in the following
code:
inline int V2O( int[.] shp, int[.] iv)
{ /* Vector iv to offset into array of shape shp */
  /* See V2O.dws workspace */
  offset = 0;
  wt = 1;
  for( i=shape(shp)[0]-1; i>=0; i--) {
    offset = offset + ( wt * iv[i]);
    wt = wt * shp[i];
  }
  return( offset);
}

inline int[.] O2V( int[.] shp, int offset)
{ /* Offset into array of shape shp to index vector */
  /* See V2O.dws workspace */
  iv = genarray( shape(shp), 1);
  wts = iv;
  for( i=shape(shp)[0]-2; i>=0; i--) {
    wts[i] = wts[i+1] * shp[i+1];
  }
  for( i=shape(shp)[0]-1; i>=0; i--) {
    iv[i] = _mod_SxS_( offset/wts[i], shp[i]);
    offset = offset - (iv[i]*wts[i]);
  }
  return( iv);
}

inline double[*] APEXreshape(int[.] x, double[*] y)
{ /* APEX vector x reshape, with potential item reuse */
  z = with {
        ( . <= iv <= .) {
          offset = V2O( toi( x), iv);
          offset = _mod_SxS_( offset, prod( shape(y)));
          el = y[ O2V( shape( y), offset)];
        } : el;
      } : genarray( toi(x), 0.0d);
  return( z);
}
4. I am going to look further at the _prod_matches_prod_shape_VxA_()
guard, and see if I can come up with a clean way to replace it
by something that can be scalarized.
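For readers who want to check the mixed-radix arithmetic in item 3, the following is a C model of V2O and O2V, plus a flat-buffer version of the reshape. It is a sketch, not a translation of the SAC code: the names v2o, o2v and apex_reshape_flat are hypothetical, arrays are assumed to be stored row-major, and o2v uses repeated divide/modulo rather than the precomputed weights vector of the SAC version. The last function illustrates my understanding of the Ravel Index proposal: once the WL body can see the row-major offset directly, the V2O/O2V round trip reduces to a single modulo.

```c
#include <assert.h>
#include <stddef.h>

/* Index vector iv -> row-major offset into an array of shape shp. */
static int v2o(const int *shp, const int *iv, size_t rank)
{
    int offset = 0;
    int wt = 1;                       /* weight of the current axis */
    for (size_t i = rank; i-- > 0; ) {
        offset += wt * iv[i];
        wt *= shp[i];
    }
    return offset;
}

/* Row-major offset -> index vector iv for an array of shape shp.
 * Every extent in shp must be positive. */
static void o2v(const int *shp, int offset, size_t rank, int *iv)
{
    for (size_t i = rank; i-- > 0; ) {
        assert(shp[i] > 0);
        iv[i] = offset % shp[i];      /* digit for this axis */
        offset /= shp[i];             /* carry to the next axis */
    }
}

/* Reshape with item reuse on flat row-major buffers: result element
 * ri (the ravel index) is taken from y at ri modulo y's element
 * count. ny must be positive. */
static void apex_reshape_flat(const double *y, size_t ny,
                              double *z, size_t nz)
{
    assert(ny > 0);
    for (size_t ri = 0; ri < nz; ri++) {
        z[ri] = y[ri % ny];
    }
}
```

For shape [2,3], index vector [1,2] maps to offset 5 and back again. The synthetic iota argument from the original test would, in the same style, simply store ri as the element value.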