Distributed memory backend does not handle strides correctly
Compiler version: sac2c 2.0.0-Tintigny-9-g5362.

The program below should return 1, but when compiled and run as `sac2c_p -target distmem -trace d -distmem_min_elems 100 stride.sac && mpirun -n 2 ./a.out`, it returns 0.

This is because a non-distributed loop starting at `size_t start = ...` is transformed to start at `max(ShrayStart(x), start)`. In this case, `ShrayStart(x) = 500` and `start = 1` for the second partition. This gives a new start of 500, which is not equal to 1 mod 3, so the loop visits 500, 503, 506, ... and never reaches index 511.
int *(int a, int b)
{
    return _mul_SxS_(a, b);
}

int /(int a, int b)
{
    return _div_SxS_(a, b);
}

int +(int a, int b)
{
    return _add_SxS_(a, b);
}

int -(int a, int b)
{
    return _sub_SxS_(a, b);
}

int main()
{
    n = 1000;
    /* Two strided partitions: indices 0 mod 3 and indices 1 mod 3. */
    x = {[i] -> 2 * i | [0] <= [i] < [n] step [3];
         [i] -> 2 * i - 1 | [1] <= [i] < [n] step [3]};
    /* 511 mod 3 == 1, so x[511] is defined by the second partition. */
    y = _sel_VxA_([511], x);
    equal = _eq_SxS_(y, 2 * 511 - 1);
    return _toi_S_(equal);
}
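
The arithmetic behind the wrong result can be checked outside the compiler. The sketch below is Python rather than SaC, and it assumes that with two nodes the second one owns indices 500 to 999. `clamped_indices` is a hypothetical helper, not part of sac2c; it only mimics clamping the loop start to the local range without realigning it to the stride.

```python
def clamped_indices(start, stop, step, local_start, local_stop):
    """Indices visited when the loop start is only clamped to the local range.

    Mimics the behaviour described in the report: the new start is
    max(local_start, start), with no realignment to the stride.
    """
    new_start = max(local_start, start)
    return range(new_start, min(stop, local_stop), step)


# Assumed layout: 1000 elements over 2 nodes, second node owns [500, 1000).
visited = clamped_indices(start=1, stop=1000, step=3,
                          local_start=500, local_stop=1000)

print(visited[0] % 3)   # 2
print(511 in visited)   # False
```

Every visited index is congruent to 2 modulo 3, so none of the indices the second partition is supposed to define (those congruent to 1 modulo 3) are written on that node.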

For a general index set `[l] <= [i] < [u] step [s] width [w]`, or $S_{slw} = \{l + is + j \mid i \in \mathbb{Z}, \quad 0 \leq j < w\}$, we need to compute $[ShrayStart(x), ShrayEnd(x)) \cap S_{slw}$. If $w \neq 1$, this is in general not again of the form $S_{slw}$, because the local range can begin in the middle of a block of width $w$, which is problematic. I think we should give an error for now. If $w = 1$, then we do have $[ShrayStart(x), ShrayEnd(x)) \cap S_{sl1} = [ShrayStart(x), ShrayEnd(x)) \cap S_{sl'1}$ for $l'$ the smallest integer of the form $l' = l + is$ such that $l' \geq ShrayStart(x)$, which (assuming $ShrayStart(x) \geq l$) is $l' = l + \lceil \frac{ShrayStart(x) - l}{s} \rceil \cdot s$.
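
A minimal sketch of the $w = 1$ case, again in Python and only as an illustration of the formula, not as the actual sac2c change. The names `aligned_start` and `local_indices` are hypothetical, and the guard for a local start at or below `lower` is an added assumption so that the loop never begins before its own lower bound. The last lines show a width 2 example in which the local range cuts a block in half.

```python
def aligned_start(lower, step, local_start):
    """Smallest l' = lower + i*step (i >= 0) with l' >= local_start (width 1)."""
    if local_start <= lower:
        # Assumption: the loop never starts below its own lower bound.
        return lower
    # Integer form of ceil((local_start - lower) / step), valid for step > 0.
    k = -((lower - local_start) // step)
    return lower + k * step


def local_indices(lower, upper, step, local_start, local_stop):
    """Indices lower, lower+step, ... below upper inside [local_start, local_stop)."""
    first = aligned_start(lower, step, local_start)
    return range(first, min(upper, local_stop), step)


# Values from the report: [1] <= [i] < [1000] step [3],
# second node assumed to own [500, 1000).
print(aligned_start(1, 3, 500))                      # 502
print(511 in local_indices(1, 1000, 3, 500, 1000))   # True

# Width 2 example (l = 0, s = 4, w = 2): a local range starting at 5
# begins inside the block {4, 5}, so the result has no single (l', s, w) form.
full = [0 + i * 4 + j for i in range(4) for j in range(2)]
print([k for k in full if k >= 5])                   # [5, 8, 9, 12, 13]
```

With the realigned start 502, the second node visits 502, 505, 508, 511, ..., all congruent to 1 modulo 3, which is what the reproducer expects.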