sac2c issues
https://gitlab.sac-home.org/sac-group/sac2c/-/issues
2017-11-19T20:26:08Z

https://gitlab.sac-home.org/sac-group/sac2c/-/issues/1168
sac2c gets uppity about object filename after Mull
2017-11-19T20:26:08Z
Robert Bernecky

| | |
| --- | --- |
| Bugzilla Link | [1128](http://bugs.sac-home.org/show_bug.cgi?id=1128) |
| Created on | Aug 07, 2014 18:03 |
| Version | svn |
| OS | Linux |
| Architecture | PC |
## Extended Description
<pre>This puzzles me. I have no idea why naming an object file should
affect its linkability. In the following, the only difference
is that one compile specifies the name of the object file, while
the other uses the default value:
sac@rattler:~/sac/testsuite/optimizations/awlf$ rm a.out sumrotateiotaAKSAKD.sac.exe
sac@rattler:~/sac/testsuite/optimizations/awlf$ sac2c -v0 -target seq -doawlf -nowlf -noctz -doscwlf sumrotateiotaAKSAKD.sac
sac@rattler:~/sac/testsuite/optimizations/awlf$ a.out; echo $?
0
sac@rattler:~/sac/testsuite/optimizations/awlf$ sac2c -v0 -target seq -doawlf -nowlf -noctz -doscwlf sumrotateiotaAKSAKD.sac -o sumrotateiotaAKSAKD.sac.exe
sac@rattler:~/sac/testsuite/optimizations/awlf$ ./sumrotateiotaAKSAKD.sac.exe ; echo $?
./sumrotateiotaAKSAKD.sac.exe: error while loading shared libraries: libArrayMod.so: cannot open shared object file: No such file or directory
127
Compiling with "-o a.out " also causes the loader to fail.
sac2c -V
sac2c v1.00-beta (Haggis And Apple)
product rev 18604
(Wed Aug 6 16:04:14 EDT 2014 by sac)</pre>
BugZilla

https://gitlab.sac-home.org/sac-group/sac2c/-/issues/1167
Post-Mull DevCamp oddities
2017-11-19T20:26:05Z
Robert Bernecky

| | |
| --- | --- |
| Bugzilla Link | [1126](http://bugs.sac-home.org/show_bug.cgi?id=1126) |
| Created on | Jul 16, 2014 19:11 |
| Version | svn |
| OS | Linux |
| Architecture | PC |
## Extended Description
<pre>I'm trying to get my development box here back on its wheels, and
have run into a number of anomalies that I'll report here:
1. cd sac2c; make clean:
Removing compilation target: bin/sacprapolyhedral
/bin/bash: bin/sac2c: No such file or directory
./src/makefiles/rtlibs.mkf:23: ** Warning: SAC2CRC=${SAC2CRC:-setup/sac2crc} LD_LIBRARY_PATH=lib${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH} DYLD_LIBRARY_PATH=lib${DYLD_LIBRARY_PATH:+:$DYLD_LIBRARY_PATH} bin/sac2c does not appear to work - cannot determine SBI data, skipping
make[1]: Nothing to be done for `clean'.
/bin/bash: bin/sac2c: No such file or directory
./src/makefiles/rtlibs.mkf:23: ** Warning: SAC2CRC=${SAC2CRC:-setup/sac2crc} LD_LIBRARY_PATH=lib${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH} DYLD_LIBRARY_PATH=lib${DYLD_LIBRARY_PATH:+:$DYLD_LIBRARY_PATH} bin/sac2c does not appear to work - cannot determine SBI data, skipping
make[1]: Nothing to be done for `clean'.
2. cd sac2c; make:
Compiling developer code: annotate_memory_transfers.c
..
cuda/cuda_tag_executionmode.c: In function ‘CUTEMassign’:
cuda/cuda_tag_executionmode.c:597:18: warning: comparison between ‘cudaexecmode_t’ and ‘enum <anonymous>’ [-Wenum-compare]
if( old_mode == CUDA_DEVICE_SINGLE &&
^
Compiling developer code: prepare_forloop_generation.c
Compiling developer code: cuda_create_cells.c
Compiling developer code: minimize_cudast_transfers.c
Compiling developer code: single_thread_kernels.c
Compiling developer code: adjust_stknl_rets.c
cuda/cuda_create_cells.c: In function ‘CUCCassign’:
cuda/cuda_create_cells.c:90:34: warning: comparison between ‘mtexecmode_t’ and ‘enum <anonymous>’ [-Wenum-compare]
if( ASSIGN_EXECMODE( arg_node) == CUDA_DEVICE_SINGLE) {
^
cuda/cuda_create_cells.c:97:60: warning: comparison between ‘mtexecmode_t’ and ‘enum <anonymous>’ [-Wenum-compare]
ASSIGN_EXECMODE( ASSIGN_NEXT( last_cellassign)) == CUDA_DEVICE_SINGLE) {
...
3. cd sac2c; make
Linking bin/sac2c-d (developer version)
'/home/sac/.sac2crc' not found or not readable, skipping.
MT_MODE = 0 in target, forcing -numthreads to 1.
'/home/sac/.sac2crc' not found or not readable, skipping.
MT_MODE = 0 in target, forcing -numthreads to 1.
'/home/sac/.sac2crc' not found or not readable, skipping.
MT_MODE = 0 in target, forcing -numthreads to 1.
'/home/sac/.sac2crc' not found or not readable, skipping.
MT_MODE = 0 in target, forcing -numthreads to 1.
'/home/sac/.sac2crc' not found or not readable, skipping.
MT_MODE = 0 in target, forcing -numthreads to 1.
'/home/sac/.sac2crc' not found or not readable, skipping.
MT_MODE = 0 in target, forcing -numthreads to 1.
** INFO: target 'seq' does *not* support PHM.
[I don't see the rationale for a connection between 'seq' and PHM, btw.]
4. a bit further on...
LD bin/csimt
'/home/sac/.sac2crc' not found or not readable, skipping.
'/home/sac/.sac2crc' not found or not readable, skipping.
'/home/sac/.sac2crc' not found or not readable, skipping.
'/home/sac/.sac2crc' not found or not readable, skipping.
'/home/sac/.sac2crc' not found or not readable, skipping.
'/home/sac/.sac2crc' not found or not readable, skipping.
** INFO: target 'mt_pth' does support PHM.
5. I think I had DISLIN working here before, and it still lives
in /usr/local/dislin. However, stdlib comes up with this:
Making all for target mt_pth
** Note: modules Dislin DislinBars DislinQuick DislinCanvas DislinPage DislinPlot3d DislinSystem disabled due to configuration.
6. Once I got stdlib built, I tried a few unit tests:
cd ~/sac/testsuite/optimizations/rcopt
sac2c bug1107.sac
/usr/local/lib/sac2c/18604/rt/host/seq/libsac.so: undefined reference to `SAC_HM_ShowDiagnostics'
collect2: error: ld returned 1 exit status
abort: System failed to execute shell command
abort: gcc -std=gnu99 /tmp/SAC_ZGPsKC/a.out.o
abort: -L/usr/local/lib/sac2c/18604/modlibs/host/seq
abort: -Wl,-rpath,/usr/local/lib/sac2c/18604/modlibs/host/seq
abort: -L/usr/local/lib/sac2c/18604/modlibs/host/seq
abort: -Wl,-rpath,/usr/local/lib/sac2c/18604/modlibs/host/seq -L./host/seq
abort: -Wl,-rpath,./host/seq -L/usr/local/lib/sac2c/18604/rt/host/seq
abort: -Wl,-rpath,/usr/local/lib/sac2c/18604/rt/host/seq -lArrayMod
abort: -lArrayTransformMod -lConstantsMod -lArrayArithMod -lArrayBasicsMod -lBoolMod
abort: -lScalarArithMod -lsacpreludeMod -L/usr/local/dislin
abort: -Wl,-rpath,/usr/local/dislin -L/opt/local/lib -Wl,-rpath,/opt/local/lib
abort: -lsacphmc -lsac -lsacphmc -o a.out
abort: with exit code 1
^
I am guessing that HM is for Heap Manager.
7. Let's try some other unit tests:
cd sac/testsuite/optimizations/polylib
sac2c -v0 -target seq -O3 guard_val_lt_val_S.sac -o guard_val_lt_val_S.sac.exe -check c -doawlf -nowlf -dopogo -noggs
sac@rattler:~/sac/testsuite/optimizations/polylib$ guard_val_lt_val_S.sac.exe ; echo $?
guard_val_lt_val_S.sac.exe: error while loading shared libraries: libsac.so: cannot open shared object file: No such file or directory
These failures are all new post-Mull.
The host and target system is an Ubuntu 14.10LTS system.
sac2c -V
sac2c v1.00-beta (Haggis And Apple)
product rev 18604
(Tue Jul 15 15:06:15 EDT 2014 by sac)</pre>
BugZilla

https://gitlab.sac-home.org/sac-group/sac2c/-/issues/1166
Typecasting fails
2017-11-19T20:26:02Z
Artem Shinkarov

| | |
| --- | --- |
| Bugzilla Link | [1124](http://bugs.sac-home.org/show_bug.cgi?id=1124) |
| Created on | Jul 10, 2014 17:57 |
| Version | svn |
| OS | Linux |
| Architecture | PC |
| Attachments | ![feedsmall](/uploads/951458f2202c2c123f5d581da067ed1a/feedsmall.bmp), [rgb.sac](/uploads/4fb75265545a2cd4ec733ec00f9fabae/rgb.sac), [program.sac](/uploads/7732fcf19dde9e4ea1c9ec931d7115bd/program.sac) |
## Extended Description
<pre>Here is an experiment:
We define a type called rgb in the module called rgb.sac like this:
$ cat -n rgb.sac
1 module rgb;
2
3 export all;
4 typedef int[3] rgb;
5
6 int[.] shape (rgb[.,.] a)
7 {
8 return Array::drop ([Array::- 1], Array::shape ((int[.,.,.])a));
9 }
We also define the shape function.
Now the main program looks like this:
$ cat -n program.sac
1 use String: {string};
2 use BMP: all;
3 use rgb: all;
4
5 rgb[.,.] my_readBMP( string name)
6 {
7 img = BMP::readBMP( name);
8 StdIO::print (Color8::shape (img));
9 b = (int[.,.,.])img;
10 StdIO::print (Array::shape (b));
11 a = (rgb[.,.])b;
12 StdIO::print (shape (a));
13 return a;
14 }
15
16 int main()
17 {
18 img = my_readBMP("feedsmall.bmp");
19 StdIO::print (shape (img));
20 return 0;
21 }
and the output of the program is this:
$ ./a.out
Dimension: 1
Shape : < 2>
<10 10 >
Dimension: 1
Shape : < 3>
<10 10 3 >
Dimension: 1
Shape : < 0>
<>
Dimension: 1
Shape : < 3>
<10 10 22851456 >
So, as you can see, the shape information after the typecast to the rgb[.,.] type is wrong. If we inline the shape function in rgb, the problem goes away. If we introduce object references inside shape, the problem also goes away:
int[.] shape (rgb[.,.] a)
{
sh= Array::drop ([Array::- 1], Array::shape ((int[.,.,.])a));
StdIO::print (sh);
return sh;
}
O_o</pre>
BugZilla

https://gitlab.sac-home.org/sac-group/sac2c/-/issues/1163
Revealed guards do not rename dominated successors
2017-11-19T20:25:51Z
Robert Bernecky

| | |
| --- | --- |
| Bugzilla Link | [1119](http://bugs.sac-home.org/show_bug.cgi?id=1119) |
| Created on | Mar 08, 2014 21:00 |
| Version | svn |
| OS | Linux |
| Architecture | PC |
## Extended Description
<pre>I think this problem has been around forever, but it is certainly with us
as of:
sac2c v1.00-beta (Haggis And Apple)
developer rev 18449 linux-gnu_x86_64
(Sat Mar 8 13:24:26 EST 2014 by sac)
In ~/sac/testsuite/optimizations/constraintchecks/ipbb.sac,
we have something like:
x = condfun(... colx);
...
colx' = val_le_SxS_( 0, colx);
Eventually, condfun gets inlined, and with it comes a guard:
colx2 = _non_neg_val( colx);
x = blah;
...
colx' = val_le_SxS_( 0, colx);
The latter guard cannot be removed. However, since the guard
on colx dominates the second guard, we could arrange to rename
colx --> colx2 in all code dominated by the first guard.</pre>
BugZilla

https://gitlab.sac-home.org/sac-group/sac2c/-/issues/1162
-dopra requires CUDA
2017-11-19T20:25:48Z
Robert Bernecky

| | |
| --- | --- |
| Bugzilla Link | [1118](http://bugs.sac-home.org/show_bug.cgi?id=1118) |
| Created on | Mar 04, 2014 20:25 |
| Version | svn |
| OS | Linux |
| Architecture | PC |
## Extended Description
<pre>DOPRA invokes tools/cuda/polyhedra, but does nothing to ensure
that said binary exists.
In fact, it looks like it will blindly call it, and crash in PRA.
Furthermore, ./configure appears to REQUIRE the builder to write:
./configure --enable-cuda
and even that is not enough, as my system still does not build that
binary after that.
I also think we will soon have to include polylib as a mandatory
part of the build.</pre>
BugZilla

https://gitlab.sac-home.org/sac-group/sac2c/-/issues/1161
IVEXP overly conservative in setting AVIS_MIN for _aplmod_()
2017-11-19T20:25:45Z
Robert Bernecky

| | |
| --- | --- |
| Bugzilla Link | [1113](http://bugs.sac-home.org/show_bug.cgi?id=1113) |
| Created on | Jan 27, 2014 21:22 |
| Version | svn |
| OS | Linux |
| Architecture | PC |
## Extended Description
<pre>This fault has been around since the invention of F_aplmod_().
The problem is that IVEXP fails to set AVIS_MIN when the modulus
is known to be positive.
This caused an AWLF unit test failure for relax*sac, due to
the presence of things such as rotate( -1, mat).
The following test case shows the problem, when compiled with -doawlf -nowlf.
The _ge_() should be CF'd away, because the aplmod() result is guaranteed
to be non-negative.
int[*] id(int[*] y)
{
return(y);
}
int main()
{
x = id( 5);
x = _max_SxS_( x, 1);
z = _aplmod_SxS_( -1, x);
z = _ge_SxS_( z, 0);
z = _sub_SxS_( toi( z), 1);
return(z);
}
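The non-negativity claim can be checked with a small C model. This is only a sketch: it assumes that _aplmod_SxS_( a, m) follows the APL residue definition (the result takes the sign of the modulus m, and a zero modulus returns a unchanged). The names apl_mod and result_in_range are hypothetical and are not part of sac2c.

```c
#include <assert.h>

/* APL-style residue; assumed here to model _aplmod_SxS_( a, m). */
int apl_mod(int a, int m)
{
    if (m == 0) {
        return a;                 /* assumed convention for a zero modulus */
    }
    int r = a % m;                /* C remainder takes the sign of a */
    if (r != 0 && ((r < 0) != (m < 0))) {
        r += m;                   /* move the result to the sign of m */
    }
    return r;
}

/* Returns 1 if 0 <= apl_mod(a, m) < m for every a in [lo, hi].
 * The modulus m must be positive. */
int result_in_range(int lo, int hi, int m)
{
    assert(m > 0);
    for (int a = lo; a <= hi; a++) {
        int r = apl_mod(a, m);
        if (r < 0 || r >= m) {
            return 0;
        }
    }
    return 1;
}
```

Under these assumptions, a modulus known to be at least 1 (as after the _max_SxS_( x, 1) above) gives a lower bound of 0 on the result, which is the information AVIS_MIN would need for the _ge_SxS_( z, 0) to fold.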
Fix coming as soon as I test it...</pre>
BugZilla

https://gitlab.sac-home.org/sac-group/sac2c/-/issues/1157
WLSIMP might need DL/AS/AL call following in CF unit test cubslcrash.sac
2017-11-19T20:25:33Z
Robert Bernecky

| | |
| --- | --- |
| Bugzilla Link | [1089](http://bugs.sac-home.org/show_bug.cgi?id=1089) |
| Created on | Sep 26, 2013 21:29 |
| Version | svn |
| OS | Linux |
| Architecture | PC |
| Attachments | [crud2.sac](/uploads/054506ef0d48e40f8b1e39b36e954c34/crud2.sac) |
## Extended Description
<pre>Created an attachment (id=987)
source code to reproduce fault
_pinl_2868__flat_33 = _add_SxS_( N__SSA0_1, -1);
_wlsimp_17660 = _sub_SxS_( N__SSA0_1, _pinl_2868__flat_33);
This code is generated by WLSIMP, leaving a GENWIDTH value
with a non-constant value, although the value is actually the constant 1.
I'm not sure if this affects anything downstream, but if so,
some additional traversals are called for after WLSIMP.
Here's how to reproduce it in Build #18310.
sac2c crud2.sac -doawlf -nowlf -bopt -v1 >crud</pre>
BugZilla

https://gitlab.sac-home.org/sac-group/sac2c/-/issues/1156
AWLF unit test relaxRotateOnlyAKD.sac eventually dies in DL with segfault
2017-11-19T20:25:30Z
Robert Bernecky

| | |
| --- | --- |
| Bugzilla Link | [1084](http://bugs.sac-home.org/show_bug.cgi?id=1084) |
| Created on | May 20, 2013 23:50 |
| Version | svn |
| OS | Linux |
| Architecture | PC |
| Attachments | [relaxRotateOnlyAKD.sac](/uploads/0423a12eda22a91fb204703dd02b0cb1/relaxRotateOnlyAKD.sac) |
## Extended Description
<pre>Created an attachment (id=982)
source code to reproduce fault
sac2c -V
sac2c v1.00-beta (Haggis And Apple)
product rev 18158 linux-gnu_x86_64
(Mon May 20 18:07:17 EDT 2013 by sac)
sac2c relaxRotateOnlyAKD.sac -doawlf -nowlf -v4 -#d,SSE
This small unit test dies in -bopt:saacyc:dl:12 on a segfault.
Memory usage climbs continuously, and DL does more and more "work"
on each iteration.</pre>
BugZilla

https://gitlab.sac-home.org/sac-group/sac2c/-/issues/1155
Guarded reshapes inhibit AWLF in matmulAKD.sac
2017-11-19T20:25:26Z
Robert Bernecky

| | |
| --- | --- |
| Bugzilla Link | [1069](http://bugs.sac-home.org/show_bug.cgi?id=1069) |
| Created on | Apr 25, 2013 14:24 |
| Version | svn |
| OS | Linux |
| Architecture | PC |
| Attachments | [matmulAKD.sac](/uploads/154694ab6f79317860cdf43b3bf8406e/matmulAKD.sac), [matmulNoReshapeAKD.sac](/uploads/00d84433791fe2b2d36a85354246097a/matmulNoReshapeAKD.sac), [matmulAPEXReshapeAKD.sac](/uploads/ed4e9153aa7d054c7d9d4d3efc294c0a/matmulAPEXReshapeAKD.sac) |
## Extended Description
<pre>Created an attachment (id=969)
source code to reproduce fault
When I compile the AWLF unit test matmulAKD.sac (matmul.sac from
the SAC demos library, with AKD array shapes), we end up with
5 WLs instead of 3.
The problem has to do with the presence of the guard primitive
_prod_matches_prod_shape_VxA_ around a reshape() operation, which
is used to ensure that the reshape is a "conforming reshape",
in the sense that it preserves element count.
AWLF depends heavily on being able to get at array shape elements through
scalarization and N_array nodes, and the above guard inhibits this.
The original (matmulAKD.sac) source code generates the array
arguments X and Y this way:
X = reshape( [ rows, cols], iota( rows * cols));
Y = transpose( X);
Compilation with -doawlf results in use of -ecc, thereby producing
these guarded expressions:
_uprf_3739, _uprf_3740 = _non_neg_val_S_( _isaa_4461_size1);
_idc_476 = [ _uprf_3739, _uprf_3739 ];
_idc_477, _icc_474_pred = _prod_matches_prod_shape_VxA_( _idc_476, xx);
_pinl_2634__icc_1518 = _idx_sel_( [1], _idc_477);
_uprf_3728, _uprf_3729 = _non_neg_val_S_( _pinl_2634__icc_1518);
_uprf_9774 = _sub_SxS_( _uprf_3739, _uprf_3728);
Note that the two arguments to the subtract are really the same
value.
We must not, in general, look past the guards to see the N_array _idc_476,
or other things will break, so the idx_sel_ does not CF out
of existence, as it would if no guard was present.
All(?) the other guards that appear in commonly generated code are
rank-0, and are scalarized by PRFUNR, but the above guard is
not amenable to such treatment.
I do not have, pending more coffee, any bright ideas on how to resolve this,
but did perform two experiments that make me dislike the SAC reshape()
even more:
1. I eliminated the SAC stdlib reshape() from the picture entirely,
generating (different) X and Y this way:
X = genarray ( [ rows, cols], 0.1);
Y = 1.5 + transpose( X);
That folded very nicely, as desired, although it did generate
Y as: genarray( [cols, rows], 1.6);
This example generated 4 WLs: two for generating X and Y,
and a nested pair of WLs for the actual inner product.
Oddly enough, it also generated an _reshape_VxA_() call on
the array shape for X, but it's not readily apparent where
it came from.
2. I replaced the stdlib reshape() with the APEX reshape() code
shown below. That resulted in 3 WLs, which is as good as
we can expect.
So, what does this tell us, and what can we do?
1. SAC reshape() operations, even though constrained to conforming
reshapes, are evil in two ways: First, as we know, they
inhibit WLF and AWLF on the reshape() result arrays.
Second, which I did not know, was that they inhibit other AWLF
operations, because of the inability of sac2c to simplify
shape expressions, due to the presence of the above guard
in the shape vector data flow.
2. I plan to eliminate _reshape_VxA_() from the AWLF unit tests,
wherever they are used, as above, to generate synthetic array
arguments.
3. I hereby campaign, again, for SAC source code access to WLIDX.
We might adopt a syntax such as this, extending the WL generator:
(. <= IV=[I,J,K]=[,RI] <= .)
Where RI denotes Ravel Index.
The index_generator_utilities.c code already contains functions
to convert among these forms.
Such access would let us write the above synthetic array generator as
a trivial WL, with RI as the result element. It would also
simplify the APEX reshape() function, by eliminating the need for
the mixed-radix base value operation performed by V2O in the following
code:
inline int V2O( int[.] shp, int[.] iv)
{ /* Vector iv to offset into array of shape shp */
/* See V2O.dws workspace */
offset = 0;
wt = 1;
for( i=shape(shp)[0]-1; i>=0; i--) {
offset = offset + ( wt * iv[i]);
wt = wt * shp[i];
}
return( offset);
}
inline int[.] O2V( int[.] shp, int offset)
{ /* Offset into array of shape shp to index vector */
/* See V2O.dws workspace */
iv = genarray( shape(shp), 1);
wts = iv;
for( i=shape(shp)[0]-2; i>=0; i--) {
wts[i] = wts[i+1] * shp[i+1];
}
for( i=shape(shp)[0]-1; i>=0; i--) {
iv[i] = _mod_SxS_( offset/wts[i], shp[i]);
offset = offset - (iv[i]*wts[i]);
}
return( iv);
}
inline double[*] APEXreshape(int[.] x, double[*] y)
{ /* APEX vector x reshape, with potential item reuse */
z = with {
( . <= iv <= .) {
offset = V2O( toi( x), iv);
offset = _mod_SxS_( offset, prod( shape(y)));
el = y[ O2V( shape( y), offset)];
} : el;
} : genarray( toi(x), 0.0d);
return( z);
}
4. I am going to look further at the _prod_matches_prod_shape_VxA_()
guard, and see if I can come up with a clean way to replace it
by something that can be scalarized.</pre>
BugZilla

https://gitlab.sac-home.org/sac-group/sac2c/-/issues/1154
Dismal performance of indexed reference in LACFUNs, e.g. Livermore Loop loop15
2017-11-19T20:25:21Z
Robert Bernecky

| | |
| --- | --- |
| Bugzilla Link | [1066](http://bugs.sac-home.org/show_bug.cgi?id=1066) |
| Created on | Apr 19, 2013 21:28 |
| Version | svn |
| OS | Linux |
| Architecture | PC |
| Attachments | [loop15.sac](/uploads/3c3160c8cfc61060dc55b0cf5edf90ca/loop15.sac) |
## Extended Description
<pre>Created an attachment (id=968)
source code to reproduce fault
I have been looking at the performance, or lack thereof,
of Livermore Loop loop15. It currently runs about 2 minutes,
vs. 6 seconds for the C code.
It contains code like this:
ret2 = with {
([0,0] <= iv < [5,99]) {
if( VF[iv+1] >= VF[iv+[1,0]]) {
if( VH[iv+[2,1]] > VH[iv+1]) {
val = sqrt( VGs[iv+1] + sq( max( VH[iv+1], VH[iv+[2,1]])))
* 0.053d / VF[iv+1];
} else {
val = sqrt( VGs[iv+1] + sq( max( VH[iv+1], VH[iv+[2,1]])))
* 0.073d / VF[iv+1];
}
} else {
if( VH[iv+[2,1]] > VH[iv+1]) {
val = sqrt( VGs[iv+1] + sq( max( VH[iv+[1,0]], VH[iv+[2,0]])))
* 0.053d / VF[iv+1];
} else {
val = sqrt( VGs[iv+1] + sq( max( VH[iv+[1,0]], VH[iv+[2,0]])))
* 0.073d / VF[iv+1];
}
}
} : val;
...
You get the idea...
I think what happens is that NONE of the code in the CONDFUNs is WL-folded.
Furthermore, there is no chance to use WLIDX in the LACFUNs.
The immediate fix for the sac code here is this. Consider
the last IF() code block. That can be written so that the LACFUN
has no indexing, and the indexing stuff can remain in the WL's basic
block:
numer = ( VH[iv+[2,1]] > VH[iv+1]) ? 0.053d : 0.073d;
val = sqrt( VGs[iv+1] + sq( max( VH[iv+[1,0]], VH[iv+[2,0]])))
* numer / VF[iv+1];
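The same refactoring can be sketched in plain C to show that the two forms agree. This is an illustration only: the array selections are replaced by scalar parameters, the shared sqrt(...) subexpression is passed in as the hypothetical parameter common, and the function names are made up for the sketch. The coefficients 0.053 and 0.073 are the ones used in the two legs of the original IF().

```c
/* Branchy form: the whole expression is duplicated in both legs,
 * as in the original nested IF() code. */
double val_branchy(double vh21, double vh11, double common, double vf)
{
    double val;
    if (vh21 > vh11) {
        val = common * 0.053 / vf;
    } else {
        val = common * 0.073 / vf;
    }
    return val;
}

/* Hoisted form: the conditional only selects the coefficient, so the
 * shared arithmetic (and, in SAC, the indexing) stays outside it. */
double val_hoisted(double vh21, double vh11, double common, double vf)
{
    double numer = (vh21 > vh11) ? 0.053 : 0.073;
    return common * numer / vf;
}
```

Both functions perform the same multiplication and division in the same order, so they return identical results; only the amount of code inside the conditional differs.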
This is not, however, a panacea, because other applications are
not so amenable to this sort of refactoring. I.e., consider
binary search, heapsort, and the like.
Some redesigns we might consider, aside from scrapping the whole
LACFUN idea, include:
- pushing wlidx into LACFUNs. (Perhaps this is already done, but
I did not see evidence of it.)
- making LIR fancier for CONDFUNs. I.e., in the above ultimate IF(),
the val= blocks are nearly identical in both legs, so the identical
parts could be moved out of the LACFUN.
I think the latter offers the biggest immediate advantages.
This bug also explains a lot about why many real-world SAC applications
don't work nearly as well as we expect: I.e., our (my) naive expectation
is that scalar-oriented SAC code should perform as well as the
equivalent C code.</pre>
BugZilla

https://gitlab.sac-home.org/sac-group/sac2c/-/issues/1152
Loss of guard information inhibits optimizations under -ecc/-check c/-doawlf
2017-11-19T20:25:14Z
Robert Bernecky

| | |
| --- | --- |
| Bugzilla Link | [1060](http://bugs.sac-home.org/show_bug.cgi?id=1060) |
| Created on | Apr 12, 2013 23:04 |
| Version | svn |
| OS | Linux |
| Architecture | PC |
## Extended Description
<pre>This is a problem that arises from time to time, and I just ran into
another case of it, in the CF unit test SAACFprf_reshapeAKD.sac.
The code of interest is this:
v = id(100);
vec1 = iota(v);
vec2 = _reshape_VxA_( [v], vec1);
[Replacing [v] by _shape_A_(vec1) changes nothing here, BTW.]
We end up with this IL (-doawlf -ecc):
Vec1 = [ v ];
v2, p = _non_neg_val_S_( v);
Vec3 = [ v2 ];
if( Vec1 == Vec3) --> optimize
I need to prove that Vec1≡Vec3, if I am to remove the reshape.
The guard is introduced by the inlining of iota(), so the
code as it stands is correct, in the sense that we don't have
any CSE/VP/CP sort of problems.
I was looking for a PMly way to do the Vec1≡Vec3 check, but
that's local, in the sense that I'd need to insert them
anywhere I need to make such a check.
A superior approach might be a traversal
that, within each basic block, moves guards upwards
until they reach the definition point of their arguments.
That would give us:
v2, p = _non_neg_val_S_( v);
Vec1 = [ v2 ];
Vec3 = [ v2 ];
if( Vec1 == Vec3) --> optimize
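The proposed traversal can be sketched on a toy basic block in C. This is not sac2c code: the Stmt record, its integer variable ids, and hoist_guards are hypothetical, and the sketch assumes every statement defines one variable from at most one operand and that guards take a single argument.

```c
/* One statement of a toy basic block: "def = f(use)".
 * use == -1 means the statement has no operand of interest. */
typedef struct {
    int def;       /* id of the variable defined here */
    int use;       /* id of the variable consumed here, or -1 */
    int is_guard;  /* nonzero if this statement is a guard */
} Stmt;

/* Move each guard up to just after the definition of its argument
 * (or to the top of the block if the argument is defined elsewhere),
 * and make the statements it passes use the guarded name instead. */
void hoist_guards(Stmt *block, int n)
{
    for (int g = 0; g < n; g++) {
        if (!block[g].is_guard) {
            continue;
        }
        Stmt guard = block[g];
        int target = 0;
        for (int d = 0; d < g; d++) {
            if (block[d].def == guard.use) {
                target = d + 1;
            }
        }
        /* Shift the skipped statements down by one, renaming as we go. */
        for (int i = g; i > target; i--) {
            block[i] = block[i - 1];
            if (block[i].use == guard.use) {
                block[i].use = guard.def;
            }
        }
        block[target] = guard;
    }
}
```

With v, Vec1, v2 and Vec3 numbered 0 to 3, the block {v = ...; Vec1 = [v]; v2 = guard(v); Vec3 = [v2]} becomes {v = ...; v2 = guard(v); Vec1 = [v2]; Vec3 = [v2]}, so both vectors are built from the same name. Multi-argument guards are deliberately left out of the sketch.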
This would work fine in the above case, but it is not
a perfect solution, inasmuch as we may have a guard
such as: _shape_matches_dim( X, Y) and have some rubbish
between the definition points of X and Y.
Nonetheless, it is fairly easy to implement (the last paragraph
being the hardest part), and would probably get us 90% of the
way there.
Better ideas welcome.</pre>
BugZilla

https://gitlab.sac-home.org/sac-group/sac2c/-/issues/1151
print.c gets fussy when -check c enabled
2017-11-19T20:25:10Z
Robert Bernecky

| | |
| --- | --- |
| Bugzilla Link | [1059](http://bugs.sac-home.org/show_bug.cgi?id=1059) |
| Created on | Apr 11, 2013 16:41 |
| Version | svn |
| OS | Linux |
| Architecture | PC |
| Attachments | [bug525.sac](/uploads/bfd91cbea9201920b5e66195ecbc1637/bug525.sac) |
## Extended Description
<pre>Created an attachment (id=961)
source code to reproduce fault
~/sac/testsuite/optimizations/constantfolding$ sac2c bug525.sac -doawlf -check c -v1
WARNING: Option -check c implies option -ecc.
WARNING: Insertion of explicit conformity checks has been enabled.
WARNING: AWLF is enabled: -extrema enabled.
WARNING: AWLF is enabled: -maxoptcyc=20
print/print.c:2623 Assertion "global.indent == old_indent" failed!
Indentation unbalanced while printing function 'SACf__MAIN__main`.
Indentation at beginning of function: 1.
Indentation at end of function: 2
This has been around since at least Build #18089, but I suspect
it's archaic.</pre>
BugZilla

https://gitlab.sac-home.org/sac-group/sac2c/-/issues/1150
Livermore Loop loop24 lousy performance - 3X slower than C
2017-11-19T20:25:07Z
Robert Bernecky

| | |
| --- | --- |
| Bugzilla Link | [1058](http://bugs.sac-home.org/show_bug.cgi?id=1058) |
| Created on | Apr 08, 2013 19:06 |
| Version | svn |
| OS | Linux |
| Architecture | PC |
## Extended Description
<pre>~/sac/demos/benchmarks/livermore_loops/for_comparison/loop24$ sac2c -V
sac2c v1.00-beta (Haggis And Apple)
product rev 18089 linux-gnu_x86_64
(Mon Apr 8 11:42:04 EDT 2013 by sac)
time loop24.sac.exe.awlf.18089 <loop24.inp
-1794967296
real 0m17.422s
user 0m17.380s
sys 0m0.020s
~/sac/benchmarks/c/livermore_loops/for_comparison/loop24$ time loop24.c.exe.4.4.5 <loop24.inp
-1794967296
real 0m5.813s
user 0m5.810s
sys 0m0.000s
This is unrelated to Bug#1056 - the bug is not present in this case.
It just looks like SAC does a crummy job on scalar loops.
[This is finding minimum index of minimum value in a vector.]</pre>
BugZilla

https://gitlab.sac-home.org/sac-group/sac2c/-/issues/1149
WL fusion broken for Livermore Loop loop23
2017-11-19T20:25:04Z
Robert Bernecky

| | |
| --- | --- |
| Bugzilla Link | [1057](http://bugs.sac-home.org/show_bug.cgi?id=1057) |
| Created on | Apr 08, 2013 18:31 |
| Version | svn |
| OS | Linux |
| Architecture | PC |
## Extended Description
<pre>~/sac/demos/benchmarks/livermore_loops/for_comparison/loop22$ sac2c loop22.sac -v1 -dowlfs
ABORT: line 1 file: ArrayTransform.sac
ABORT: _accu_ yields 1 instead of 2 return value(s)
*** Compilation failed ***
*** Exit code 82 (Running SAC optimizations)
*** 1 Error(s), 0 Warning(s)
sac2c -V
sac2c v1.00-beta (Haggis And Apple)
product rev 18089 linux-gnu_x86_64
(Mon Apr 8 11:42:04 EDT 2013 by sac)</pre>
BugZilla

https://gitlab.sac-home.org/sac-group/sac2c/-/issues/1148
Weird performance problem in Livermore Loop loop24.sac of for() vs. while()
2017-11-19T20:25:01Z
Robert Bernecky

| | |
| --- | --- |
| Bugzilla Link | [1056](http://bugs.sac-home.org/show_bug.cgi?id=1056) |
| Created on | Apr 05, 2013 15:46 |
| Version | svn |
| OS | Linux |
| Architecture | PC |
| Attachments | [crud.slow](/uploads/ced8eb293aaaf4b5d4c9e6859dbecff0/crud.slow), [loop24.sac](/uploads/8d2303a918153e4e2b1b92ed54793cc0/loop24.sac), [bugfor.sac](/uploads/e66de34423d1e6d90deebccc948430d6/bugfor.sac), [loop24.inp](/uploads/3a748249a88a0d1f5f6b156f8fd6f7d4/loop24.inp), [bug1056.sac](/uploads/c558bd54254764a3b0f4861498b407eb/bug1056.sac), [crud.fast](/uploads/d465d8e4c93ad2c68e218b6bb22b5b03/crud.fast) |
## Extended Description
<pre>Created an attachment (id=955)
source code to reproduce fault
sac2c -V
sac2c v1.00-beta (Haggis And Apple)
product rev 18089 linux-gnu_x86_64
(Fri Apr 5 09:54:36 EDT 2013 by sac)
This says it all:
sac@rattler:~/sac/demos/benchmarks/livermore_loops/for_comparison/loop24$ sac2c-d bugfor.sac -v1 -doawlf -nowlf -O3
WARNING: AWLF is enabled: -ecc enabled.
WARNING: AWLF is enabled: -extrema enabled.
WARNING: AWLF is enabled: -maxoptcyc=20
sac@rattler:~/sac/demos/benchmarks/livermore_loops/for_comparison/loop24$ time a.out <loop24.inp ; echo $?
-1794967296
real 0m22.851s
user 0m22.840s
sys 0m0.000s
0
sac@rattler:~/sac/demos/benchmarks/livermore_loops/for_comparison/loop24$ vi bugfor.sac
sac@rattler:~/sac/demos/benchmarks/livermore_loops/for_comparison/loop24$ sac2c-d loop24.sac -v1 -doawlf -nowlf -O3
WARNING: AWLF is enabled: -ecc enabled.
WARNING: AWLF is enabled: -extrema enabled.
WARNING: AWLF is enabled: -maxoptcyc=20
sac@rattler:~/sac/demos/benchmarks/livermore_loops/for_comparison/loop24$ time a.out <loop24.inp ; echo $?
-1794967296
real 0m17.384s
user 0m17.380s
sys 0m0.000s
0
sac@rattler:~/sac/demos/benchmarks/livermore_loops/for_comparison/loop24$ diff bugfor.sac loop24.sac
21c21
< m = 0;
---
> max24 = 0;
24,35c24,28
< // while (k < n){
< // m = X[k] < X[m] ? k : m;
< // k = k + 1;
< // }
<
< for( k=0; k<n; k++) {
< if( X[k] < X[m]) {
< m = k;
< }
< }
<
< return( m);
---
> while (k < n){
> max24 = X[k] < X[max24] ? k : max24;
> k = k + 1;
> }
> return(max24);
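Both variants compute the same result, the index of the first
minimum of X. A minimal Python sketch of the two loop shapes,
assuming that k starts at 0 in the while() version (its
initialization lies outside the diff) and that n is the length
of X; the function names are hypothetical:

```python
def first_min_for(X):
    """for()/if variant, as in bugfor.sac."""
    m = 0
    for k in range(len(X)):
        if X[k] < X[m]:
            m = k
    return m


def first_min_while(X):
    """while()/conditional-expression variant, as in loop24.sac."""
    max24 = 0
    k = 0  # assumed initial value; not shown in the diff
    n = len(X)  # assumed: n is the length of X
    while k < n:
        max24 = k if X[k] < X[max24] else max24
        k = k + 1
    return max24


sample = [3, 1, 4, 1, 5]
print(first_min_for(sample), first_min_while(sample))
```

Since the two bodies are semantically equivalent, any timing
difference has to come from the generated code rather than
from the algorithm.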
The elapsed time for the for() loop is significantly greater than
that of the while() loop. Here are the PAPIEX results:
diff bugfor.inp loop24.inp
sac@rattler:~/sac/demos/benchmarks/livermore_loops/for_comparison/loop24$ rm bugfor*txt
sac@rattler:~/sac/demos/benchmarks/livermore_loops/for_comparison/loop24$ papioneLivermore bugfor.sac
Compiling livermore loop bugfor.sac -O3 -nowlf -doawlf
WARNING: AWLF is enabled: -ecc enabled.
WARNING: AWLF is enabled: -extrema enabled.
WARNING: AWLF is enabled: -maxoptcyc=20
Executing bugfor.sac
sac@rattler:~/sac/demos/benchmarks/livermore_loops/for_comparison/loop24$ cp loop24.inp crud.inp
sac@rattler:~/sac/demos/benchmarks/livermore_loops/for_comparison/loop24$ cp loop24.sac crud.sac
sac@rattler:~/sac/demos/benchmarks/livermore_loops/for_comparison/loop24$ rm crud*txt
sac@rattler:~/sac/demos/benchmarks/livermore_loops/for_comparison/loop24$ papioneLivermore crud.sac
Compiling livermore loop crud.sac -O3 -nowlf -doawlf
WARNING: AWLF is enabled: -ecc enabled.
WARNING: AWLF is enabled: -extrema enabled.
WARNING: AWLF is enabled: -maxoptcyc=20
Executing crud.sac
sac@rattler:~/sac/demos/benchmarks/livermore_loops/for_comparison/loop24$ cat bugfor.*txt
papiex version : 0.99
Executable : /home/sac/sac/demos/benchmarks/livermore_loops/for_comparison/loop24/bugfor.sac.exe.awlf.18089
Arguments :
Processor : AMD Phenom(tm) II X6 1075T Processor
Clockrate : 3000.000000
Hostname : rattler
Options : PAPI_TOT_INS,PAPI_L1_DCM,PAPI_L2_DCM,PAPI_VEC_INS,NO_MPI_GATHER,NO_SCIENTIFIC
Domain : User
Parent process id : 5978
Process id : 5979
Start : Fri Apr 5 10:38:20 2013
Finish : Fri Apr 5 10:38:43 2013
Instructions Completed ....................... 90016357233
Vector Instructions .......................... 60006020056
L1 Data Cache Misses ......................... 15473862
L2 Data Cache Misses ......................... 3234
Real usecs ................................... 22825991
Real cycles .................................. 68785872665
Virtual usecs ................................ 22822293
Virtual cycles ............................... 68466870000
PAPI_TOT_INS ................................. 90016357233
PAPI_L1_DCM .................................. 15473862
PAPI_L2_DCM .................................. 3234
PAPI_VEC_INS ................................. 60006020056
Event descriptions:
PAPI_TOT_INS : Instructions completed
PAPI_L1_DCM : Level 1 data cache misses
PAPI_L2_DCM : Level 2 data cache misses
PAPI_VEC_INS : Vector/SIMD instructions (could include integer)
sac@rattler:~/sac/demos/benchmarks/livermore_loops/for_comparison/loop24$ cat crud*txt
papiex version : 0.99
Executable : /home/sac/sac/demos/benchmarks/livermore_loops/for_comparison/loop24/crud.sac.exe.awlf.18089
Arguments :
Processor : AMD Phenom(tm) II X6 1075T Processor
Clockrate : 3000.000000
Hostname : rattler
Options : PAPI_TOT_INS,PAPI_L1_DCM,PAPI_L2_DCM,PAPI_VEC_INS,NO_MPI_GATHER,NO_SCIENTIFIC
Domain : User
Parent process id : 6027
Process id : 6028
Start : Fri Apr 5 10:39:49 2013
Finish : Fri Apr 5 10:40:07 2013
Instructions Completed ....................... 85008352882
Vector Instructions .......................... 50000520056
L1 Data Cache Misses ......................... 15087997
L2 Data Cache Misses ......................... 3285
Real usecs ................................... 17361987
Real cycles .................................. 52320240093
Virtual usecs ................................ 17358672
Virtual cycles ............................... 52076004000
PAPI_TOT_INS ................................. 85008352882
PAPI_L1_DCM .................................. 15087997
PAPI_L2_DCM .................................. 3285
PAPI_VEC_INS ................................. 50000520056
Event descriptions:
PAPI_TOT_INS : Instructions completed
PAPI_L1_DCM : Level 1 data cache misses
PAPI_L2_DCM : Level 2 data cache misses
PAPI_VEC_INS : Vector/SIMD instructions (could include integer)
This is with:
gcc --version
gcc (Ubuntu/Linaro 4.4.4-14ubuntu5) 4.4.5
Note the differences in both op counts and vector op counts.
This suggests that we are generating different
code for the two versions.
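The size of the gap can be read off the counters directly. A
small Python sketch of the arithmetic, using the values copied
from the two papiex reports above (the dictionary names are
just for illustration):

```python
# Counter values copied from the papiex output for each executable.
for_run = {
    "instructions": 90016357233,
    "vector": 60006020056,
    "cycles": 68785872665,
}
while_run = {
    "instructions": 85008352882,
    "vector": 50000520056,
    "cycles": 52320240093,
}

# Absolute and relative excess of the for() version over the while() version.
extra = {key: for_run[key] - while_run[key] for key in for_run}
ratio = {key: for_run[key] / while_run[key] for key in for_run}

# Instructions completed per real cycle for each run.
ipc_for = for_run["instructions"] / for_run["cycles"]
ipc_while = while_run["instructions"] / while_run["cycles"]

for key in for_run:
    print(f"{key:12s} extra={extra[key]:>14d} ratio={ratio[key]:.4f}")
print(f"IPC for()={ipc_for:.3f} while()={ipc_while:.3f}")
```

The for() version completes roughly 6% more instructions and
20% more vector instructions, but takes about 31% more cycles,
so it also achieves fewer instructions per cycle.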
Compiling without -doawlf -nowlf does not affect the results,
which is to be expected, as this is a scalar-loop benchmark.</pre>BugZillaBugZillahttps://gitlab.sac-home.org/sac-group/sac2c/-/issues/1147TUP/ETV/EBT confused by UGLF movement, after SAACYC2017-11-19T20:24:54ZRobert BerneckyTUP/ETV/EBT confused by UGLF movement, after SAACYC| | |
| --- | --- |
| Bugzilla Link | [1053](http://bugs.sac-home.org/show_bug.cgi?id=1053) |
| Created on | Mar 19, 2013 15:55 |
| Version | svn |
| OS | Linux |
| Architecture | PC |
| Attachments | [loopis.sac](/uploads/828b4f3900b783545c0748ca3813e542/loopis.sac) |
## Extended Description
<pre>Created an attachment (id=953)
source code to reproduce fault
This commit:
commit 66a68890f5336f76140b04f1772f1b6ad4960cf5
Author: Robert Bernecky <bernecky@snakeisland.com>
Date: Sun Mar 17 16:49:29 2013 -0400
caused the typechecker to turn, erroneously, an AKS variable into an AKV
one, in the apex/loopis/loopis.sac benchmark.
Since the typechecker runs quite happily otherwise, I am going to
back off that change. I do not see anything obvious in phase_sac2c.mac
that should cause this, but perhaps Bodo wants to look at it some day...
sac2c -V
sac2c v1.00-beta (Haggis And Apple)
developer rev 18077 linux-gnu_x86_64
(Tue Mar 19 11:44:17 EDT 2013 by sac)
In Loop_2(), the type of A_CTR28_ has become AKV, with a value of 0.
Which is very wrong.</pre>BugZillaBugZillahttps://gitlab.sac-home.org/sac-group/sac2c/-/issues/1146blife.sac could do better than it does2017-11-19T20:24:49ZRobert Berneckyblife.sac could do better than it does| | |
| --- | --- |
| Bugzilla Link | [1047](http://bugs.sac-home.org/show_bug.cgi?id=1047) |
| Created on | Feb 24, 2013 20:19 |
| Version | svn |
| OS | Linux |
| Architecture | PC |
## Extended Description
<pre>I was just looking at bug725.sac, an incarnation of Conway's Game of Life.
It does this, in function addpad():
sh1 = shape(a);
pad1 = [genarray([sh1[1]],0)];
add1 = pad1 ++ a ++ pad1;
The ++ functions are unable to AWLF, because of the way that pad1
is defined.
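For reference, addpad() puts one row of zeros above and below
the matrix a. A minimal Python sketch with nested lists,
assuming that ++ on rank-2 arrays catenates along the first
axis; the name add_pad is hypothetical:

```python
def add_pad(a):
    """Return a copy of matrix a with an all-zero row above and below it."""
    ncols = len(a[0])  # plays the role of sh1[1]
    top = [0] * ncols
    bottom = [0] * ncols  # separate list, so the two pad rows are not aliased
    return [top] + [row[:] for row in a] + [bottom]


print(add_pad([[1, 2, 3], [4, 5, 6]]))
```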
If the code is rewritten this way:
sh1 = shape(a);
shpad = [ 1, sh1[1]];
pad1 = genarray(shpad,0);
add1 = pad1 ++ a ++ pad1;
then the whole shebang folds into a single WL. I think I'll go
change blife.sac to work this way. However, it would not
be a lot of work for some keener to make AWLF work on the original
blife.sac code.</pre>BugZillaBugZillahttps://gitlab.sac-home.org/sac-group/sac2c/-/issues/1145EWL doesn't like empty array WLs2017-11-19T20:24:47ZRobert BerneckyEWL doesn't like empty array WLs| | |
| --- | --- |
| Bugzilla Link | [1043](http://bugs.sac-home.org/show_bug.cgi?id=1043) |
| Created on | Jan 12, 2013 15:29 |
| Version | svn |
| OS | Linux |
| Architecture | PC |
## Extended Description
<pre>sac2c bug.sac
...
** 10: Enhancing with-loops ...
**** Introducing explicit accumulators ...
**** Adding default partitions ...
**** Applying constant folding ...
**** Applying common subexpression elimination ...
**** Generating full with-loop partitions ...
OOOOOOOPS, your program crashed the compiler 8-((
cat bug.sac
use Array:all;
int main()
{
x = [:int];
z = with {
( [0] <= iv < _shape_A_( x)) : 42;
} : modarray(x);
StdIO::print(z);
return(0);
}
sac2c -V
sac2c v1.00-beta (Haggis And Apple)
developer rev 18051 linux-gnu_x86_64
(Fri Jan 11 18:14:26 EST 2013 by sac)
I ran into this while trying to fault-isolate a bug that
shows up in CF, apparently due to WLUR ending up in the
same situation as above.
I'll leave this for someone in Edinburgh to fix.</pre>BugZillaBugZillahttps://gitlab.sac-home.org/sac-group/sac2c/-/issues/1144_mod_() definition blocks AWLF from operating on rotate() for AKD arrays2017-11-19T20:24:44ZRobert Bernecky_mod_() definition blocks AWLF from operating on rotate() for AKD arrays| | |
| --- | --- |
| Bugzilla Link | [1042](http://bugs.sac-home.org/show_bug.cgi?id=1042) |
| Created on | Jan 09, 2013 15:16 |
| Version | svn |
| OS | Linux |
| Architecture | PC |
## Extended Description
<pre>I have been looking at the SAC stdlib rotate (and shift)
functions, as they relate to AWLF, and this led me to
look at the definition of the _mod_()
primitive in SAC. I find it wanting, for several reasons.
The underlying challenge is to make AWLF/WLF operate
with rotate as a producer WL and as a consumer WL.
The existing stdlib code uses mod() and a conditional
to normalize the rotate count. I.e., map it
into a non-negative integer
in the range 0...(N-1), where N is the length of
the rotation axis. In the common cases
where the rotate count is a constant, this is not
a problem. However, in APL code, we often see
expressions such as this one, to drop the leading
blanks from a text vector:
((vec≠' ')⍳1) ↓ vec
or
((vec≠' ')⍳1) ⌽ vec
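In both expressions the left argument is computed from the
data: it is the position of the first non-blank character, so
the drop or rotate count is not a compile-time constant. A
minimal Python sketch of the same computation, assuming
0-origin indexing; the function names are hypothetical:

```python
def leading_blanks(vec):
    """Index of the first non-blank character, or len(vec) if there is none."""
    for i, ch in enumerate(vec):
        if ch != " ":
            return i
    return len(vec)


def drop_leading_blanks(vec):
    """Analogue of the drop expression: remove the leading blanks."""
    return vec[leading_blanks(vec):]


def rotate_leading_blanks(vec):
    """Analogue of the rotate expression: move the leading blanks to the end."""
    n = leading_blanks(vec)
    return vec[n:] + vec[:n]


print(repr(drop_leading_blanks("   abc")))
print(repr(rotate_leading_blanks("   abc")))
```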
I do not know what the design rationale was for the mod()
primitive, but suspect it was intended to mimic the
behavior of the (equally ill-defined) a%b (remainder)
operation in C.
1. According to the C99 standard*, the result is defined only
if "...the quotient a/b is representable". E.g.,
0%0 is undefined.
For rotate() arguments where the rotate count ends up being
zero, this can cause the normalization of the rotate count
to signal an error.
In APL, where considerable thought was
given to the definition of residue (aka remainder) on
all integers, floats, and complex numbers,
0 | 0 ( 0%0 in C) is defined to produce 0.
2. In K&R (2nd Edition), "the sign of the result for
% {is} machine-dependent for negative operands."
By contrast, in APL, the definition is clearly specified for
all integer (and, in fact, all numeric) arguments. NB.
argument order is reversed in SAC from that of APL:
¯5 ¯4 ¯3 ¯2 ¯1 0 1 2 3 4 NB. A
0 1 2 3 4 0 1 2 3 4 NB. 5|A (AKA A%5)
This is not the same as the current behavior of the SAC
mod() primitive on negative A:
-5 -4 -3 -2 -1 0 1 2 3 4 NB. A
-5 -4 -3 -2 -1 0 1 2 3 4 NB. A%5
Note that the APL definition maps a negative
rotate count into the correct non-negative count.
The SAC behavior requires checking for a negative
count and then adding b to the count for negatives.
This often inhibits the ability to perform AWLF on
rotated arguments/results, because of problems
around partition/index vector intersect calculation.
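The two ways of normalizing a rotate count can be sketched in
Python. Here c_style_rem models a C99-style truncating
remainder, whose result takes the sign of the dividend, and
apl_residue follows the APL definition described above,
including 0 | 0 giving 0. All three function names are
hypothetical, and none of this is the sac2c or stdlib code:

```python
def c_style_rem(a, b):
    """Truncating remainder: the result takes the sign of a (as in C99)."""
    if b == 0:
        raise ZeroDivisionError("remainder by zero is undefined in C")
    r = abs(a) % abs(b)
    return -r if a < 0 else r


def normalize_with_fixup(count, n):
    """Normalization that needs a conditional after the remainder."""
    r = c_style_rem(count, n)
    return r + n if r < 0 else r


def apl_residue(b, a):
    """APL b | a: in 0...(b-1) for b > 0, and 0 | a is a."""
    if b == 0:
        return a
    return a - b * (a // b)  # floor division, so the result takes the sign of b


counts = list(range(-5, 5))
print([apl_residue(5, a) for a in counts])
print([normalize_with_fixup(a, 5) for a in counts])
```

Both lists agree for a positive axis length, but only the
residue form gets there without a branch, and only it is
defined when the axis length is 0.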
Furthermore, it may be that the SAC definition is
"implementation-dependent", in which case we are
left in a position of not being able even to specify the
behavior of the stdlib rotate() function. It could screw
anyone running on an implementation (e.g., Windows?) that produces
different results.
If we are going to define stdlib functions on SAC primitives,
then we need to define the precise semantics of those
primitives, and eschew "implementation-dependent"
behavior.
So, a few questions, and a few proposals for redefinition:
Q1: Does anybody depend on the result of SAC
mod() on negative or zero arguments?
Q2: Are there concrete objections to extending the definition
of SAC mod() to handle zeros and negative arguments
in the same way that APL does, thereby avoiding
any arguments about "implementation-defined"
behavior there and in the stdlib?
P0: If the answers to Q1 and Q2 are No, I propose to
change sac2c mod() to operate as APL does, and
to change the stdlib rotate() functions to rely on
that definition.
2013-01-09: I have not heard any response from sacdev or others
on this topic. Silence is consent, in my book, so I am going
to proceed with the above implementation, as proposed.
* This is a draft of same:
http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1570.pdf</pre>BugZillaBugZillahttps://gitlab.sac-home.org/sac-group/sac2c/-/issues/1143Another nail in the coffin for CTZ2017-11-19T20:24:40ZRobert BerneckyAnother nail in the coffin for CTZ| | |
| --- | --- |
| Bugzilla Link | [1040](http://bugs.sac-home.org/show_bug.cgi?id=1040) |
| Created on | Dec 17, 2012 18:59 |
| Version | svn |
| OS | Linux |
| Architecture | PC |
| Attachments | [SCSprf_mod.sac](/uploads/e87f8068c93be4e57ffc82677d4770b4/SCSprf_mod.sac) |
## Extended Description
<pre>Created an attachment (id=942)
source code to reproduce fault
The attached CF unit test fails if you compile it this way, in that
the _gt_() is not optimized out:
cd ~/sac/testsuite/optimizations/constantfolding
sac2c SCSprf_mod.sac -doawlf -nowlf -v1 -bopt:uglf >crud
If you compile it adding -noctz, the relational does get
optimized away.
Unfortunately, disabling CTZ cripples some important AWLF
benchmarks, so we need to come up with something better than CTZ.</pre>BugZillaBugZilla