fft (FT) performance problems in CF caused by CVP error?
Bugzilla Link | 455
Created on | Sep 20, 2008 13:56
Resolution | FIXED
Resolved on | Sep 22, 2008 18:37
Version | 1.00beta
OS | Linux
Architecture | PC
Extended Description
Yesterday, we looked into performance problems in the NAS fft_dbl.sac
benchmark. These were initially traced to CF failing to replace
structural constant indexing expressions, as follows:
x = [dbl, 1.0];
z = x[[1]];
I looked into this and determined that the CF failure is due to a workaround
I placed in StructOpSel yonks ago to circumvent a problem in tvd2d: it would
produce wrong answers if the above sort of indexing operation returned
an N_double (or perhaps any primitive constant).
The change was to restrict the above CF to operate ONLY if the result
would be an N_id node. BTW, someone assured me that an N_array can
NEVER contain a mix of N_id nodes and N_double (etc.) nodes. This is
not true, I found out today.
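To make the workaround concrete, here is a minimal toy model in Python. It is not the compiler's code: the classes Id and Double merely stand in for N_id and N_double nodes, and struct_op_sel and its ids_only flag are hypothetical names invented for this sketch. It only illustrates why restricting the fold to N_id results leaves x[[1]] unfolded when the array mixes identifiers and primitive constants.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Id:
    """Toy stand-in for an N_id node (hypothetical, not the real AST type)."""
    name: str


@dataclass(frozen=True)
class Double:
    """Toy stand-in for an N_double node (hypothetical, not the real AST type)."""
    value: float


def struct_op_sel(array, index, ids_only):
    """Fold array[[index]] on a structural constant.

    Returns the selected element, or None when the fold is refused.
    With ids_only=True the fold is refused unless the result is an Id,
    which mimics the workaround described in this report.
    """
    element = array[index]
    if ids_only and not isinstance(element, Id):
        return None
    return element


x = [Id("dbl"), Double(1.0)]  # models: x = [dbl, 1.0];

restricted = struct_op_sel(x, 1, ids_only=True)
unrestricted = struct_op_sel(x, 1, ids_only=False)

print(restricted)    # None: the selection stays in the code
print(unrestricted)  # Double(value=1.0): the selection is folded away
```

In this model the restricted variant leaves the indexing operation in place, which corresponds to the missed optimization seen in the benchmark.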
Clemens and I looked further and determined that CVP propagates scalars
into PRF_ARG positions of, at least, scalar functions, as in the
example below, which shows the failure when halted at -bopt:cyc:cvp:3. The output
from the previous phase, -bopt:cyc:cf:3, does NOT have values in PRF_ARG
positions (see subfn).
So, I see two problems here:
1. The definition of N_array contents needs to be clarified:
If an N_array SHOULD contain a mixture of N_double (etc.) nodes
and N_id nodes, ast.xml should be updated to state this.
Ditto if that mixture should be verboten.
Clemens suggested that the current mixture behavior is probably
desirable, due to the effect of generating huge numbers of N_id
nodes otherwise, in the case of character string constants, for example.
2. The propagation of constants into PRF_ARG positions is either
wrong, or downstream code for SOME primitives is unable to
handle those nodes properly, because of the wrong answers
produced by tvd2d if CF StructOpSel is allowed to produce
constant nodes as its result.
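Problem 2 can be pictured with another toy model in Python. Everything in it is an assumption made for illustration: the argument list, the variable names a and b, the value 1.0, and the function propagate_constants are hypothetical and are not taken from the compiler or from bugCVP.sac. It only shows the difference between a primitive whose arguments are all identifiers (as reported after the CF phase) and one whose arguments contain a constant (as reported after CVP).

```python
def propagate_constants(args, known_constants):
    """Replace each identifier that has a known constant value by that value.

    Identifiers are modelled as strings, constants as floats.
    """
    return [
        known_constants.get(arg, arg) if isinstance(arg, str) else arg
        for arg in args
    ]


# Hypothetical argument list of a scalar primitive: identifiers only.
prf_args_after_cf = ["a", "b"]

# Hypothetical fact established earlier in the function: a = 1.0
known = {"a": 1.0}

prf_args_after_cvp = propagate_constants(prf_args_after_cf, known)
print(prf_args_after_cvp)  # [1.0, 'b']
```

Any downstream code that assumes every argument is an identifier would have to handle the float in the first position, which is the situation the second problem describes.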
Code to reproduce the failure is bugCVP.sac in obelix,
/home/rbe/sac/demos/nas_parallel_benchmarks/FT/bugCVP.sac.