| | |
|---|---|
| Bugzilla Link | 1052 |
| Created on | Mar 14, 2013 15:27 |
| Version | svn |
| OS | Linux |
| Architecture | PC |
| Attachments | loop09A.sac |
Extended Description
Created an attachment (id=952)
source code to reproduce fault
sac2c -V
sac2c v1.00-beta (Haggis And Apple)
product rev 18069 linux-gnu_x86_64
(Wed Mar 13 17:40:45 EDT 2013 by sac)
I have made the SAC and C versions of loop09 match again, in the sense
that they return the same results. I have also reverted loop09 to the
earlier, imperative version, and created loop09A, a version that uses
the sparse vector-matrix multiply that I use in APEX.
Results are good, almost. Here are the CPU times and cache miss counts
for each version:
sac2c loop09A.sac -O3 -dolacsi -doawlf -nowlf -O3
                    CPU (usec)     L1miss  L2miss
loop09.c               2210390   20547020    1304
wlf loop09.sac         5228058   20041898    5567
awlf loop09.sac        2388230   19047599    2421
wlf loop09A.sac       10575784   64461221    6619
awlf loop09A.sac       4779216  105809057    3539
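It is the last line that is puzzling. IMO, its performance should be
extremely close to that of the third line (awlf loop09.sac). But it isn't:
it takes about twice the CPU time (4779216 vs. 2388230 usec) and incurs
about 5.6 times as many L1 misses (105809057 vs. 19047599).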
What is going on is, essentially, this:
- Originally, Cond_0() in the inner loop is passed a 1-element vector V.
  It uses V0 = V[0] as GENERATOR_BOUND2( [ V0]) and GENARRAY_SHAPE( [ V0]).
- -dolacsi allows elimination of the sel V0 = V[0].
- Someone (SAA?) generates V' = [ V0] as the shape of the Cond_0 result.
- Eventually, we end up with a funcond at the end of Cond_0(),
  roughly of this form:
    V' = [ V0];
    shp = flat_1 ? V' : V;
  V is a 1-element vector and V0 is V[0], so V' = [ V0] has the same
  value as V. Both legs of the funcond therefore match, and this is
  really just: shp = V.
  That should let shp be lifted out of Cond_0(), but no optimization
  is smart enough to do that.
We end up, therefore, with all this baggage in the inner loop,
of which the building of V' is causing most of the harm.
Possible actions:
1. I do not think we can do much in LACSI about these things.
The shp-related code goes through a lot of optimization before
we get to the point above.
2. We do happen to have AVIS_SCALARS( V) = [ V0]. It should be possible
to make CF, or whichever related optimization looks at funconds, check
whether one funcond argument is an N_array that matches the AVIS_SCALARS
of the other and, if so, replace that argument by the other one:
shp = flat_1 ? V : V;
[Premature replacement by: shp = V; is bad, because of the delicate
nature of the funcond structure. That will get done elsewhere.]
The latter is fairly straightforward, so I'm going to try that approach.
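
To make the check in action 2 concrete, here is a minimal sketch in C of the matching and replacement logic. It is a toy model only, not sac2c code: the types struct var, struct array and struct funcond, the fields def and scalars (standing in for the N_array definition and for AVIS_SCALARS), and the function unify_funcond_legs() are all hypothetical names invented for this illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: none of these types or names exist in sac2c. */
struct var;

struct array {                        /* models an N_array of scalar ids */
    const struct var *const *elems;
    size_t n;
};

struct var {
    const char *name;
    const struct array *def;          /* array literal defining this var, or NULL */
    const struct array *scalars;      /* models AVIS_SCALARS, or NULL if unknown */
};

struct funcond {                      /* models: result = pred ? then_arg : else_arg */
    const struct var *pred;
    const struct var *then_arg;
    const struct var *else_arg;
};

/* True if both arrays are known and hold the same scalars in the same order. */
static bool same_elems(const struct array *a, const struct array *b)
{
    if (a == NULL || b == NULL || a->n != b->n)
        return false;
    for (size_t i = 0; i < a->n; i++)
        if (a->elems[i] != b->elems[i])
            return false;
    return true;
}

/*
 * If one leg is defined by an array literal that matches the known scalars
 * of the other leg, make both legs refer to that other variable.
 * The funcond itself is kept; collapsing "p ? V : V" to "V" is left to a
 * later pass. Returns true if the funcond was rewritten.
 */
bool unify_funcond_legs(struct funcond *fc)
{
    assert(fc != NULL && fc->then_arg != NULL && fc->else_arg != NULL);

    if (fc->then_arg == fc->else_arg)
        return false;                 /* legs already identical */

    if (same_elems(fc->then_arg->def, fc->else_arg->scalars)) {
        fc->then_arg = fc->else_arg;  /* V' : V  becomes  V : V */
        return true;
    }
    if (same_elems(fc->else_arg->def, fc->then_arg->scalars)) {
        fc->else_arg = fc->then_arg;  /* V : V'  becomes  V : V */
        return true;
    }
    return false;
}
```

In this model, the case above corresponds to a funcond whose then-leg V' has def = [ V0] and whose else-leg V has scalars = [ V0]. The sketch deliberately leaves the funcond in place with two identical legs, in line with the note above that replacing it by shp = V is done elsewhere.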