| | |
|---|---|
| Bugzilla Link | 332 |
| Created on | Dec 02, 2006 15:15 |
| Resolution | FIXED |
| Resolved on | Nov 19, 2007 04:30 |
| Version | 1.00beta |
| OS | Linux |
| Architecture | PC |
Extended Description
As I noted two days ago, I saw some performance loss in a few
benchmarks after I released the IVE->SSA changes (build#15110), compared
to build#15093. As an example, compiotad.sac took about 750msec
before, and about 950msec afterwards.
I backed up to build #15093, and that version ran in 736-764msec.
I then compared various versions, as noted later. I also stopped the
compiler at phase -b11 for #15111 (today) and #15100.
I have not looked at everything in detail, but I did note that,
near the end of the code, the computation of y[i] is no
longer being LIR'd upwards.
In fact, the computation of y[i] is invariant all the way up to main, yet
y and i are passed through several layers of cond/loop fns...
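To make the LIR point concrete, here is a minimal Python sketch (not SaC, and not the compiotad code) of what hoisting an invariant selection looks like. The function names, the loop bound `n`, and the summation are hypothetical; only the idea that `y[i]` does not change inside the loop comes from this report.

```python
def without_lir(y, i, n):
    """Re-evaluates the invariant selection y[i] on every iteration."""
    total = 0
    for _ in range(n):
        elem = y[i]  # invariant: neither y nor i changes inside the loop
        total += elem
    return total


def with_lir(y, i, n):
    """Same computation after hoisting y[i] out of the loop."""
    elem = y[i]  # computed once, before the loop
    total = 0
    for _ in range(n):
        total += elem
    return total
```

Both functions return the same value; the second one simply performs the selection once, which is what LIR would be expected to do here.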
Along the way, I noted run times for other builds:
- 15093 736-764msec
- 15100 684-804msec
- 15101 2156-2244msec (could be related to not doing -doisv?)
- 15102 2160-2268msec (ditto)
- 15104 916-960msec
- 15109 908-996msec (w/ -doisv! Otherwise 4 seconds!)
- 15111 (today) 900-940msec
Clearly, things slowed down after 15100.
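As a rough cross-check of that conclusion, the sketch below flags every build whose fastest run is still slower than the slowest run of build 15093. The ranges are copied from the list above; the "non-overlapping ranges" criterion and the helper name are assumptions made for illustration, not part of any SaC tooling.

```python
# Run-time ranges (msec) as listed in this report, keyed by build label.
timings = [
    ("15093", 736, 764),
    ("15100", 684, 804),
    ("15101", 2156, 2244),
    ("15102", 2160, 2268),
    ("15104", 916, 960),
    ("15109", 908, 996),
    ("today", 900, 940),
]


def slower_than_baseline(rows, baseline="15093"):
    """Return labels whose fastest run exceeds the baseline's slowest run."""
    base_hi = next(hi for label, lo, hi in rows if label == baseline)
    return [label for label, lo, hi in rows if lo > base_hi]


regressed = slower_than_baseline(timings)
print(regressed)
```

Under this criterion, build 15100 is not flagged because its range overlaps the baseline range.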
The change at version 15101 was to replace SCI with SAA as the driving
force behind IVE. That suggests that SAA is missing some inference
here, but both codes have replaced all sel() ops with _idx_sels.
But I don't see the connection between LIR and my shape_cliques.c
change.
---- later on ----
I again reverted to 15104 (the one where I switched from using SCI to
using SAA).
I then ran the compiotad.sac test, then patched the return value in
shape_cliques.c to use SCI rather than SAA as the IVE driver, and got
this interesting result:
IVE w/SAA: 864-1024msec
IVE w/SCI: 732-816msec
Clearly, SAA is missing an inference. A diff at phase -b11 tells the
tale clearly (I've also attached the two .out files):
196a197
> int _ive_6988__pinl_6505__iv;
197a199
> int _ive_6986__pinl_3483__iv;
211c213,214
< _pinl_3512____flat_127 = idx_sel( _wlidx_6982__pinl_3487__z, _pinl_3521___res);
---
> _ive_6986__pinl_3483__iv = idxs2offset( [ 40000000 ], _pinl_3488___eat_218);
> _pinl_3512____flat_127 = idx_sel( _ive_6986__pinl_3483__iv, _pinl_3521___res);
229c232,233
< _pinl_6845___flat_721__SSA7_1__SSA8_1 = idx_sel( _wlidx_6984__pinl_6448_intx, _pinl_6852__z);
---
> _ive_6988__pinl_6505__iv = idxs2offset( [ 40000000 ], _pinl_6508___eat_210);
> _pinl_6845___flat_721__SSA7_1__SSA8_1 = idx_sel( _ive_6988__pinl_6505__iv, _pinl_6852__z);
This explains why compiotad is running slower, but it does not
explain why LIR is not being performed.
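For readers unfamiliar with the extra statements in the diff, here is a small Python sketch of what an `idxs2offset`-style operation is assumed to compute: the row-major linear offset of an index within an array of a given shape. The function name `idxs2offset_sketch` and its signature are hypothetical and only model the assumed semantics; this is not the compiler's implementation.

```python
def idxs2offset_sketch(shape, *idxs):
    """Row-major linear offset of index (idxs) in an array of the given shape.

    Assumed model of the idxs2offset primitive seen in the diff.
    """
    offset = 0
    for extent, idx in zip(shape, idxs):
        # Horner-style accumulation: scale by this axis extent, add the index.
        offset = offset * extent + idx
    return offset
```

For a one-dimensional shape such as `[ 40000000 ]`, the offset under this model equals the index itself, so the SAA variant would be recomputing a value that the SCI variant gets directly from the with-loop index (`_wlidx_...`).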