Adding a WL speeds up loopfsAKD.sac
| Bugzilla Link | 495 |
| Created on | May 17, 2009 19:52 |
| Version | 1.00beta |
| OS | Linux |
| Architecture | PC |
| Attachments | crud.sac |
Extended Description
Created an attachment (id=522): Source code to reproduce fault

The attached code has the interesting property that it runs faster if you introduce an extra WL into the mix, via #define CRUD. The resulting code is NOT folded by WLF, so there is an extra WL at the end of phase 11. However, the resulting code executes about 5% FASTER than if you remove the extra WL. Very puzzling.

I'm guessing some strangeness in the back end, because eyeballing the code did not turn up any other differences that I could see. Perhaps some backendian type can look at this?

PAPI output:

without extra loop: crud.sac.exe.O3.papiex.rattler.6186:PAPI_TOT_INS: 105104480

with extra loop: crud.sac.exe.O3.papiex.rattler.6353:PAPI_TOT_INS: 100104533
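For reference, the "about 5%" figure can be checked against the two PAPI_TOT_INS counts reported above. The short Python sketch below only does that arithmetic; the variable names are made up for illustration. PAPI_TOT_INS counts completed instructions, so this is the reduction in instruction count, not a measured wall-clock speedup.

```python
# Instruction counts copied from the PAPI output in this report.
ins_without_extra_wl = 105_104_480  # build without the extra WL
ins_with_extra_wl = 100_104_533     # build with the extra WL (#define CRUD)

# Absolute and relative reduction in executed instructions.
saved = ins_without_extra_wl - ins_with_extra_wl
reduction_pct = 100.0 * saved / ins_without_extra_wl

print(f"instructions saved: {saved}")
print(f"relative reduction: {reduction_pct:.2f}%")
```

The variant with the extra WL executes roughly 5 million fewer instructions, a reduction of a bit under 5%, which is consistent with the estimate in the description.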