split WLIR up into two phases: WLIRI and WLIR; avoids excessive shuffling of N_assign chains; significantly simplifies WLIR
Triggered by some compile-time observations when compiling MG with Michiel's modulo optimisation, I completely rewrote WLIR. This implementation separates the inference from the actual movement. Its runtime is linear (two traversals, i.e. roughly 2n), and it does not require the use of nodelist stacks.
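
For illustration only, the inference/movement split can be sketched as two linear passes over a loop body. This is a toy model in Python, not the sac2c code: the `Assign` record, the `infer` and `move` functions, and the assumption that the body is in SSA form (each name assigned once, before its uses) are all hypothetical and chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class Assign:
    """Toy stand-in for one assignment in a loop body (hypothetical)."""
    lhs: str                  # name defined by this assignment
    uses: tuple               # names read by the right-hand side
    invariant: bool = False   # filled in by the inference pass


def infer(body, loop_vars):
    """Pass 1 (inference): tag assignments that do not vary with the loop.

    Assumes SSA form, so every use inside the body follows its definition.
    """
    variant = set(loop_vars)            # names that change per iteration
    for a in body:                      # single linear traversal
        a.invariant = not any(u in variant for u in a.uses)
        if not a.invariant:
            variant.add(a.lhs)          # results of variant code are variant
    return body


def move(body):
    """Pass 2 (movement): split the body using the tags, keeping order."""
    hoisted, remaining = [], []
    for a in body:                      # single linear traversal
        (hoisted if a.invariant else remaining).append(a)
    return hoisted, remaining


# Example: "i" is the loop index; "x" and "y" are defined outside the loop.
example = [
    Assign("a", ("x", "y")),
    Assign("b", ("a", "i")),
    Assign("c", ("a",)),
    Assign("d", ("b", "c")),
]
hoisted, remaining = move(infer(example, {"i"}))
```

Because the second pass only reads the tags left by the first, neither pass needs an auxiliary stack, and each assignment is visited exactly twice.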