This fixes the long-standing non-scalar mt-fold bug. The problem was that we generally try to reuse the neutral element within the fold. While this is desirable in the sequential case, it cannot be done in the parallel case.

What we did was simply create an on-stack dummy descriptor for all arguments of the spmd function, including the neutral element. We did not bother to modify the RC of those arguments, since we assumed that reference counting would be done in a way that keeps all relatively free variables alive. Reusing the neutral element breaks this invariant!

So the first thing I did was to set the RC of the dummy descriptor to 2. That way, reuse within the with-loop becomes impossible. While this solves the segfault problem, it also means that the neutral element, which is expected to be consumed within the with-loop exactly once, was no longer consumed at all. The solution to this problem is a dec_rc_free on the neutral element AFTER the call to the spmd function!
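The reference-counting scheme described above can be modelled in a few lines of C. This is only a minimal sketch under stated assumptions, not the actual sac2c runtime or generated code: desc_t, array_t, can_reuse, worker_fold, spmd_fold and live_buffers are hypothetical names, the two workers run one after the other instead of in threads, and the fold is an element-wise addition whose neutral element is a zero vector. Only the RC of 2 on the dummy descriptor and the dec_rc_free after the call are taken from the description.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins for a descriptor and a reference-counted array. */
typedef struct { int rc; } desc_t;
typedef struct { int *data; desc_t *desc; size_t len; } array_t;

static int live_buffers = 0; /* number of data buffers not yet freed */

static int *alloc_data(size_t len) {
    int *p = calloc(len, sizeof *p); /* zero-initialised */
    assert(p != NULL);
    live_buffers++;
    return p;
}

static void free_data(int *p) { free(p); live_buffers--; }

/* In-place reuse is only legal for the sole owner of the buffer. */
static int can_reuse(const desc_t *d) { return d->rc == 1; }

/* Decrement the RC and free the buffer once nobody references it. */
static void dec_rc_free(array_t *a) {
    if (--a->desc->rc == 0) {
        free_data(a->data);
        a->data = NULL;
    }
}

/* One simulated worker: fold nrows vectors into an accumulator. */
static int *worker_fold(array_t neutral, const int *rows, size_t nrows) {
    int *acc;
    if (can_reuse(neutral.desc)) {
        acc = neutral.data; /* would alias the shared neutral element */
    } else {
        acc = alloc_data(neutral.len);
        memcpy(acc, neutral.data, neutral.len * sizeof *acc);
    }
    for (size_t r = 0; r < nrows; r++)
        for (size_t i = 0; i < neutral.len; i++)
            acc[i] += rows[r * neutral.len + i];
    return acc;
}

/* Simulated spmd call with two workers, followed by the caller-side fix. */
static void spmd_fold(array_t *neutral, const int *rows, size_t nrows,
                      int *result) {
    desc_t dummy = { 2 }; /* on-stack dummy descriptor; RC 2 blocks reuse */
    array_t arg = { neutral->data, &dummy, neutral->len };
    size_t half = nrows / 2;

    int *a = worker_fold(arg, rows, half);
    int *b = worker_fold(arg, rows + half * arg.len, nrows - half);

    /* Combine the partial results (valid because the neutral is all zeros). */
    for (size_t i = 0; i < arg.len; i++)
        result[i] = a[i] + b[i];
    free_data(a);
    free_data(b);

    dec_rc_free(neutral); /* consume the neutral element AFTER the call */
}
```

In this model, a caller that owns the neutral element with an RC of 1 ends up with an RC of 0 and no live buffers after spmd_fold returns, which corresponds to the space leak being closed. If the dummy descriptor carried an RC of 1 instead, both workers would accumulate into the same neutral buffer.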
Besides this fix, I added tests that specifically exercise the non-scalar fold with-loop and check for the possible space leak.
I also did a major cleanup of the spmd function creation phase, mainly adding a large comment on what the phase does and a sketch of how it is implemented.