When we wrap functions with pushes and pops for -check s, there may be no checks between the push and the pop. In that case we can safely remove both, because the function call can never end up in a stack trace.
This can improve performance considerably, especially when the inner loop of your computation contains no checks.
Consider the following example:
TCF1 = ...
TCF2 = stackPush(TCF1, ...)
TCF3 = stackPop(TCF2, ...)
In the example above, a stackPush is immediately followed by a stackPop with
no code in between. Because SAC_RuntimeError can never be called between the
push and the pop, the call that this pair records will never show up in a
stack trace, so we can optimise away both the push and the pop.
This optimisation traversal will rewrite the above example into the following:
TCF1 = ...
TCF3 = TCF1
We essentially cut out the middle assignment to TCF2.
In this next example we cannot remove the push and pop, because the wrapped function foo
consumes a TCF and creates a new one in between. This means that foo might call SAC_RuntimeError:
TCF1 = ...
TCF2 = stackPush(TCF1, "foo", ...)
TCF3 = foo(TCF2)
TCF4 = stackPop(TCF3)
Here is another larger example where we can remove the push and pop:
TCF1 = ...
TCF2 = stackPush(TCF1, ...)
c = _add_SxS(a,b)
TCF3 = stackPop(TCF2)
In the above case there are no checks (or any other effects on TheControlFlow) that
exist in between the push and the pop. Because of this we know that there will never
be a call to SAC_RuntimeError which generates a stack trace. As such we rewrite the above into:
TCF1 = ...
c = _add_SxS(a,b)
TCF3 = TCF1
More precisely, we can remove any stackPush and stackPop where the TCF produced by the stackPush
is directly consumed by the stackPop. Or, from the bottom-up perspective that this traversal
actually takes: we can remove any stackPop and stackPush pair where the TheControlFlow
argument consumed by the stackPop was created by a stackPush.
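To make the rule concrete, here is a minimal sketch of the rewrite on a toy IR. It is written in Python for illustration only and is not the traversal from this change: the Assign class, the "copy" op and remove_empty_push_pop are hypothetical names, and the sketch assumes that each TCF variable is consumed exactly once, so the push becomes dead as soon as its pop is rewritten.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Assign:
    """Toy assignment `lhs = op(args...)`; hypothetical, not a sac2c node."""
    lhs: str
    op: str        # "stackPush", "stackPop", "copy", or any other function name
    args: tuple    # for stackPush/stackPop the first argument is the TCF


def remove_empty_push_pop(body):
    """Drop push/pop pairs where the pop directly consumes the push's TCF."""
    # Map each defined variable to the index of its defining assignment.
    def_index = {a.lhs: i for i, a in enumerate(body)}
    result = list(body)
    dead = set()
    # Walk bottom-up: start from each pop and look at what created its TCF.
    for i in range(len(body) - 1, -1, -1):
        stmt = body[i]
        if stmt.op != "stackPop":
            continue
        j = def_index.get(stmt.args[0])
        if j is None or body[j].op != "stackPush":
            continue  # something else produced this TCF, keep the pair
        # Forward the TCF that went into the push and mark the push as dead.
        result[i] = Assign(stmt.lhs, "copy", (body[j].args[0],))
        dead.add(j)
    return [a for k, a in enumerate(result) if k not in dead]


removable = [
    Assign("TCF2", "stackPush", ("TCF1", "foo")),
    Assign("c", "_add_SxS", ("a", "b")),
    Assign("TCF3", "stackPop", ("TCF2",)),
]
kept = [
    Assign("TCF2", "stackPush", ("TCF1", "foo")),
    Assign("TCF3", "foo", ("TCF2",)),
    Assign("TCF4", "stackPop", ("TCF3",)),
]
optimised = remove_empty_push_pop(removable)
unchanged = remove_empty_push_pop(kept)
```

For the first body the sketch leaves the _add_SxS assignment followed by a copy of TCF1 into TCF3. The second body is returned unchanged, because the TCF consumed by the pop was produced by foo and not by a stackPush.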
We detect the pops with the new IsStackTracePop flag on AP.
One thing that is regrettable is that with -check ps, by the time we get to this optimisation, the stackPops have already been inlined to _tp_stackpop_impl. Unless someone has a better solution, I think the best way to deal with this is to also look for this name using GTPgetImplFundef.
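The combined check could look roughly like the sketch below. This is again toy Python, not compiler code: Call and is_stack_pop are hypothetical names, the boolean field merely stands in for the IsStackTracePop flag, and the sketch compares function names directly because it does not model what GTPgetImplFundef returns.

```python
from dataclasses import dataclass

# Name of the inlined pop implementation, as quoted in the text above.
STACKPOP_IMPL_NAME = "_tp_stackpop_impl"


@dataclass
class Call:
    """Toy function application; hypothetical stand-in for an AP node."""
    fun_name: str
    is_stack_trace_pop: bool = False  # stand-in for the IsStackTracePop flag


def is_stack_pop(call):
    """Treat a call as a pop if it is flagged or is the inlined implementation."""
    return call.is_stack_trace_pop or call.fun_name == STACKPOP_IMPL_NAME
```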