more comments

cac6953d · Sven-Bodo Scholz · b45df69a · cac6953d
Commit cac6953d authored 7 months ago by Sven-Bodo Scholz

Showing 99 additions and 1 deletion
--- a/src/libsac2c/arrayopt/SSAWLF.c
+++ b/src/libsac2c/arrayopt/SSAWLF.c
 /*******************************************************************************

+ This traversal implements With-Loop-Folding.
+
+ The general idea is to transform code such as:
+
+ a = with {
+        (... <= iv = [i0, ..., in] < ...) : bar (iv);
+     } : genarray (...);
+
+ ...
+
+ b = with {
+        (... <= jv = [j0, ..., jm] < ...) : foo (_sel_VxA_([expr (jv)], a)) ;
+     } : genarray (...);
+
+ into
+
+ b = with {
+        (... <= jv = [j0, ..., jm] < ...) : foo (bar (expr (jv)));
+     } : genarray (...);
+
+ We refer to the WL that defines 'a' as the source-WL and to the one defining 'b'
+ as target-WL.
+ 
+ Our initial implementation from the '90s was limited to a single,
+ non-nested WL as source and a single, non-nested WL as target.
+ Both WLs can be multi-partition!
+
+ SBS: can we handle multi-operator WLs? More detail needs to be written down here.
+
+ As indicated in the schema above, both with-loops can be higher dimensional,
+ and even have different dimensionalities, as long as 'expr (jv)'
+ constitutes a linear transformation of a permutation of a subset of
+ {j0, ..., jm}, potentially combined with constant values.
+
+ See SSAWLI.c: folding of index vectors whose index mapping is not injective
+ (i.e. not a proper permutation) is enabled as well, but can be explicitly
+ disabled using the parameter '-dowlf_true_permutation'.
+ 
+
+ As of 2024, we also support WLF on nested WLs when using '-dowlf_nested'.
+ The idea is to support scenarios like the following ones:
+
+ 1) flat -> nested:
+ ------------------
+
+  a = with { ([0,0] <= iv < [20,30]) : 0;
+           } : genarray( [ 200, 100], 42);
+
+  b = with {
+        ([0] <= [i] < [40]) {
+             x = heavy (i);
+             vec = with {
+                     ([0] <= [j] < [15]) : x + a[i,j];
+                   } : genarray( [15], 0);
+          } : vec;
+      } : genarray( [40], [0,0,0,0,0,0,0,0,0,0,0,0,0,0,0]);
+
+ should lead to
+
+ b = with {
+        ([0] <= [i] < [20]) {
+             x = heavy (i);
+             vec = with {
+                     ([0] <= [j] < [15]) : x;
+                   } : genarray( [15], 0);
+          } : vec;
+        ([20] <= [i] < [40]) {
+             x = heavy (i);
+             vec = with {
+                     ([0] <= [j] < [15]) : x + 42;
+                   } : genarray( [15], 0);
+          } : vec;
+      } : genarray( [40], [0,0,0,0,0,0,0,0,0,0,0,0,0,0,0]);
+
+ 2) nested -> flat:
+ ------------------
+
+ a = with {
+        (.<= [i] <= .) {
+             x = heavy (i);
+             vec = with {
+                     (. <= [j] <= .) : x + i + j;
+                   } : genarray( [15], 0);
+          } : vec;
+      } : genarray( [100], [0,0,0,0,0,0,0,0,0,0,0,0,0,0,0]);
+
+  b = with { ([0,0] <= [k,l] < [20,10]) : a[k,l];
+           } : genarray( [ 20, 10], 42);
+
+ should lead to
+
+  b = with { ([0,0] <= [k,l] < [20,10]) : heavy (k) + k + l;
+           } : genarray( [ 20, 10], 42);
+
+ 3) nested -> nested:
+ --------------------
+
+ *******************************************************************************
 The traversal looks for the innermost WL suitable for folding into.
 The anonymous traversal ATsearchRef then identifies a foldable reference
 and initiates folding. Another anonymous traversal, ATreplace, then
@@ -16,7 +114,7 @@
 has DIFFERENT sets of vector and scalar WITHID names.

 Assumption: We assume that all generators of a WL have the same
- shape.  Furthermore we assume that, if an N_Ncode is referenced by
+ shape.  Furthermore we assume that, if an N_code is referenced by
 more than one generator, all these generators' indexes (vector and
 scalar) have the same names. This 'same name assumption' can even be
 expanded to all generators a WL has. This is true as a consequence of