sac-group / sac2c · Merge request !368

Distributed memory backend: use mt_pth within a node and make memory-distributedness a runtime attribute

Merged Thomas Koopman requested to merge thomas/sac2c:distmem-and-mt into develop Mar 12, 2025
Overview 21 · Commits 132 · Changes 89

This merge request makes two conceptual changes.

  1. The distmem target now depends on the mt_pth target instead of seq.
  2. Distributedness of memory is now a runtime attribute and is no longer reflected in the type (it used to be encoded as PDI/NDI).

As a result of 1, we can use the ST / MT / SPMD function variants and remove the runtime execution mode SAC_DISTMEM_exec_mode_t SAC_DISTMEM_exec_mode;

We already stored the distributedness of an array in its descriptor (SAC_ND_A_DESC_IS_DIST); we now overload the necessary functions with a version that checks this flag at runtime. PDI/NDI attributes in the type are no longer needed.
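To illustrate the idea, here is a minimal, self-contained sketch of runtime dispatch on a descriptor flag. The struct layout, the NODES constant, and read_elem are illustrative inventions; the real descriptor and the SAC_ND_A_DESC_IS_DIST accessor live in the sac2c runtime headers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NODES 4

/* Simplified stand-in for a SAC array descriptor (hypothetical layout). */
typedef struct {
    size_t size;
    bool is_dist;              /* the runtime distributedness flag */
    double *local;             /* storage used when !is_dist */
    double *shard[NODES];      /* block-distributed storage, used when is_dist */
} desc_t;

/* "Overloaded" read: one entry point that dispatches on the descriptor flag
 * at runtime, instead of selecting a PDI/NDI variant from the static type. */
static double read_elem(const desc_t *d, size_t i)
{
    if (d->is_dist) {
        size_t block = (d->size + NODES - 1) / NODES;  /* ceil-div block size */
        return d->shard[i / block][i % block];
    }
    return d->local[i];
}
```

The point of the sketch: one function body serves both memory kinds, so callers no longer need to know statically whether the array is distributed.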

If we need a certain memory type, we enforce it by generating primitive functions (see the documentation of distmemify.c). The dist_alloc.c phase ensures that this function almost never has to do anything. The only exception is memory returned from an external function, as we do not distribute such results (e.g. if we read an array from stdin, only node 0 does so and then broadcasts the result).
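A toy model of that enforcement primitive, under stated assumptions: ensure_dist, arr_t, and the fixed-size buffers are invented names, and the node-0 broadcast is modelled as a plain copy rather than real communication.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define NODES 4
#define MAXSZ 16

/* Hypothetical model of an array whose distributedness is a runtime flag. */
typedef struct {
    size_t size;
    bool is_dist;
    double node0_only[MAXSZ];     /* external-call result, held by node 0 only */
    double replica[NODES][MAXSZ]; /* per-node storage after distribution */
} arr_t;

/* Sketch of the "enforce memory type" primitive: usually a no-op, because
 * dist_alloc.c has already allocated the right kind of memory; the exception
 * is memory returned by an external function, which node 0 must broadcast. */
static void ensure_dist(arr_t *a)
{
    if (a->is_dist)
        return;                   /* common case: nothing to do */
    for (int n = 0; n < NODES; n++)
        memcpy(a->replica[n], a->node0_only, a->size * sizeof(double));
    a->is_dist = true;
}
```

Calling ensure_dist twice is harmless: the second call hits the early return, which mirrors the claim that the generated function almost never has work to do.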

The parallelisation of with-loops for the mt backend has been refactored slightly, as outlined in schedule_design.txt. It intersects the first schedule with SAC_wl_global_start0 and SAC_wl_global_stop0. Now the only change the distmem backend has to make is to set these values to ShrayStart and ShrayStop (or a facsimile for fold-loops).
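The scheduling step above can be sketched as a range intersection. The helper below is illustrative, not the sac2c code: for mt_pth the [start0, stop0) window would cover the whole iteration space, while for distmem it would be this node's ShrayStart / ShrayStop.

```c
#include <assert.h>

/* Intersect a schedule slice [lb, ub) with the window [start0, stop0).
 * An empty intersection is normalised to a zero-length range. */
static void intersect_range(long lb, long ub, long start0, long stop0,
                            long *out_lb, long *out_ub)
{
    *out_lb = lb > start0 ? lb : start0;
    *out_ub = ub < stop0 ? ub : stop0;
    if (*out_ub < *out_lb)        /* this node owns nothing of the slice */
        *out_ub = *out_lb;
}
```

With this factoring, the backend choice only changes the window bounds; the intersection logic itself is shared.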

Fold-loops fold their local index range, and then a primitive function _fold_nodes_(scalar) generates the code for our version of an MPI_Allreduce.
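A minimal sketch of that two-stage fold, with summation standing in for an arbitrary fold operation. fold_local and fold_nodes are invented names, the nodes are simulated in one process, and the cross-node combine is an ordinary loop rather than a real MPI_Allreduce.

```c
#include <assert.h>

#define NODES 4

/* Stage 1: each node folds only its local index range (fold = sum here). */
static long fold_local(long start, long stop)
{
    long acc = 0;                 /* neutral element of the fold */
    for (long i = start; i < stop; i++)
        acc += i;
    return acc;
}

/* Stage 2: combine the per-node partial results, mimicking what the
 * generated reduction primitive would do with an Allreduce. */
static long fold_nodes(long n)
{
    long block = (n + NODES - 1) / NODES;
    long partial[NODES], total = 0;
    for (int node = 0; node < NODES; node++) {
        long start = node * block;
        long stop = (node + 1) * block < n ? (node + 1) * block : n;
        if (start > stop)
            start = stop;         /* node beyond the end folds nothing */
        partial[node] = fold_local(start, stop);
    }
    for (int node = 0; node < NODES; node++)   /* the "Allreduce" step */
        total += partial[node];
    return total;
}
```

The key property is that the result is independent of how the index range is split across nodes, which is what lets the local fold and the cross-node combine be generated separately.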

Edited Apr 30, 2025 by Thomas Koopman
Source branch: distmem-and-mt