sac-group / sac2c / Merge requests / !108

EMR optimisation extension for CUDA backend

Merged: Hans-Nikolai Viessmann requested to merge hans/sac2c:hans-emr-lift-fix-cuda into develop on Apr 04, 2019

The EMR loop optimisation lifts allocations out of loop functions. Previously, the CUDA backend still generated host2device transfers and cudaMalloc/cudaFree calls inside the loop function for these lifted allocations, in effect negating the optimisation.

This MR adds a new traversal for the CUDA backend called the Minimize EMR Transfers (MEMRT) optimisation. It finds functions that have had allocations lifted out (via EMRL) and lifts out the host2device primitives that reference EMRL-lifted variables. As a result, we perform only one allocation on the device per lifted allocation, and no memory transfers within the loop. The MEMRT traversal is run after all other CUDA transfer minimization (see MTRAN, minimize_transfers.c).
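The intended effect can be illustrated with a small plain-C model. This is only a sketch, not code from sac2c or output of the compiler: `dev_alloc`, `dev_free`, `host2device`, `kernel`, `run_before` and `run_after` are hypothetical stand-ins that run on the host and merely count calls, in place of cudaMalloc, cudaFree, the host2device primitive and a kernel launch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Call counters for the stand-in device API (all names are hypothetical). */
typedef struct {
    int allocs;    /* stand-in for cudaMalloc calls */
    int frees;     /* stand-in for cudaFree calls */
    int transfers; /* stand-in for host2device copies */
} DeviceStats;

static float *dev_alloc(DeviceStats *s, size_t n) {
    float *p = calloc(n, sizeof(float));
    assert(p != NULL);
    s->allocs++;
    return p;
}

static void dev_free(DeviceStats *s, float *p) {
    s->frees++;
    free(p);
}

static void host2device(DeviceStats *s, float *dev, const float *host, size_t n) {
    s->transfers++;
    memcpy(dev, host, n * sizeof(float));
}

/* Stand-in for a kernel launch that overwrites the device buffer. */
static void kernel(float *dev, size_t n, int iter) {
    for (size_t i = 0; i < n; i++) {
        dev[i] = (float)iter;
    }
}

/* Before MEMRT (as modelled here): the host buffer has been lifted out of
 * the loop by EMRL, but the device allocation and transfer stay inside. */
DeviceStats run_before(int iters, size_t n) {
    DeviceStats s = {0, 0, 0};
    float *host_buf = calloc(n, sizeof(float)); /* EMRL-lifted allocation */
    assert(host_buf != NULL);
    for (int i = 0; i < iters; i++) {
        float *dev = dev_alloc(&s, n);
        host2device(&s, dev, host_buf, n);
        kernel(dev, n, i);
        dev_free(&s, dev);
    }
    free(host_buf);
    return s;
}

/* After MEMRT (as modelled here): the host2device for the lifted buffer is
 * hoisted too, so the loop body only launches the kernel. */
DeviceStats run_after(int iters, size_t n) {
    DeviceStats s = {0, 0, 0};
    float *host_buf = calloc(n, sizeof(float)); /* EMRL-lifted allocation */
    assert(host_buf != NULL);
    float *dev = dev_alloc(&s, n);
    host2device(&s, dev, host_buf, n);
    for (int i = 0; i < iters; i++) {
        kernel(dev, n, i);
    }
    dev_free(&s, dev);
    free(host_buf);
    return s;
}
```

In this model, `run_before(10, 16)` records ten allocations, transfers and frees, while `run_after(10, 16)` records one of each, independent of the iteration count.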

Edited Apr 16, 2019 by Hans-Nikolai Viessmann
Source branch: hans-emr-lift-fix-cuda