Merge/cleanup/fix CUDA W-Tiling code

Bram Veenboer requested to merge merge-cuda-wtiling into master

The W-Tiling code of GenericOptimized is moved to common/CUDA, and Generic now uses the same code. The Unified Memory submode is added to this shared set of W-Tiling routines. All submodes are tested and, where needed, fixed. Some documentation is added to the constructors of Generic and GenericOptimized to explain the capabilities of the proxies.

Edited by Bram Veenboer
