Major update to the CUDA W-Tiling code. It is now integrated into the GenericOptimized proxy.