COB-148: Enable NVRTC for CUDA kernel compilation
The NVRTC code has been updated. With NVRTC enabled, the tSubbandProcPerformance test showed that the FIR_Filter kernel ran substantially slower than when compiled with NVCC. A few changes to this kernel not only resolve the slowdown but also make the kernel run considerably faster. The other kernels show no significant performance differences, so NVRTC is now enabled by default.
Edited by Bram Veenboer