Add CUDA gridder and degridder tuning script based on kernel_tuner
The CUDA gridder and degridder kernels have several parameters that can be tuned for best performance. Tuning is a tedious process that was previously done by hand. After the last changes to these kernels, they were retuned for only one GPU architecture (Ampere). Consequently, performance on some other architectures is now suboptimal.
This MR adds scripts that make tuning the kernels much simpler, using the excellent kernel_tuner by Ben van Werkhoven. Moreover, the new tooling supports research into the energy efficiency of the kernels.
In a follow-up MR, I plan to run the tuning on several architectures and add the resulting parameters to the code.