Add workaround for CUDA memory fragmentation issues
When an allocation fails even though sufficient device memory should be available, try to allocate all device memory.
According to Frits Sweijen, this workaround indeed works for his use case:
that branch seems to work! It's reporting
Memory fragmentation detected, retrying to allocate device memory.
CUDA Error at /net/achterrijn/data2/sweijen/software/2020_11_23/idg/src/idg-lib/src/CUDA/common/CU.cpp:194 in function cuMemAlloc(&_ptr, size): out of memory
Memory fragmentation detected, retrying to allocate device memory.
CUDA Error at /net/achterrijn/data2/sweijen/software/2020_11_23/idg/src/idg-lib/src/CUDA/common/CU.cpp:194 in function cuMemAlloc(&_ptr, size): out of memory
Memory fragmentation detected, retrying to allocate device memory.
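
For illustration only, a minimal sketch of how such a retry could be structured. This is not the code in CU.cpp. DeviceApi and alloc_with_retry are hypothetical names, the driver calls (cuMemAlloc, cuMemFree, cuMemGetInfo in a real build) are injected as callbacks so the sketch runs without a GPU, and the step of claiming and releasing all free memory before retrying is an assumed reading of "allocate all device memory".

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <iostream>

// Hypothetical stand-ins for the CUDA driver calls; injected so the
// sketch can run without a device.
struct DeviceApi {
  std::function<bool(void**, std::size_t)> alloc;  // returns true on success
  std::function<void(void*)> free;
  std::function<std::size_t()> free_bytes;         // free device memory
};

// Hypothetical helper, not the actual IDG implementation.
bool alloc_with_retry(const DeviceApi& api, void** ptr, std::size_t size,
                      int max_retries = 3) {
  assert(ptr != nullptr);
  if (api.alloc(ptr, size)) return true;

  for (int attempt = 0; attempt < max_retries; ++attempt) {
    // Treat the failure as fragmentation only if enough memory is
    // reported free; otherwise the device is genuinely out of memory.
    if (api.free_bytes() < size) return false;

    std::clog << "Memory fragmentation detected, "
                 "retrying to allocate device memory.\n";

    // Assumption: claim all free memory in one block and release it
    // again before retrying the original request.
    void* all = nullptr;
    if (api.alloc(&all, api.free_bytes())) api.free(all);

    if (api.alloc(ptr, size)) return true;
  }
  return false;
}
```

Bounding the number of retries keeps the failure path finite, and checking the reported free memory first avoids retrying when the device really is full.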