Let CUDA::allocate_memory return managed memory
Currently, memory allocated through the CUDA::allocate_memory functions outlives the returned Memory object. This is effectively a memory leak, although the memory is still freed at the end of the Proxy lifetime rather than only when the application exits. Even so, this behavior is unexpected and unwanted. This MR makes sure that the memory is freed when the Memory object is destroyed.
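
To illustrate the intended ownership semantics, here is a minimal sketch of an owning Memory type. It is not the code in this MR: `backend_allocate`, `backend_free`, `live_allocations`, and the shape of `Memory` and `allocate_memory` are hypothetical stand-ins, and `malloc`/`free` take the place of the real device allocation calls so the example runs without a GPU.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <memory>
#include <new>

// Hypothetical bookkeeping so the example can observe when memory is freed.
static int live_allocations = 0;

// Hypothetical stand-in for the device allocator (assumption: the real
// code would call into the CUDA API here instead of malloc).
void* backend_allocate(std::size_t size) {
  void* ptr = std::malloc(size);
  if (ptr == nullptr) throw std::bad_alloc();
  ++live_allocations;
  return ptr;
}

// Hypothetical stand-in for the matching device free call.
void backend_free(void* ptr) {
  std::free(ptr);
  --live_allocations;
}

// Hypothetical Memory type: it owns its allocation, so the allocation is
// released when the Memory object is destroyed, not at some later point.
class Memory {
 public:
  explicit Memory(std::size_t size)
      : ptr_(backend_allocate(size), &backend_free), size_(size) {}

  void* data() const { return ptr_.get(); }
  std::size_t size() const { return size_; }

 private:
  // The custom deleter ties the lifetime of the allocation to this object.
  std::unique_ptr<void, void (*)(void*)> ptr_;
  std::size_t size_;
};

// Hypothetical counterpart of allocate_memory: ownership moves to the caller.
Memory allocate_memory(std::size_t size) { return Memory(size); }

// Returns the number of live allocations after a Memory object has gone
// out of scope.
int live_after_scope() {
  {
    Memory memory = allocate_memory(1024);
    assert(live_allocations == 1);  // allocation is alive inside the scope
  }  // memory is destroyed here, which frees the allocation
  return live_allocations;
}
```

With this kind of ownership, a caller that lets the Memory object go out of scope does not need to rely on the Proxy being destroyed for the allocation to be released.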