Fix CUDA errors on Maxwell
Resolve some crashes encountered when running the tests on a Titan X GPU (Maxwell architecture):
- the native atomicAdd overload for double-precision floating-point is not available prior to Pascal (compute capability 6.0)
- explicit synchronization is needed after a kernel launch and before the host accesses unified memory, to prevent a bus error
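
For illustration, a minimal sketch of both fixes under stated assumptions. The names atomic_add_double, sum_ones and run_sum are hypothetical and not taken from this change. The compare-and-swap loop follows the pattern in the CUDA C++ Programming Guide. The actual patch may be structured differently.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical helper: double-precision atomic add that also works on
// devices older than Pascal (compute capability < 6.0).
__device__ double atomic_add_double(double* address, double val) {
#if defined(__CUDA_ARCH__) && __CUDA_ARCH__ >= 600
    // Native overload, available on Pascal and newer.
    return atomicAdd(address, val);
#else
    // Fallback: reinterpret the double as a 64-bit integer and retry with
    // atomicCAS until no other thread has modified the value in between.
    unsigned long long int* address_as_ull =
        reinterpret_cast<unsigned long long int*>(address);
    unsigned long long int old = *address_as_ull;
    unsigned long long int assumed;
    do {
        assumed = old;
        old = atomicCAS(address_as_ull, assumed,
                        __double_as_longlong(val + __longlong_as_double(assumed)));
    } while (assumed != old);
    return __longlong_as_double(old);
#endif
}

// Hypothetical kernel: each of the first n threads adds 1.0 to *total.
__global__ void sum_ones(double* total, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        atomic_add_double(total, 1.0);
    }
}

// Hypothetical host wrapper showing where the synchronization goes.
double run_sum(int n) {
    double* total = nullptr;
    if (cudaMallocManaged(&total, sizeof(double)) != cudaSuccess) {
        return -1.0;  // allocation failed
    }
    *total = 0.0;  // host write while no kernel is in flight

    sum_ones<<<(n + 255) / 256, 256>>>(total, n);

    // On pre-Pascal GPUs the host must not touch managed memory while a
    // kernel may still be running, so wait for completion first.
    cudaDeviceSynchronize();

    double result = *total;  // safe to read after the synchronization
    cudaFree(total);
    return result;
}

int main() {
    printf("%f\n", run_sum(1024));
    return 0;
}
```

The architecture check keeps the native instruction on Pascal and newer, so only older devices take the slower compare-and-swap path.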