Draft: use cmake to build, test, and install

Closed Bouwe Andela requested to merge cmake-build into master
3 unresolved threads

Use CMake to build, test, and install the library.

See the instructions in README.md for how to make use of the new feature.

It would be good to make a release of the cudawrappers package, so we can use a released version instead of the tests branch.

Edited by Bouwe Andela

Activity

  • Bouwe Andela added 1 commit

    • a506abb6 - Download cudawrappers using CMake

  • Bouwe Andela added 1 commit

    • 2b44d1d1 - Use same name for package as for repo

    • The number of polarizations should be 2, the number of bits per sample should be 4, 8, or 16 (BTW, not all GPUs support 4 and 8 bits), and there are a few more constraints on other parameters (see test/CorrelatorTest/Options.cc). The run times can be long because of the validity checks on the output (recomputing all results on the CPU can be ~100x slower than on the GPU). As the code path differs considerably for different numbers of receivers, it is useful to test a wide range of receiver counts (including all kinds of odd values like prime numbers), up to, say, 768 receivers. The runtime can be limited by reducing the number of channels to a few (if the code works for a few channels, it is likely to work for a large number of channels as well).
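To make the advice above concrete, here is a minimal sketch of how a test driver might check the two constraints stated in this comment and pick receiver counts to sweep. The `TestParams` struct and all function names are hypothetical and do not come from the repository; the remaining constraints in test/CorrelatorTest/Options.cc are not reproduced, and the choice of "primes plus powers of two plus the maximum" is only one possible reading of "a wide range of receiver numbers".

```cpp
#include <cassert>
#include <set>
#include <vector>

// Hypothetical parameter bundle for illustration only; the real options
// are defined in test/CorrelatorTest/Options.cc.
struct TestParams {
  unsigned nrReceivers;
  unsigned nrPolarizations;
  unsigned nrBits;
  unsigned nrChannels;
};

// Checks only the two constraints stated in the discussion:
// 2 polarizations and 4, 8, or 16 bits per sample.
bool satisfiesStatedConstraints(const TestParams &p) {
  const bool bitsOk = p.nrBits == 4 || p.nrBits == 8 || p.nrBits == 16;
  return p.nrPolarizations == 2 && bitsOk;
}

// Trial division; sufficient for receiver counts in the hundreds.
bool isPrime(unsigned n) {
  if (n < 2) return false;
  for (unsigned d = 2; d * d <= n; ++d)
    if (n % d == 0) return false;
  return true;
}

// Returns a sorted list of receiver counts to test: every prime up to
// maxReceivers, every power of two up to maxReceivers, and maxReceivers
// itself. The lower bound of 2 is an assumption.
std::vector<unsigned> receiverCounts(unsigned maxReceivers) {
  assert(maxReceivers >= 2);
  std::set<unsigned> counts;
  for (unsigned n = 2; n <= maxReceivers; ++n)
    if (isPrime(n)) counts.insert(n);
  for (unsigned p = 2;; p *= 2) {
    counts.insert(p);
    if (p > maxReceivers / 2) break;  // next power would exceed the maximum
  }
  counts.insert(maxReceivers);
  return std::vector<unsigned>(counts.begin(), counts.end());
}
```

A driver could call `receiverCounts(768)`, keep `nrChannels` small for every entry as suggested above, and skip any combination for which `satisfiesStatedConstraints` returns false. If the full list makes the run too long, sampling a subset of it is an obvious alternative.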

    • I added a few more test cases, but it's probably best if you update this yourself later because you understand better what is going on. Currently, the tests run in about half a minute with srun -N 1 -C gpunode --gres=gpu:1 make -C build test ARGS=-j16 on DAS6. I'm a bit confused about the OpenCL test though, as it does not seem to use the library interface, but the kernel directly. Is that on purpose?

      We could probably speed up the tests by caching the CPU-computed results in a file if the runtime becomes too long (i.e., more than a few minutes).

  • Bouwe Andela added 1 commit

    • The OpenCL test is a bit odd indeed, as mixing CUDA and OpenCL seems not to be supported, but the example shows how one can make it work (OpenCL does not work on Jetson though).

      In those cases where the runtime becomes excessive, such a file would be excessive in size as well. Better limit the number of channels then.

    • Another possibility would be to compute a hash of the result and store that in the file.
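The hash idea could look roughly like the sketch below, which uses 64-bit FNV-1a as one simple choice of hash; the thread does not name a specific algorithm. The function names and the `std::int32_t` element type are assumptions made for illustration, not the library's actual output type. Note also that comparing hashes only works if the GPU output is bit-identical to the reference result, which is itself an assumption here.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// 64-bit FNV-1a over a raw byte range.
std::uint64_t fnv1a64(const void *data, std::size_t size) {
  assert(data != nullptr || size == 0);
  const auto *bytes = static_cast<const unsigned char *>(data);
  std::uint64_t hash = 0xcbf29ce484222325ULL;  // FNV offset basis
  for (std::size_t i = 0; i < size; ++i) {
    hash ^= bytes[i];
    hash *= 0x100000001b3ULL;  // FNV prime
  }
  return hash;
}

// Hashes a result buffer. The int32 element type is hypothetical; the
// hash depends on the byte order of the machine that computes it.
std::uint64_t hashResult(const std::vector<std::int32_t> &result) {
  return fnv1a64(result.data(), result.size() * sizeof(std::int32_t));
}
```

A test would then store one 64-bit value per parameter combination instead of the full CPU result and compare it with `hashResult` of the GPU output. The trade-off is that a mismatch tells you only that something differs, not where.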

  • Bram Veenboer mentioned in merge request !5 (merged)

  • closed
