Draft: MERL-105 Introduce python DPBuffers that move instead of copy data

This can save significant time in pipelines, because copying data is an expensive operation, especially when running memory-bound processing.

This requires a change to uvector in aocommon, so that a uvector can wrap existing memory that is passed to it. See the merge request here: https://gitlab.com/aroffringa/aocommon/-/merge_requests/217

Several changes are required to ensure that Python does not delete the data out from under us:

  1. Introduce emplace_* methods on the Python DPBuffer that emplace data instead of copying it
  2. Have these methods increment the reference count of the Python object and hold a handle to it, so that Python does not garbage-collect the data out from under us
  3. Pass this handle via a std::function wrapper to the DPBuffer/uvectors, so that they can call the function to let Python deallocate when appropriate
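
The three steps above can be sketched in plain C++ as follows. This is a minimal illustration, not the actual aocommon or DP3 code: WrappedBuffer, FakePyArray and Emplace are hypothetical names, and an integer counter stands in for the Python reference count so the example compiles without Python. In the real bindings the callback would presumably release a Python handle rather than decrement a counter.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical buffer that wraps memory it did not allocate.
// Illustrative only; this is NOT the aocommon UVector interface.
class WrappedBuffer {
 public:
  WrappedBuffer() = default;

  // Wrap `size` floats at `data` without copying. `release` is called once,
  // when the buffer no longer needs the memory.
  WrappedBuffer(float* data, std::size_t size, std::function<void()> release)
      : data_(data), size_(size), release_(std::move(release)) {}

  // Copying is disabled so that the release callback cannot run twice.
  WrappedBuffer(const WrappedBuffer&) = delete;
  WrappedBuffer& operator=(const WrappedBuffer&) = delete;

  // Moving transfers the pointer and the callback; the source becomes empty.
  WrappedBuffer(WrappedBuffer&& other) noexcept { Swap(other); }
  WrappedBuffer& operator=(WrappedBuffer&& other) noexcept {
    if (this != &other) {
      Reset();
      Swap(other);
    }
    return *this;
  }

  ~WrappedBuffer() { Reset(); }

  float* data() const { return data_; }
  std::size_t size() const { return size_; }

 private:
  void Reset() {
    if (release_) release_();  // Let the owner deallocate if it wants to.
    release_ = nullptr;
    data_ = nullptr;
    size_ = 0;
  }

  void Swap(WrappedBuffer& other) noexcept {
    std::swap(data_, other.data_);
    std::swap(size_, other.size_);
    std::swap(release_, other.release_);
  }

  float* data_ = nullptr;
  std::size_t size_ = 0;
  std::function<void()> release_;
};

// Stand-in for the Python object: `refcount` plays the role of the Python
// reference count (an assumption made purely for this sketch).
struct FakePyArray {
  std::vector<float> storage;
  int refcount = 1;
};

// Steps 1-3: take a reference, wrap the memory without copying, and hand the
// buffer a callback that drops the reference again.
WrappedBuffer Emplace(FakePyArray& array) {
  ++array.refcount;
  return WrappedBuffer(array.storage.data(), array.storage.size(),
                       [&array] { --array.refcount; });
}
```

Because the buffer is move-only and clears its callback after calling it, the release runs exactly once per emplace, which is the property that rules out both double frees and leaks in this sketch.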

This approach should never cause double frees, and should not leak memory.

There are, however, some caveats:

  • We now have to use uvector for all buffers, not just the data buffer
  • Memory alignment can be an issue: either the alignment requirements need to be relaxed on the C++ side (which I've temporarily done), or Python needs to align the memory correctly, which would be preferable but is perhaps harder to achieve?
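
One possible middle ground for the alignment caveat is to check the incoming pointer on the C++ side and only wrap it when it meets the requirement, falling back to a copy otherwise. The sketch below shows just that check. IsAligned, Transfer and ChooseTransfer are hypothetical names, and the 64-byte figure used in the usage note is an assumption, since the alignment DP3 actually requires is not stated here.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// True if `ptr` is aligned to `alignment` bytes.
// `alignment` must be a power of two.
inline bool IsAligned(const void* ptr, std::size_t alignment) {
  return (reinterpret_cast<std::uintptr_t>(ptr) & (alignment - 1)) == 0;
}

// Hypothetical policy: wrap the memory when it is suitably aligned,
// otherwise fall back to copying it into aligned storage.
enum class Transfer { kWrap, kCopy };

inline Transfer ChooseTransfer(const void* ptr, std::size_t alignment) {
  return IsAligned(ptr, alignment) ? Transfer::kWrap : Transfer::kCopy;
}
```

For example, ChooseTransfer(ptr, 64) would select the zero-copy path only for pointers that sit on a 64-byte boundary, so misaligned input costs a copy instead of risking undefined behaviour in code that assumes aligned loads.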

A very rudimentary benchmark, run with a basic Python script to test this functionality:


Streaming data into DPBuffer

Using move buffer setup time: 0.05361814989009872 buffer process time: 6.4640420528594404
Using copy buffer setup time: 2.409742674266454 buffer process time: 7.804229561821558
Using copy buffer setup time: 2.504001561435871 buffer process time: 7.5073683312512
Using move buffer setup time: 0.8196774645475671 buffer process time: 10.609952771803364

Edited by Malcolm MacLeod
