MERL-97: Optimise PreFlagger
I'm working on the SKA pre-processing pipeline, which uses DP3. I noticed that PreFlagger was taking longer than ApplyCal on SKA Low datasets, even though all we do is flag one contiguous range of channels. These datasets have 13,824 channels.
For that specific use case, this update makes PreFlagger 24x faster, but please note that all optimisations in this MR are quite generic.
Changes:
- Every type of flagging in PreFlagger follows the policy "flag one pol means flag all pols". Therefore, instead of storing flags internally as a 3D array (nbsl, nchan, npol), we can just use a 2D array (nbsl, nchan). That's a 4x speedup right here.
- Flags were internally stored as `int` (32 bit), now using `uint8_t` which is less wasteful. About 1.5x speedup.
- The compiler would not vectorise the `flagChannels()` function, a bit disappointing on the part of xtensor. Rewrote the loop a bit more explicitly, another 4x speedup.
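
For reviewers who want a feel for the idea without reading the diff, here is a minimal standalone sketch of the three changes combined: a 2D (baseline, channel) layout, one `uint8_t` per flag, and an explicit inner loop over a contiguous channel range. It is not the DP3 code. `FlagBuffer`, `FlagChannelRange` and `CountFlagged` are hypothetical names made up for this illustration, and whether a given compiler actually vectorises the loop depends on the compiler and its flags.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical flag buffer (not a DP3 type): one uint8_t per
// (baseline, channel), stored row-major, with no polarisation axis.
struct FlagBuffer {
  std::size_t n_baselines;
  std::size_t n_channels;
  std::vector<std::uint8_t> data;

  FlagBuffer(std::size_t n_bl, std::size_t n_chan)
      : n_baselines(n_bl), n_channels(n_chan), data(n_bl * n_chan, 0) {}
};

// Flag the half-open channel range [begin, end) on every baseline.
// The inner loop writes to contiguous bytes through a raw pointer,
// which is a simple pattern for a compiler to vectorise.
void FlagChannelRange(FlagBuffer& flags, std::size_t begin, std::size_t end) {
  assert(begin <= end && end <= flags.n_channels);
  for (std::size_t bl = 0; bl < flags.n_baselines; ++bl) {
    std::uint8_t* row = flags.data.data() + bl * flags.n_channels;
    for (std::size_t ch = begin; ch < end; ++ch) {
      row[ch] = 1;
    }
  }
}

// Helper for checking results: number of flagged (baseline, channel) cells.
std::size_t CountFlagged(const FlagBuffer& flags) {
  return static_cast<std::size_t>(
      std::count(flags.data.begin(), flags.data.end(), std::uint8_t{1}));
}
```

With this layout, a "flag all pols" result is expanded to the polarisation axis only when the flags are written out, so the hot loop touches a quarter of the elements, each a quarter of the size of a 32-bit `int`.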