Improve solver speed and reduce thread-local memory use
This MR parallelizes over antennas in a nested RecursiveFor call, and limits the outer loop over frequency blocks to at most 4 threads in order to limit memory usage. The value of 4 may require further tweaking. If this approach turns out to be slower, an alternative would be to check the data size and the available memory, and to use this approach only when memory is low.
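
For reviewers, the idea can be sketched roughly as follows. This is a standalone illustration using plain `std::thread`, not the code in this MR, and it does not use RecursiveFor itself. `BoundedParallelFor`, `SolveSketch` and `kMaxOuterThreads` are hypothetical names, and the per-block `scratch` vector only stands in for whatever per-thread buffers the solver really allocates. The sketch assumes the remaining threads are simply divided over the inner antenna loop, which may differ from how the nested call distributes work.

```cpp
#include <algorithm>
#include <atomic>
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical helper: runs body(i) for i in [0, n) on at most max_threads
// threads (the calling thread counts as one of them).
template <typename Body>
void BoundedParallelFor(std::size_t n, std::size_t max_threads, Body body) {
  assert(max_threads > 0);
  const std::size_t n_threads = std::min(n, max_threads);
  std::atomic<std::size_t> next{0};
  auto worker = [&]() {
    // Each thread repeatedly claims the next unprocessed index.
    for (std::size_t i = next.fetch_add(1); i < n; i = next.fetch_add(1)) {
      body(i);
    }
  };
  std::vector<std::thread> threads;
  for (std::size_t t = 1; t < n_threads; ++t) threads.emplace_back(worker);
  if (n_threads > 0) worker();
  for (std::thread& t : threads) t.join();
}

// Visits every (frequency block, antenna) pair once and returns the peak
// number of per-block scratch buffers that were alive at the same time.
std::size_t SolveSketch(std::size_t n_blocks, std::size_t n_antennas,
                        std::size_t total_threads, std::vector<int>& visits) {
  constexpr std::size_t kMaxOuterThreads = 4;  // assumed cap, as in this MR
  const std::size_t outer =
      std::max<std::size_t>(1, std::min(kMaxOuterThreads, total_threads));
  const std::size_t inner = std::max<std::size_t>(1, total_threads / outer);

  visits.assign(n_blocks * n_antennas, 0);
  std::atomic<std::size_t> live{0};
  std::atomic<std::size_t> peak{0};

  // Outer loop over frequency blocks: at most `outer` threads, so at most
  // `outer` scratch buffers exist at any time.
  BoundedParallelFor(n_blocks, outer, [&](std::size_t block) {
    const std::size_t now = live.fetch_add(1) + 1;
    std::size_t seen = peak.load();
    while (seen < now && !peak.compare_exchange_weak(seen, now)) {
    }

    std::vector<double> scratch(n_antennas);  // stand-in for solver buffers

    // Inner loop over antennas: uses the remaining threads. Every iteration
    // writes to distinct elements, so no locking is needed.
    BoundedParallelFor(n_antennas, inner, [&](std::size_t antenna) {
      scratch[antenna] = static_cast<double>(block + antenna);
      visits[block * n_antennas + antenna] += 1;
    });

    live.fetch_sub(1);
  });
  return peak.load();
}
```

With, for example, 16 threads in total, this gives 4 outer threads with 4 inner threads each, so the number of simultaneously allocated buffers is bounded by 4 rather than by 16, while the antenna loop keeps the remaining threads busy.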
Edited by Andre Offringa