Segfault while creating Plan with large number of baselines

This issue was reported on 24-12-2020 by Yuping Huang.

I was trying to image some simulated DSA-2000 data with IDG in WSClean, and it segfaulted early in the run. I used WSClean tag v2.10.1 and IDG 0.8.0. The command and output are as follows.

wsclean -no-update-model-required -niter 0 -size 1000 1000 -scale 0.5asec
-weight natural -make-psf -use-idg -idg-mode hybrid -verbose -name noise
DSA_2000_snapshot_bright.ms

WSClean version 2.10.1 (2020-07-20)
This software package is released under the GPL version 3.
Author: André Offringa (offringa@gmail.com).

Using image size of 1000 x 1000, padded to 1200 x 1200.
First measurement set has corrected data: tasks will be applied on the
corrected data column.
Total nr of channels found in measurement sets: 1
=== IMAGING TABLE ===
       # Pol Ch JG ²G In Freq(MHz)
| Independent group:
+-+-J- 0  I   0  0  0  0  695-705 (1)

 == Constructing PSF ==
Precalculating weights for natural weighting...
Opening DSA_2000_snapshot_bright.ms, spw 0 with contiguous MS reader.
WARNING: This measurement set has no or an invalid WEIGHT_SPECTRUM column;
will use less informative WEIGHT column.
Mapping measurement set rows... DONE (0-2089990; 2089990 rows)

trying to open config file
reading config file
IDG version  0.8:HEAD:55f2e64b
CUDA::default_info
Searching for source files in: /home/yuping/idg-env/lib/idg-cuda
Temporary files will be stored in: /tmp/idg-mPj6Su
CUDA::CUDA
InstanceCUDA
set_parameters
compile_kernels
Searching for source files in: /home/yuping/idg-env/lib/idg-cuda
Temporary files will be stored in: /tmp/idg-Jvkck8
Compiling /tmp/idg-mPj6Su/Adder.cubin
Compiling /tmp/idg-mPj6Su/Calibrate.cubin
/usr/local/cuda/bin/nvcc -cubin  -use_fast_math  -G -src-in-ptx -arch=sm_75
-DNR_POLARIZATIONS=4 -I/home/yuping/idg-env/include -o
/tmp/idg-mPj6Su/Calibrate.cubin
/home/yuping/idg-env/lib/idg-cuda/KernelCalibrate.cu
/usr/local/cuda/bin/nvcc -cubin  -use_fast_math  -G -src-in-ptx -arch=sm_75
-DNR_POLARIZATIONS=4 -I/home/yuping/idg-env/include -DTILE_SIZE_GRID=128 -o
/tmp/idg-mPj6Su/Adder.cubin /home/yuping/idg-env/lib/idg-cuda/KernelAdder.cu
Compiling /tmp/idg-mPj6Su/Degridder.cubin
/usr/local/cuda/bin/nvcc -cubin  -use_fast_math  -G -src-in-ptx -arch=sm_75
-DNR_POLARIZATIONS=4 -I/home/yuping/idg-env/include -DBATCH_SIZE=256 -o
/tmp/idg-mPj6Su/Degridder.cubin
/home/yuping/idg-env/lib/idg-cuda/KernelDegridder.cu
Compiling /tmp/idg-mPj6Su/Scaler.cubin
/usr/local/cuda/bin/nvcc -cubin  -use_fast_math  -G -src-in-ptx -arch=sm_75
-DNR_POLARIZATIONS=4 -I/home/yuping/idg-env/include -o
/tmp/idg-mPj6Su/Scaler.cubin
/home/yuping/idg-env/lib/idg-cuda/KernelScaler.cu
Compiling /tmp/idg-mPj6Su/Gridder.cubin
/usr/local/cuda/bin/nvcc -cubin  -use_fast_math  -G -src-in-ptx -arch=sm_75
-DNR_POLARIZATIONS=4 -I/home/yuping/idg-env/include -DBATCH_SIZE=128 -o
/tmp/idg-mPj6Su/Gridder.cubin
/home/yuping/idg-env/lib/idg-cuda/KernelGridder.cu
Compiling /tmp/idg-mPj6Su/Splitter.cubin
/usr/local/cuda/bin/nvcc -cubin  -use_fast_math  -G -src-in-ptx -arch=sm_75
-DNR_POLARIZATIONS=4 -I/home/yuping/idg-env/include -DTILE_SIZE_GRID=128 -o
/tmp/idg-mPj6Su/Splitter.cubin
/home/yuping/idg-env/lib/idg-cuda/KernelSplitter.cu
Devices:
GeForce RTX 2080 Ti
Device memory : 10783 Mb  / 10989 Mb (free / total)
Shared memory : 48.00 Kb
Clk frequency : 1545 Ghz
Mem frequency : 7000 Ghz
Number of SM  : 68
Mem bus width : 352 bit
Mem bandwidth : 616 GB/s
Number of threads  : 1024
Capability    : 75
Unified memory : 1


Compiler flags:
 -use_fast_math  -G -src-in-ptx -arch=sm_75 -DNR_POLARIZATIONS=4
-I/home/yuping/idg-env/include

GenericOptimized::GenericOptimized
InstanceCPU
load_shared_objects
Loading:
/home/yuping/idg-env/lib/idg-cpu/Optimized/libcpu-optimized-kernel-gridder.so
Loading:
/home/yuping/idg-env/lib/idg-cpu/Optimized/libcpu-optimized-kernel-degridder.so
Loading:
/home/yuping/idg-env/lib/idg-cpu/Optimized/libcpu-optimized-kernel-calibrate.so
Loading:
/home/yuping/idg-env/lib/idg-cpu/Optimized/libcpu-optimized-kernel-adder.so
Loading:
/home/yuping/idg-env/lib/idg-cpu/Optimized/libcpu-optimized-kernel-splitter.so
Loading:
/home/yuping/idg-env/lib/idg-cpu/Optimized/libcpu-optimized-kernel-fft.so
Loading:
/home/yuping/idg-env/lib/idg-cpu/Optimized/libcpu-optimized-kernel-adder-wstack.so
Loading:
/home/yuping/idg-env/lib/idg-cpu/Optimized/libcpu-optimized-kernel-splitter-wstack.so
Loading:
/home/yuping/idg-env/lib/idg-cpu/Optimized/libcpu-optimized-kernel-adder-wtiles.so
Loading:
/home/yuping/idg-env/lib/idg-cpu/Optimized/libcpu-optimized-kernel-splitter-wtiles.so
load_kernel_funcions
CPU
Optimized
Opening DSA_2000_snapshot_bright.ms, spw 0 with contiguous MS reader.
WARNING: This measurement set has no or an invalid WEIGHT_SPECTRUM column;
will use less informative WEIGHT column.
Mapping measurement set rows... DONE (0-2089990; 2089990 rows)
Selected channels: 0-1
Determining min and max w & theoretical beam size... DONE
(w=[1.63647e-07:2.78806] lambdas, maxuvw=35025 lambda)
Theoretic beam = 5.89''
m_padded_size: 1200
nr_w_layers: 1
IDG subgrid size: 32
Detected 93 GB of system memory, usage not limited.
Allocatable timesteps (2045 stations, 1 channels, 93 GB mem): 181
Plan::Plan (with WTiles)
Plan::initialize
kernel_size  : 15
subgrid_size : 32
grid_size    : 1200
Segmentation fault (core dumped)

You can find a similar dataset at http://www.tauceti.caltech.edu/yuping/DSA_2000_10s_1chan_fun.ms.tar.gz. I did have an idg.conf in the directory setting buffersize=1. I was able to image a 256-antenna dataset but not this one, so maybe something went wrong while it was allocating memory?
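
For a sense of scale, here is a rough sketch of how the number of baselines grows between the 256-antenna case that worked and the 2045-station case in the log above. It is plain arithmetic, not IDG code: the function name `baseline_count` is made up for illustration, and it assumes cross-correlations only (no autocorrelations), which is consistent with the 2089990 rows reported for this measurement set. The 181 timesteps come from the "Allocatable timesteps" line in the log. Whether any index or buffer size inside Plan actually overflows at this scale is not established here.

```python
def baseline_count(nr_stations: int) -> int:
    """Number of cross-correlation baselines for nr_stations (hypothetical helper)."""
    return nr_stations * (nr_stations - 1) // 2


small = baseline_count(256)    # the dataset that imaged fine
large = baseline_count(2045)   # the dataset that segfaults

# Timesteps per buffer as reported in the log ("Allocatable timesteps ...: 181").
timesteps = 181
entries_per_buffer = large * timesteps  # baseline-timestep pairs in one buffer

print(f"256 stations : {small} baselines")
print(f"2045 stations: {large} baselines ({large / small:.1f}x more)")
print(f"baseline-timestep pairs per buffer: {entries_per_buffer}")
```

The 2045-station count matches the row count in the log (one timestep of 2089990 baselines), so anything in the plan that is sized per baseline is roughly 64 times larger than in the 256-antenna run.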
