# LINC issues
https://git.astron.nl/RD/LINC/-/issues (updated 2024-03-25T08:39:45Z)

## Issue #56: LINC selfcal crash in latest master (and latest singularity image)
https://git.astron.nl/RD/LINC/-/issues/56 | Timothy Shimwell | updated 2024-03-25T08:39:45Z | assignee: alex

## Issue #60: LINC target crashes during structure function
https://git.astron.nl/RD/LINC/-/issues/60 | Roland Timmerman | updated 2024-03-19T16:49:03Z | assignee: alex

Hi,
My LINC target run crashed while working on the structure function. Here is the excerpt from the job log:
> WARNING [job structure_function] exited with status: 1
>
> ERROR [job structure_function] Job error:
>
> ("Error collecting output for parameter 'output_plot': LINC/steps/LoSoTo.Structure.cwl:60:7: Did not find output file with glob pattern: ['3C277.3ComaA_structure.png'].", {})
>
> WARNING [job structure_function] completed permanentFail
Also, I have attached the relevant cal_solutions.h5-losoto_err.log.
[cal_solutions.h5-losoto_err.log](/uploads/c7dfc40a182f5fc31c12504e3f2465cb/cal_solutions.h5-losoto_err.log)
Any and all advice is welcome! Please let me know if any more information would be helpful.
Thanks,
Roland

## Issue #61: LINC target is currently incompatible with compute clusters that restrict internet access
https://git.astron.nl/RD/LINC/-/issues/61 | Roland Timmerman | updated 2024-03-19T16:47:18Z | assignee: alex

Hi,
This may be a relatively uncommon scenario, but I am currently mainly running LINC target on a compute cluster that (for security reasons?) does not allow any internet access on the compute nodes. The login nodes have internet access, obviously, but batch jobs are run on nodes without internet access.
As far as I've encountered, this presents two problems: neither the TGSS sky model nor the IONEX data can be downloaded.
I have managed to hack my way around it by downloading these on the login nodes and then injecting them into the pipeline. For the TGSS sky model this is relatively straightforward, as the pipeline simply tries to find a target.skymodel file and only attempts a download if the sky model is not present. The IONEX data is more challenging, however, as it requires modifying the createRMh5parm.py file:
```
if ionexf == -1:
    logging.error("IONEX data not available, not even from the fast product "
                  "server. Trying to find an existing file.")
    from glob import glob
    os.system('mv ../../IONEX/* .')
    for f in glob('./*'):
        if f != "./cal_solutions.h5":  # 'is not' compared identity, not equality
            ionexf = f
    if ionexf == -1:
        logging.error("IONEX data not available")
        return -1
```
The above solution requires an additional script to download the IONEX data and place it in the temporary work directory so the pipeline can find it, so it is probably not a robust implementation that would work for everyone. I've only added it as a demonstration of how I managed to work around the problem.
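As a rough illustration, a less intrusive variant of this workaround could look for pre-staged files in a user-specified directory before giving up. This is only a sketch: `find_local_ionex` and the `LINC_IONEX_DIR` environment variable are hypothetical, not part of LINC; `-1` mirrors the "not found" convention seen in the snippet above.

```python
import glob
import os


def find_local_ionex(ionex_dir=None):
    """Return a pre-downloaded IONEX file instead of fetching one online.

    Hypothetical fallback: compute nodes read from a directory that the
    login node populated beforehand. Returns -1 when nothing is found.
    """
    ionex_dir = ionex_dir or os.environ.get("LINC_IONEX_DIR", "")
    if not ionex_dir:
        return -1
    candidates = sorted(
        f for f in glob.glob(os.path.join(ionex_dir, "*"))
        if os.path.basename(f) != "cal_solutions.h5"
    )
    return candidates[0] if candidates else -1
```

The point is simply that the download step becomes a lookup, so no internet access is needed on the compute nodes.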
To make it easier for myself, and potentially others, to keep our software up-to-date without needing to implement custom edits, it would be great if the official LINC version could be updated to accommodate users on compute infrastructure where internet access is not available.
Thanks,
Roland

## Issue #37: Structure function failing
https://git.astron.nl/RD/LINC/-/issues/37 | Timothy Shimwell | updated 2024-03-19T15:58:36Z | assignee: alex

Hey,
Sometimes the structure function fails with the error below. I'm not quite sure if or how to fix it, but wondered whether fitting the structure function could otherwise just be made optional?
Thanks
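For context, the failure below boils down to `np.log10(0)` producing `-inf`, which matplotlib then rejects as an axis limit. If making the step optional is not desired, a guard along these lines could clamp the limits first (a sketch only; `safe_log_ylim` and its fallback values are hypothetical, not LoSoTo code):

```python
import math

def safe_log_ylim(ymin, ymax, fallback=(1e-3, 1e2)):
    """Clamp y-limits before taking log10, so a log-scaled axis never
    receives NaN or Inf. Sketch only; the fallback limits are arbitrary."""
    if not (math.isfinite(ymin) and ymin > 0):
        ymin = fallback[0]   # e.g. variance bins that are exactly zero
    if not (math.isfinite(ymax) and ymax > 0):
        ymax = fallback[1]
    return math.log10(ymin), math.log10(ymax)
```

With such a guard, a call like `ax1.set_ylim(*safe_log_ylim(ymin, ymax))` cannot raise the ValueError seen below, at the cost of plotting an arbitrary lower bound.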
```
/usr/local/lib/python3.8/dist-packages/losoto/operations/structure.py:125: RuntimeWarning: divide by zero encountered in log10
  par = np.linalg.lstsq( A.T, np.log10(variance[myselect][~mask]) )[0]
/usr/local/lib/python3.8/dist-packages/losoto/operations/structure.py:125: FutureWarning: `rcond` parameter will change to the default of machine precision times ``max(M, N)`` where M and N are the input matrix dimensions.
To use the future default and silence this warning we advise to pass `rcond=None`, to keep using the old, explicitly pass `rcond=-1`.
  par = np.linalg.lstsq( A.T, np.log10(variance[myselect][~mask]) )[0]
/usr/local/lib/python3.8/dist-packages/losoto/operations/structure.py:165: UserWarning: Attempted to set non-positive bottom ylim on a log-scaled axis.
Invalid limit will be ignored.
  ax.set_ylim(ymin,ymax)
/usr/local/lib/python3.8/dist-packages/losoto/operations/structure.py:166: RuntimeWarning: divide by zero encountered in log10
  ax1.set_ylim(np.log10(ymin),np.log10(ymax))
Traceback (most recent call last):
  File "/usr/local/bin/losoto", line 154, in <module>
    returncode += ops[ op ]._run_parser( soltab, parser, step )
  File "/usr/local/lib/python3.8/dist-packages/losoto/operations/structure.py", line 16, in _run_parser
    return run(soltab, doUnwrap, refAnt, plotName, ndiv)
  File "/usr/local/lib/python3.8/dist-packages/losoto/operations/structure.py", line 166, in run
    ax1.set_ylim(np.log10(ymin),np.log10(ymax))
  File "/usr/local/lib/python3.8/dist-packages/matplotlib/axes/_base.py", line 4027, in set_ylim
    bottom = self._validate_converted_limits(bottom, self.convert_yunits)
  File "/usr/local/lib/python3.8/dist-packages/matplotlib/axes/_base.py", line 3614, in _validate_converted_limits
    raise ValueError("Axis limits cannot be NaN or Inf")
ValueError: Axis limits cannot be NaN or Inf
```

## Issue #43: make_summary issue at end of LINC_calibrator
https://git.astron.nl/RD/LINC/-/issues/43 | Neal Jackson | updated 2024-03-07T15:32:25Z | assignee: alex

I am getting a crash at the end of LINC-calibrator in make_summary (after the cal_solutions file has been created) and have an updated pull of LINC from the end of September. The full log is on http://www.jb.man.ac.uk/~njj/slurm-92481.out but the bit around the crash site is:
```
WARNING [job summary] exited with status: 1
ERROR [job summary] Job error:
("Error collecting output for parameter 'summary_file': ../../../LINC/steps/summary.cwl:137:7: Did not find output file with glob pattern: '['*.json']'.", {})
WARNING [job summary] completed permanentFail
ERROR [step summary] Output is missing expected field file:///share/nas/njj/LINC/workflows/linc_calibrator/ion.cwl#ion/summary/summary_file
ERROR [step summary] Output is missing expected field file:///share/nas/njj/LINC/workflows/linc_calibrator/ion.cwl#ion/summary/logfile
WARNING [step summary] completed permanentFail
```

## Issue #58: Index -1 not working for automatic num_SBs_per_group when doing selfcal
https://git.astron.nl/RD/LINC/-/issues/58 | Frits Sweijen | updated 2024-03-07T09:25:24Z | assignee: alex

In the target workflow the `-1` index for automatically setting a full-bandwidth concat for selfcal does not seem to work for me:
https://git.astron.nl/RD/LINC/-/blob/master/workflows/linc_target.cwl#L244
```
- id: num_SBs_per_group
source:
- num_SBs_per_group
- selfcal
valueFrom: '$(self[0] == null ? (self[-1] ? -1 : null) : self[0])'
```
Changing it to `self[1]` seems to work OK (JavaScript arrays do not support negative indices, so `self[-1]` evaluates to `undefined`). I thought we caught this during the final stages of !175, but maybe it slipped back in.

## Issue #44: Getting LINC to work on a slurm cluster
https://git.astron.nl/RD/LINC/-/issues/44 | Kelly Gourdji | updated 2024-02-28T10:19:20Z | assignee: alex

Hi,
I'm having trouble getting the LINC pipeline to run on the OzSTAR cluster in Australia, which uses slurm. I cannot seem to track down exactly what the root of the problem is, as the log file error messages are rather cryptic to me. I noticed that if I grep "error" in the *.err.log files in the work or temporary directory, all the files say the job was cancelled due to the slurm time limit. I'm surprised, because the head job failed before reaching the time limit I allocated in the batch script; I wonder if this is related to the root cause of the issue? The job ran for about 3.5 hours before failing. I have attached the batch script as well as the log file. This is for the calibrator pipeline, by the way (I have yet to proceed to target). [slurm-41973723.out](/uploads/fd631fd70668a5b95ed6a6e90ba4d567/slurm-41973723.out) [cal_linc.sh](/uploads/5282dc41ef90f3c3a8d3b6d0961a616b/cal_linc.sh)
I would appreciate any guidance on how to proceed here - I'm eager to process some DDT data:)
Many thanks,
Kelly

## Issue #53: clock phasewraps
https://git.astron.nl/RD/LINC/-/issues/53 | Timothy Shimwell | updated 2024-02-20T08:40:51Z | assignee: alex

Hi,
After chatting with Maaijke about all things clock/tec, it was brought to my attention that the "removePhaseWraps" option in the losoto clock/tec separation should only really be used for long calibrator observations (it was tested on 6 hr observations).
Presently the default is True, but for HBA our calibrator observations are typically only 10 minutes long. Should the default be changed to False?
Thanks
```
- id: removePhaseWraps
type: boolean?
doc: |
Detect and remove phase wraps, by default True.
```

## Issue #54: prep.losoto_plot_RM.losot_plot - file existing error
https://git.astron.nl/RD/LINC/-/issues/54 | Timothy Shimwell | updated 2024-02-15T14:26:32Z | assignee: alex

Hi,
Running the latest master of LINC (together with the latest singularity image) on a completely new run, I get the following error:
```
[lotss-tshimwell@ui-01 L2005609]$ cat failed_CWLJob_prep.losoto_plot_RM.losoto_plot_kind--CWLJob-instance--kjg0k_48_v1000.log
[2024-02-05T12:26:37+0100] [MainThread] [I] [toil.worker] ---TOIL WORKER OUTPUT LOG---
[2024-02-05T12:26:37+0100] [MainThread] [I] [toil] Running Toil version 5.9.2-54bfe0b146b76ecc6221de384c255e1be89547c6 on host wn-dc-03.spider.surfsara.nl.
[2024-02-05T12:26:37+0100] [MainThread] [I] [toil.worker] Working on job 'CWLJob' prep.losoto_plot_RM.losoto_plot kind-CWLJob/instance-kjg0k_48 v1
[2024-02-05T12:26:37+0100] [MainThread] [I] [toil.worker] Loaded body Job('CWLJob' prep.losoto_plot_RM.losoto_plot kind-CWLJob/instance-kjg0k_48 v1) from description 'CWLJob' prep.losoto_plot_RM.losoto_plot kind-CWLJob/instance-kjg0k_48 v1
[2024-02-05T12:26:37+0100] [MainThread] [I] [cwltool] Using local copy of Singularity image found in /project/lotss/Software/linc-master//pull
[2024-02-05T12:26:38+0100] [MainThread] [I] [cwltool] [job prep.losoto_plot_RM.losoto_plot] /project/lotss/Software/linc-master/detailed-logs/L2005609/L2005609ns69cg87$ singularity \
--quiet \
exec \
--contain \
--ipc \
--cleanenv \
--userns \
--home \
/project/lotss/Software/linc-master/detailed-logs/L2005609/L2005609ns69cg87:/DpUURT \
--bind \
/project/lotss/Software/linc-master/tmpdir/L2005609/7zlb0jy9:/tmp \
--bind \
/project/lotss/Software/linc-master/detailed-logs/L2005609/L2005609w0xoi_cc/cal_solutions.h5:/DpUURT/cal_solutions.h5:ro \
--pwd \
/DpUURT \
--net \
--network \
none \
/project/lotss/Software/linc-master/pull/astronrd_linc.sif \
bash \
run_losoto_plot.sh > /project/lotss/Software/linc-master/detailed-logs/L2005609/L2005609ns69cg87/cal_solutions.h5-losoto.log 2> /project/lotss/Software/linc-master/detailed-logs/L2005609/L2005609ns69cg87/cal_solutions.h5-losoto_err.log
[2024-02-05T12:26:41+0100] [MainThread] [W] [cwltool] [job prep.losoto_plot_RM.losoto_plot] exited with status: 1
[2024-02-05T12:26:41+0100] [MainThread] [W] [cwltool] [job prep.losoto_plot_RM.losoto_plot] completed permanentFail
Traceback (most recent call last):
File "/home/lotss-tshimwell/.local/lib/python3.9/site-packages/toil/worker.py", line 403, in workerScript
job._runner(jobGraph=None, jobStore=jobStore, fileStore=fileStore, defer=defer)
File "/home/lotss-tshimwell/.local/lib/python3.9/site-packages/toil/job.py", line 2743, in _runner
returnValues = self._run(jobGraph=None, fileStore=fileStore)
File "/home/lotss-tshimwell/.local/lib/python3.9/site-packages/toil/job.py", line 2660, in _run
return self.run(fileStore)
File "/home/lotss-tshimwell/.local/lib/python3.9/site-packages/toil/cwl/cwltoil.py", line 2301, in run
raise cwl_utils.errors.WorkflowException(status)
cwl_utils.errors.WorkflowException: permanentFail
[2024-02-05T12:26:41+0100] [MainThread] [E] [toil.worker] Exiting the worker because of a failed job on host wn-dc-03.spider.surfsara.nl
```
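Note that the singularity invocation above bind-mounts `cal_solutions.h5` read-only (`:ro`). When a tool then opens such a file for writing, one defensive pattern is to work on a copy first. The helper below is purely hypothetical, not how LINC or CWL handles this (in CWL the usual fix would be to declare the input writable):

```python
import os
import shutil

def open_writable_copy(path, workdir="."):
    """Return a path that is safe to open read-write.

    If `path` is not writable (for example a read-only container bind
    mount), copy it into the working directory and return the copy.
    Hypothetical helper, not LINC code.
    """
    if os.access(path, os.W_OK):
        return path
    dst = os.path.join(workdir, os.path.basename(path))
    shutil.copy2(path, dst)   # copy2 preserves metadata, including mode
    os.chmod(dst, 0o644)      # so make the copy explicitly writable
    return dst
```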
Looking at the error log it points to, I see:
```
[lotss-tshimwell@ui-01 L2005609]$ cat /project/lotss/Software/linc-master/detailed-logs/L2005609/L2005609ns69cg87/cal_solutions.h5-losoto_err.log
Traceback (most recent call last):
File "/usr/local/bin/losoto", line 141, in <module>
H = h5parm(args.h5parm, readonly=False)
File "/usr/local/lib/python3.8/dist-packages/losoto/h5parm.py", line 80, in __init__
self.H = tables.open_file(h5parmFile, 'r+', IO_BUFFER_SIZE=1024*1024*10, BUFFER_TIMES=500)
File "/usr/local/lib/python3.8/dist-packages/tables/file.py", line 300, in open_file
return File(filename, mode, title, root_uep, filters, **kwargs)
File "/usr/local/lib/python3.8/dist-packages/tables/file.py", line 750, in __init__
self._g_new(filename, mode, **params)
File "tables/hdf5extension.pyx", line 366, in tables.hdf5extension.File._g_new
File "/usr/local/lib/python3.8/dist-packages/tables/utils.py", line 173, in check_file_access
raise PermissionError(f"file ``{path}`` exists but it can not be written")
PermissionError: file ``/DpUURT/cal_solutions.h5`` exists but it can not be written
[lotss-tshimwell@ui-01 L2005609]$
```

## Issue #52: RM missing for months old observation
https://git.astron.nl/RD/LINC/-/issues/52 | Timothy Shimwell | updated 2024-02-15T14:25:40Z | assignee: alex

Hi,
Processing data for an observation from the 23rd of Oct 2023, I am unable to download the IONEX data:
E.g.
```
Apptainer> createRMh5parm.py -t 300 --solsetName target --prefix CODG --server http://ftp.aiub.unibe.ch/CODE/ /project/lotss/Data/linc_inputs/L2021898/L2021898_SAP000_SB001_uv.MS /project/lotss/Data/linc_outputs/Calibrator-L131782/cal_solutions.h5
pyrap will be used to compute positions
Successful readonly open of default-locked table /project/lotss/Data/linc_inputs/L2021898/L2021898_SAP000_SB001_uv.MS: 24 columns, 18377640 rows
Successful readonly open of default-locked table /project/lotss/Data/linc_inputs/L2021898/L2021898_SAP000_SB001_uv.MS/FIELD: 10 columns, 1 rows
Successful readonly open of default-locked table /project/lotss/Data/linc_inputs/L2021898/L2021898_SAP000_SB001_uv.MS/ANTENNA: 10 columns, 71 rows
Successful readonly open of default-locked table /project/lotss/Data/linc_inputs/L2021898/L2021898_SAP000_SB001_uv.MS/ANTENNA: 10 columns, 71 rows
Successful readonly open of default-locked table /project/lotss/Data/linc_inputs/L2021898/L2021898_SAP000_SB001_uv.MS: 24 columns, 18377640 rows
Successful readonly open of default-locked table /project/lotss/Data/linc_inputs/L2021898/L2021898_SAP000_SB001_uv.MS/FIELD: 10 columns, 1 rows
Successful readonly open of default-locked table /project/lotss/Data/linc_inputs/L2021898/L2021898_SAP000_SB001_uv.MS/ANTENNA: 10 columns, 71 rows
Successful readonly open of default-locked table /project/lotss/Data/linc_inputs/L2021898/L2021898_SAP000_SB001_uv.MS/ANTENNA: 10 columns, 71 rows
Traceback (most recent call last):
  File "/usr/local/bin/createRMh5parm.py", line 253, in <module>
    main(MS, h5parmdb, ionex_server=args.server, ionex_prefix=args.prefix,
  File "/usr/local/bin/createRMh5parm.py", line 160, in main
    rmdict = getRM.getRM(MS,
  File "/usr/local/lib/python3.8/dist-packages/RMextract/getRM.py", line 169, in getRM
    assert (ionexf!=-1),"error getting ionex data"
```
Looking at http://ftp.aiub.unibe.ch/CODE/2023/ I see that there are two files for the 23rd Oct 2023, but these have the names CGIM2910.23N.Z and COD0OPSFIN_20232910000_01D_01D_GIM.RNX.gz, whereas for many other dates there are a lot more files (none named CODG*, though, in the 2023 folder).
If I force the script to also look at https://igsiono.uwm.edu.pl, I also don't find any files.
Very odd, because the script finds IONEX files for far newer observations (e.g. I've successfully processed observations from Dec 2023).
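Given the mixed naming in that listing, a downloader that tolerates both conventions might build candidate filenames like this. A sketch only: `ionex_filenames` is hypothetical, and the patterns are inferred from the filenames quoted above rather than from an official specification.

```python
from datetime import date


def ionex_filenames(day, centre="COD"):
    """Candidate daily IONEX filenames in both naming schemes seen above:
    old short names like CODG2960.23I.Z next to long RINEX-style names.
    """
    doy = day.timetuple().tm_yday   # day of year, e.g. 296 for 2023-10-23
    yy = day.year % 100
    old_style = f"{centre}G{doy:03d}0.{yy:02d}I.Z"
    new_style = f"{centre}0OPSFIN_{day.year}{doy:03d}0000_01D_01D_GIM.RNX.gz"
    return [old_style, new_style]
```

A fetcher could then try each candidate in turn instead of assuming the CODG* pattern exists for every date.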
Thanks

## Issue #39: option to selfcal?
https://git.astron.nl/RD/LINC/-/issues/39 | Timothy Shimwell | updated 2024-02-01T09:17:45Z | assignee: alex

Not really an issue, but a feature request.
Have you considered adding an option to do a simple self-calibration off the image that is produced by LINC? That is, redoing the step that uses TGSS/GSM for the calibration, but using the model derived from the data instead?
Thanks,
Tim

## Issue #33: LINC_Target step fail at DP3
https://git.astron.nl/RD/LINC/-/issues/33 | Jeremy Rigney | updated 2024-01-31T09:56:55Z | assignee: alex

Hi, I'm trying to run the LINC pipeline from a singularity with cwltool on a local computing cluster. The calibrator pipeline ran successfully with 3C295; however, it is failing after ~2 hours on the target pipeline with 8 hours + 30 subbands of HBA core and remote station observations. I've attached a text file with the end of the log file:
[LINC_Target_pipeline_error.txt](/uploads/fc75932f892d1dd9341807632b62e73e/LINC_Target_pipeline_error.txt)
I ran the pipeline in verbose mode and in a DP3 log file (within the first dp3_execute folder in the log folder) the error:
`terminate called after throwing an instance of 'dyscostman::DyscoStManError'
terminate called recursively
what(): Table DataManager error: I/O error: error while writing file '/tmp/l2rwcywm/L793560_SAP000_SB001_uv.dppp.MS/table.f6' -- Error occured inside the Dysco Storage Manager`
was displayed.
The same error was then propagated down through the remaining measurement set error log files. Is this a file accessibility issue or some kind of memory error?
The command I'm running within the singularity is:
`cwltool --no-container --preserve-entire-environment --parallel --log-dir /.../HBA_target.cwl /.../LINC_target.json > logfiletar 2>&1`
(with the full directories to the .cwl and .json files)
Thanks,
Jeremy

## Issue #50: LINC LBA target workflow crashes at selfcal_target step
https://git.astron.nl/RD/LINC/-/issues/50 | Marco Iacobelli | updated 2023-12-19T14:20:22Z | assignee: alex

Hi, when running the LINC LBA target workflow (docker-hub TAG latest, digest sha256:50adec809c43a9510d2e9d271c5cf0ca99b7159df0eff6bc4ca225e8b849609d), I am getting a crash at the step selfcal_target, with the logged error being:
```
INFO [workflow selfcal_target] starting step fr
Expecting value: line 1 column 1 (char 0)
script was:
01 "use strict";
02 var inputs = {
03 "msin": [
04 {
05 "basename": "L2031540_54MHz_uv-000.dp3concat",
06 "class": "Directory",
07 "location": "/data/scratch/iacobelli/LBAtest/TMPmy8o183i/L2031540_54MHz_uv-000.dp3concat",
08 "nameext": ".dp3concat",
09 "nameroot": "L2031540_54MHz_uv-000",
10 "writable": false
11 }
12 ],
13 "skymodel": {
14 "location": "file:///data/scratch/iacobelli/LBAtest/TMPqyrt4l4x/3C380_1_demix_avg.skymodel",
15 "basename": "3C380_1_demix_avg.skymodel",
16 "nameroot": "3C380_1_demix_avg",
17 "nameext": ".skymodel",
18 "class": "File",
19 "checksum": "sha1$cbccc5a179e2d238af6a8cac5395e40e45722526",
20 "size": 2721,
21 "http://commonwl.org/cwltool#generation": 0
22 },
23 "refant": "CS004LBA",
24 "max_dp3_threads": 10,
25 "propagatesolutions": true,
26 "antennaconstraint": "&;CS302LBA;CS003LBA*;CS103LBA*;CS006LBA*;CS031LBA*;CS004LBA*;CS007LBA*;CS024LBA*;CS017LBA*;CS201LBA*;CS005LBA*;CS101LBA*;CS030LBA*;CS501LBA*;CS028LBA*;CS011LBA*;CS301LBA*;CS026LBA*;CS401LBA*;CS021LBA*;CS013LBA*;CS002LBA*",
27 "insolutions": {
28 "location": "file:///data/scratch/iacobelli/LBAtest/TMPlh_271o8/cal_solutions.h5",
29 "basename": "cal_solutions.h5",
30 "nameroot": "cal_solutions",
31 "nameext": ".h5",
32 "class": "File",
33 "checksum": "sha1$f4ae4fd30ad39d90a3bf58da9e2f8872d3547d7c",
34 "size": 684993884,
35 "format": "https://git.astron.nl/eosc/ontologies/raw/master/schema/lofar.owl#H5Parm",
36 "http://commonwl.org/cwltool#generation": 0
37 },
38 "process_baselines_target": "[CR]S*&",
39 "bad_antennas": "[CR]S*&"
40 };
41 var self = "&;CS302LBA;CS003LBA*;CS103LBA*;CS006LBA*;CS031LBA*;CS004LBA*;CS007LBA*;CS024LBA*;CS017LBA*;CS201LBA*;CS005LBA*;CS101LBA*;CS030LBA*;CS501LBA*;CS028LBA*;CS011LBA*;CS301LBA*;CS026LBA*;CS401LBA*;CS021LBA*;CS013LBA*;CS002LBA*";
42 var runtime = {
43 "tmpdir": null,
44 "outdir": null
45 };
46 (function(){return ((["["+self.replaceAll(';',',').replaceAll('','').replaceAll(';','').replaceAll('&','')+"]"]));})()
stdout was: ''
stderr was: 'evalmachine.<anonymous>:46
(function(){return ((["["+self.replaceAll(';',',').replaceAll('','').replaceAll(';','').replaceAll('&','')+"]"]));})()
^
TypeError: self.replaceAll is not a function
at evalmachine.<anonymous>:46:32
at evalmachine.<anonymous>:46:119
at Script.runInContext (vm.js:133:20)
at Script.runInNewContext (vm.js:139:17)
at Object.runInNewContext (vm.js:322:38)
at Socket.<anonymous> ([eval]:11:57)
at Socket.emit (events.js:198:13)
at addChunk (_stream_readable.js:288:12)
at readableAddChunk (_stream_readable.js:265:13)
at Socket.Readable.push (_stream_readable.js:224:10)'
Traceback (most recent call last):
File "/usr/local/lib/python3.8/dist-packages/cwl_utils/sandboxjs.py", line 509, in eval
return cast(CWLOutputType, json.loads(stdout))
File "/usr/lib/python3.8/json/__init__.py", line 357, in loads
return _default_decoder.decode(s)
File "/usr/lib/python3.8/json/decoder.py", line 337, in decode
obj, end = self.raw_decode(s, idx=_w(s, 0).end())
File "/usr/lib/python3.8/json/decoder.py", line 355, in raw_decode
raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)
```

## Issue #49: Is interleaved mode still not supported?
https://git.astron.nl/RD/LINC/-/issues/49 | Alex Kurek | updated 2023-11-30T13:05:45Z | assignee: alex

We have some observations made a long time ago in the interleaved mode. Is it still not supported by LINC? Are you planning to add support in the future? Or can I do anything else to process such data?

## Issue #48: How to stop logs going to /tmp
https://git.astron.nl/RD/LINC/-/issues/48 | Timothy Shimwell | updated 2023-11-09T13:29:56Z | assignee: alex

Hi,
Perhaps I should know how to solve this, but I don't seem to be able to figure out how to keep my useful log files from going to /tmp.
I run LINC like:
toil-cwl-runner --singularity --writeLogsFromAllJobs --stats --bypass-file-store --maxCores 40 --workDir /project/lotss/Software/linc/workdir/%s --outdir /project/lotss/Software/linc/outdir/%s/ --writeLogs /project/lotss/Software/linc/logs/%s/ --tmpdir-prefix /project/lotss/Software/linc/tmpdir/%s/ /project/lotss/Software/linc/LINC/workflows/HBA_target.cwl json-files/%s-target.json'%(fieldproperties['target_OBSID'],fieldproperties['target_OBSID'],fieldproperties['target_OBSID'],fieldproperties['target_OBSID'],fieldproperties['target_OBSID'])
But the logs that go to the --writeLogs location do not really contain the useful information about what exactly the failure is. They are named like:
failed_CWLJob_prep.createRMh5parm.createRMh5parm_kind--CWLJob-instance--us1q3c6k_v3000.log
and contain like:
...
/var/lib/cwl/stg34688df3-25b0-4b41-b753-2fb0de8a9816/L2021897_SAP001_SB466_uv.MS \
/var/lib/cwl/stg4380bf67-10c1-42b0-9837-df47882ede02/L2021897_SAP001_SB467_uv.MS \
/var/lib/cwl/stg10c9e27d-da11-4e53-8910-dfe974a36882/L2021897_SAP001_SB468_uv.MS \
/var/lib/cwl/stgab541edf-1cb3-4238-a9e5-c14f01dad3f4/L2021897_SAP001_SB469_uv.MS \
/var/lib/cwl/stg1d73d7dc-02a7-4c67-9da3-2591ee797132/L2021897_SAP001_SB470_uv.MS \
/var/lib/cwl/stga7b808b2-cef2-454d-b416-b38a5110e34b/L2021897_SAP001_SB471_uv.MS \
/var/lib/cwl/stg5aac657c-a8fa-4a74-a246-793c1ca8f90e/L2021897_SAP001_SB472_uv.MS \
/var/lib/cwl/stg486b1c0d-4c5e-4349-a911-b2e645f9a11a/L2021897_SAP001_SB473_uv.MS \
/mKPyMU/cal_solutions.h5 > /tmp/tmpogpqi_gc/createh5parm.log 2> /tmp/tmpogpqi_gc/createh5parm_err.log
The /tmp/tmpogpqi_gc/createh5parm_err.log is the file that would contain the useful information. But /tmp/ is temporary space on the node executing LINC, and when the job crashes those files are removed because the job exits the node.
Is there any way I can change this /tmp/ location to somewhere else?

## Issue #47: Should we reprocess our data?
https://git.astron.nl/RD/LINC/-/issues/47 | Alex Kurek | updated 2023-10-31T12:51:53Z

I have some fields from the LTA processed by LINC half a year or more ago. I see a lot of significant changes to LINC since then, including some even related to selfcal. If I were to process my data anew using the current LINC master, would the resulting maps from DDF or Rapthor be better?

## Issue #45: Return of the "Solution-table faraday not found in solset sol000."
https://git.astron.nl/RD/LINC/-/issues/45 | Timothy Shimwell | updated 2023-10-27T09:08:47Z | assignee: alex

Hi,
For some reason, with the latest LINC master and the latest singularity image, I get this error on the calibrator:
```
Traceback (most recent call last):
  File "/usr/local/bin/H5parm_collector.py", line 79, in <module>
    soltab = solset.getSoltab(insoltab)
  File "/usr/local/lib/python3.8/dist-packages/losoto/h5parm.py", line 611, in getSoltab
    raise Exception("Solution-table "+soltab+" not found in solset "+self.name+".")
Exception: Solution-table faraday not found in solset sol000.
Closing remaining open files:cal_solutions.h5...done/var/lib/cwl/stg5716d0cb-87e5-4dde-b687-07021f87ff42/output.h5...done
```
I am executing LINC like:
toil-cwl-runner --singularity --stats --bypass-file-store --maxCores 32 --workDir /project/lotss/Software/linc/workdir/L734367 --outdir /project/lotss/Software/linc/outdir/L734367/ --logLevel info --log-dir /project/lotss/Software/linc/logs/L734367/ --tmpdir-prefix /project/lotss/Software/linc/tmpdir/L734367/ /project/lotss/Software/linc/LINC/workflows/HBA_calibrator.cwl json-files/L734367-cal.json
This error is the same as one previously reported and fixed, so it is mysterious that it seems to be back again.

## Issue #46: dp3_max_threads not passed to certain DP3 calls in HBA target
https://git.astron.nl/RD/LINC/-/issues/46 | Frits Sweijen | updated 2023-10-23T08:58:36Z | assignee: Frits Sweijen

I noticed certain DP3 steps, e.g. dp3_make_parset_target.cwl, don't use the numthreads passed to them and still tried to use all 96 CPUs on my node at times. It could run many chunks in parallel, which caused high (>500% in this case) system load. I can make a MR addressing the places I noticed it happening.

## Issue #41: Solution-table farday missing from sol000
https://git.astron.nl/RD/LINC/-/issues/41 | Timothy Shimwell | updated 2023-10-19T09:46:08Z | assignee: alex

Hi,
I'm using the make_structure_plot branch of LINC and have also tried the latest master. I'm also using the latest singularity (pulled with singularity pull docker://astronrd/linc).
However, I am getting errors in the HBA calibrator pipeline (workflows/HBA_calibrator.cwl). Has anything been changed there recently? It was previously working fine.
On the make_structure_plot branch I'm getting the error:
```
Traceback (most recent call last):
  File "/usr/local/bin/H5parm_collector.py", line 79, in <module>
    soltab = solset.getSoltab(insoltab)
  File "/usr/local/lib/python3.8/dist-packages/losoto/h5parm.py", line 611, in getSoltab
    raise Exception("Solution-table "+soltab+" not found in solset "+self.name+".")
Exception: Solution-table faraday not found in solset sol000.
Closing remaining open files:/var/lib/cwl/stg687e91eb-5f0d-4687-9a67-827b09c5808e/output.h5...donecal_solutions.h5...done
```
Thanks,
Tim

## Issue #27: LINC demix freqstep different from prefactor?
https://git.astron.nl/RD/LINC/-/issues/27 | Frits Sweijen | updated 2023-10-04T15:42:03Z | assignees: Sarod Yatawatta, alex

I noticed that LINC chooses `demixfreqstep=4`, or 48.82 kHz (4 ch/SB), for 16 ch/SB HBA data. This is different from prefactor's `demixfreqstep=16`. Was it found that demixing at four times higher frequency resolution produces better results, or has something evaded translation?
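For reference, the channel widths these settings imply, assuming the standard LOFAR subband width of 195.3125 kHz (the 48.82 kHz quoted above is this value rounded):

```python
# Channel widths implied by the demix settings discussed above.
SUBBAND_KHZ = 195.3125     # standard LOFAR subband width (assumption)
CHANNELS_PER_SB = 16

chan_khz = SUBBAND_KHZ / CHANNELS_PER_SB   # 12.20703125 kHz per channel
linc_khz = 4 * chan_khz                    # demixfreqstep=4  -> 48.828125 kHz
prefactor_khz = 16 * chan_khz              # demixfreqstep=16 -> 195.3125 kHz (full subband)
```

So prefactor demixed at the full subband width, while LINC demixes at a quarter of it.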