Target pipeline
Note
If you are running the deprecated genericpipeline version of the pipeline (prefactor 3.2 or older), please check the :doc:`old instructions page<target_old>`.
This pipeline processes the target data in order to apply the direction-independent corrections from the calibrator pipeline. An initial direction-independent self-calibration of the target field is then performed, using a global sky model based on the TGSS ADR or the new Global Sky Model (GSM), and applied to the data.
This chapter will present the specific steps of the target pipeline in more detail.
All results (diagnostic plots and calibration solutions) are usually stored in the --outdir directory specified with your cwltool or toil command.
Prepare target, incl. "demixing" (prep)
This part of the pipeline prepares the target data so that they are ready for the first direction-independent phase-only self-calibration against a global sky model. This mainly includes mitigation of bad data (RFI, bad antennas, contamination from A-Team sources), selection of the data to be calibrated (usually Dutch stations only), and some averaging to reduce the data size and enhance the signal-to-noise ratio. Furthermore, ionospheric Rotation Measure corrections are applied, using RMextract. The user can specify whether to flag raw or pre-processed data and whether demixing should be performed.
The basic workflows are:
- preparation of data (prep)
- concatenating and phase-only self-calibration against a global sky model (gsmcal)
- creating the final calibrated data set by applying the self-calibration solutions and compressing the data (finalize)
The workflow prep consists of:
- checking for a potential station mismatch between calibrator solutions and the target data (step compare_station_list)
- checking for nearby A-Team sources (step check_Ateam_separation)
- creating a model of A-Team sources to be subtracted (step make_sourcedb_ateam)
- getting ionospheric Rotation Measure corrections and adding them to the solutions (step createRMh5parm)
- basic flagging, applying solutions, and averaging (subworkflow dp3_prep_target):
  - edges of the band (flagedge) -- only used if raw_data : true
  - statistical flagging (aoflag) -- only used if raw_data : true
  - baseline flagging (flagbaseline)
  - low elevation flagging (below 15 degrees elevation) (flagelev)
  - low amplitude flagging (below 1e-30) (flagamp)
  - demix A-Team sources (demix) -- only used if specified demix : true
  - applying calibrator solutions (steps applyPA, applyBandpass, prep_target_applycal)
  - averaging of the data in time and frequency
  - predicting the impact of A-Team sources and writing it to the MODEL_DATA column (step predict)
  - clipping time and frequency chunks that are likely to be affected by A-Team sources (step Ateamclipper)
Phase-only self-calibration (gsmcal)
These steps aim for deriving a good first guess for the phase correction in the direction of the phase center (direction-independent phase correction).
Once this is done, the data is ready for further processing with direction-dependent calibration techniques, using software like Rapthor, factor or killMS.
The phase solutions derived from the gsmcal workflow are collected and loaded into LoSoTo to provide diagnostic plots:
- ph_freq??: matrix plot of the phase solutions with time for a particular chunk of target data, where both polarizations are color-coded
- ph_poldif_freq??: matrix plot of the XX-YY phase solutions with time for a particular chunk of target data
- ph_pol??: matrix plot of the phase solutions for the XX and YY polarization
- ph_poldif: matrix plot of the phase solutions for the XX-YY polarization
The workflow gsmcal consists of:
- retrieving and creating a global sky model (steps find_skymodel_target, make_sourcedb_target)
- identification of fully flagged antennas (step identify_bad_antennas)
- concatenating the data into chunks (subworkflow concat)
- wide-band statistical flagging (steps ms_concat and aoflag)
- checking for bad data chunks (step check_unflagged_fraction)
- performing the self-calibration against the global sky model (subworkflow calibrate_target; baseline-dependent smoothing (step BLsmooth) if specified do_smooth : true)
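The check for bad data chunks can be pictured as a simple threshold test against min_unflagged_fraction. The following is a minimal sketch, not the code of the check_unflagged_fraction step: the function names and the toy flag lists are hypothetical, and it assumes that a chunk is kept when its unflagged fraction is at least the threshold.

```python
def unflagged_fraction(flags):
    """Return the fraction of samples that are not flagged.

    `flags` is any iterable of booleans, where True means "flagged".
    """
    flags = list(flags)
    if not flags:
        return 0.0
    return sum(1 for flagged in flags if not flagged) / len(flags)


def select_chunks(chunk_flags, min_unflagged_fraction=0.5):
    """Split chunk names into (kept, rejected) by their unflagged fraction."""
    kept, rejected = [], []
    for name, flags in chunk_flags.items():
        if unflagged_fraction(flags) >= min_unflagged_fraction:
            kept.append(name)
        else:
            rejected.append(name)
    return kept, rejected


# Hypothetical chunks: one with 20% flagged data, one with 80% flagged data.
chunks = {
    "chunk_a": [True] * 2 + [False] * 8,
    "chunk_b": [True] * 8 + [False] * 2,
}
kept, rejected = select_chunks(chunks)
print("kept:", kept, "rejected:", rejected)
```

With the default threshold of 0.5, only the first chunk would be passed on to the self-calibration in this toy example.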
Finalizing the LINC output (finalize)
These steps produce the final data output and many helpful diagnostics.
The workflow finalize consists of:
- adding missing stations to the solution set with zero phase and unit amplitude (for international stations, step add_missing_stations)
- applying the phase-only self-calibration solutions to the data and compressing them (step apply_gsmcal)
- deriving the structure function of the phases (step structure_function)
- making a fast image of the target field (steps average and wsclean)
- creating plots of the uv-coverage of the final data set (step uvplot)
- creating a summary file (step summary)
The last step also incorporates full Dysco compression to save disk space. The fully calibrated data is stored in the DATA column of the final data set.
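The phase structure function relates the variance of the phase difference between pairs of stations to their separation r. A commonly used parametrization is D(r) = (r / r_diff)^beta, where r_diff is the diffractive scale (the separation at which D reaches 1 rad^2) and beta is close to 5/3 for Kolmogorov turbulence. The sketch below fits this model in log-log space. It is an illustration under stated assumptions, not the implementation of the structure_function step: the function name, the fitting approach and the synthetic input values are all hypothetical.

```python
import math


def fit_structure_function(baselines_km, phase_variances):
    """Least-squares fit of D(r) = (r / r_diff)**beta in log-log space.

    Returns (r_diff_km, beta). D is in rad^2, so D(r_diff) = 1.
    """
    xs = [math.log(r) for r in baselines_km]
    ys = [math.log(d) for d in phase_variances]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    beta = sxy / sxx                        # slope of the log-log line
    intercept = mean_y - beta * mean_x      # equals -beta * log(r_diff)
    r_diff = math.exp(-intercept / beta)
    return r_diff, beta


# Synthetic, noise-free values (not taken from a real observation).
baselines = [0.5, 1.0, 2.0, 5.0, 10.0, 30.0]
variances = [(r / 4.0) ** (5.0 / 3.0) for r in baselines]
r_diff, beta = fit_structure_function(baselines, variances)
print(f"diffractive scale: {r_diff:.1f} km, beta: {beta:.2f}")
```

A larger diffractive scale corresponds to calmer ionospheric conditions, since the phases stay coherent over longer baselines.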
Note
All solutions are written in the h5parm file format by the H5parm_collector steps, which are called during all the workflows.
The solutions are stored in the final calibrator solution set cal_solutions.h5.
Further diagnostics
The output directory will contain all relevant outputs of the current LINC run, once the pipeline has finished:
- fully calibrated datasets in results, concatenated with num_SBs_per_group subbands per MS file and averaged, if desired (see averaging options below). The DATA column of each MS contains the calibrated data (with the direction-independent solutions applied).
- logfiles in logs
- summary file (JSON format) in ??_LINC_target_summary.json
- calibration solutions in cal_solutions.h5
- inspection plots in inspection
The following diagnostics help to assess the quality of the data reduction:
- Ateam_separation.png: shows the distance and the elevation of A-Team sources with respect to the analyzed observation
- Ateamclipper.png: fraction of data flagged due to potential contamination from A-Team sources versus frequency
- unflagged_fraction.png: fraction of remaining unflagged data versus frequency
- ??_uv-coverage_uvdist.png: fraction of remaining unflagged data versus uv-distance
- ??_uv_coverage.png: the uv-coverage of the final data set
- ??_structure.png: plot of the ionospheric structure function of the processed target field
- ??-MFS-image.fits: FITS image of the target field
You can also check the calibration solutions for more details:
$ losoto -i cal_solutions.h5
Summary of cal_solutions.h5
Solution set 'calibrator':
==========================
Directions: 3c286
Stations: CS001HBA0 CS001HBA1 CS002HBA0 CS002HBA1
CS003HBA0 CS003HBA1 CS004HBA0 CS004HBA1
CS005HBA0 CS005HBA1 CS006HBA0 CS006HBA1
CS007HBA0 CS007HBA1 CS011HBA0 CS011HBA1
CS017HBA0 CS017HBA1 CS021HBA0 CS021HBA1
CS024HBA0 CS024HBA1 CS026HBA0 CS026HBA1
CS028HBA0 CS028HBA1 CS030HBA0 CS030HBA1
CS031HBA0 CS031HBA1 CS032HBA0 CS032HBA1
CS101HBA0 CS101HBA1 CS103HBA0 CS103HBA1
CS201HBA0 CS201HBA1 CS301HBA0 CS301HBA1
CS302HBA0 CS302HBA1 CS401HBA0 CS401HBA1
CS501HBA0 CS501HBA1 RS106HBA RS205HBA
RS208HBA RS210HBA RS305HBA RS306HBA
RS307HBA RS310HBA RS406HBA RS407HBA
RS409HBA RS503HBA RS508HBA RS509HBA
Solution table 'bandpass' (type: amplitude): 120 times, 372 freqs, 60 ants, 2 pols
Flagged data: 0.000%
Solution table 'clock' (type: clock): 120 times, 60 ants
Flagged data: 0.000%
Solution table 'faraday' (type: rotationmeasure): 60 ants, 120 times
Flagged data: 0.014%
Solution table 'polalign' (type: phase): 120 times, 60 ants, 1484 freqs, 2 pols
Flagged data: 0.000%
Solution set 'target':
======================
Directions: P000+00
Stations: CS001HBA0 CS001HBA1 CS002HBA0 CS002HBA1
CS003HBA0 CS003HBA1 CS004HBA0 CS004HBA1
CS005HBA0 CS005HBA1 CS006HBA0 CS006HBA1
CS007HBA0 CS007HBA1 CS011HBA0 CS011HBA1
CS017HBA0 CS017HBA1 CS021HBA0 CS021HBA1
CS024HBA0 CS024HBA1 CS026HBA0 CS026HBA1
CS028HBA0 CS028HBA1 CS030HBA0 CS030HBA1
CS031HBA0 CS031HBA1 CS032HBA0 CS032HBA1
CS101HBA0 CS101HBA1 CS103HBA0 CS103HBA1
CS201HBA0 CS201HBA1 CS301HBA0 CS301HBA1
CS302HBA0 CS302HBA1 CS401HBA0 CS401HBA1
CS501HBA0 CS501HBA1 RS106HBA RS205HBA
RS208HBA RS210HBA RS305HBA RS306HBA
RS307HBA RS310HBA RS406HBA RS407HBA
RS409HBA RS503HBA RS508HBA RS509HBA
Solution table 'RMextract' (type: rotationmeasure): 60 ants, 119 times
Flagged data: 0.000%
Solution table 'TGSSphase' (type: phase): 3446 times, 58 ants, 1 freq, 2 pols
Flagged data: 0.000%
History: 2021-07-30 11:25:44: Bad stations 'CS006HBA1', 'CS006HBA0' have not been added
back.
For an overall summary it is advised to check the summary logfile:
$ cat logs/???_summary.log
************************************
*** LINC target pipeline summary ***
************************************
Field name: P000+00
User-specified baseline filter: [CR]S*&
Additional antennas removed from the data: CS006HBA1, CS006HBA0
A-Team sources close to the phase reference center: NONE
XX diffractive scale: 4.4 km
YY diffractive scale: 4.0 km
Changes applied to cal_solutions.h5:
2021-07-30 11:25:44: Bad stations 'CS006HBA1', 'CS006HBA0' have not been added back.
Amount of flagged solutions per station and solution table:
Station bandpass clock faraday polalign RMextract TGSSphase
CS001HBA0 0.29% 0.00% 0.00% 0.00% 0.00% 0.00%
CS001HBA1 0.29% 0.00% 0.00% 0.00% 0.00% 0.00%
CS002HBA0 0.29% 0.00% 0.00% 0.00% 0.00% 0.05%
CS002HBA1 0.29% 0.00% 0.00% 0.00% 0.00% 0.00%
CS003HBA0 0.29% 0.00% 0.00% 0.00% 0.00% 0.00%
CS003HBA1 0.29% 0.00% 0.00% 0.00% 0.00% 0.05%
CS004HBA0 0.29% 0.00% 0.00% 0.00% 0.00% 0.05%
CS004HBA1 6.05% 0.00% 0.00% 0.00% 0.00% 0.05%
CS005HBA0 0.29% 0.00% 0.00% 0.00% 0.00% 0.05%
CS005HBA1 0.39% 0.00% 0.00% 0.00% 0.00% 0.00%
CS006HBA0 0.29% 0.00% 0.00% 0.00% 0.00%
CS006HBA1 0.29% 0.00% 0.00% 0.00% 0.00%
Amount of flagged data per station at a given state:
Station initial prep Ateam final
CS001HBA0 5.13% 5.41% 11.12% 22.74%
CS001HBA1 5.13% 5.41% 11.03% 22.51%
CS002HBA0 5.12% 5.39% 11.39% 23.18%
CS002HBA1 5.12% 5.40% 21.09% 29.95%
CS003HBA0 5.12% 5.39% 9.92% 22.58%
CS003HBA1 5.12% 5.40% 11.37% 23.95%
CS004HBA0 5.12% 5.40% 13.27% 24.62%
CS004HBA1 5.12% 5.40% 12.24% 23.53%
CS005HBA0 5.12% 5.40% 11.59% 23.38%
CS005HBA1 5.12% 15.36% 20.07% 30.09%
CS006HBA0 100.00% 100.00% 100.00%
CS006HBA1 100.00% 100.00% 100.00%
**********
Summary file is written to: ???_LINC_target_summary.json
Summary has been created.
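The per-station tables in the summary log are plain whitespace-separated text, so they can be scanned with a few lines of Python. The following is a minimal sketch, assuming the table layout shown in the example log; the function name and the 25% threshold are hypothetical. Rows with fewer values than columns (such as CS006HBA0) are matched to the leading columns, which is an assumption, since the original column alignment is not preserved here.

```python
def parse_flag_table(text):
    """Parse a 'Station <col> <col> ...' table into {station: {col: percent}}."""
    columns, table = None, {}
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue
        if fields[0] == "Station":
            # Header row: remember the column names.
            columns = fields[1:]
        elif columns and len(fields) > 1 and all(f.endswith("%") for f in fields[1:]):
            # Data row: strip the percent signs and pair values with columns.
            values = [float(f[:-1]) for f in fields[1:]]
            table[fields[0]] = dict(zip(columns, values))
    return table


# A few rows copied from the example summary log.
sample = """\
Station initial prep Ateam final
CS002HBA1 5.12% 5.40% 21.09% 29.95%
CS005HBA1 5.12% 15.36% 20.07% 30.09%
CS003HBA0 5.12% 5.39% 9.92% 22.58%
CS006HBA0 100.00% 100.00% 100.00%
"""
table = parse_flag_table(sample)

# Stations whose flagged fraction exceeds a hypothetical 25% threshold at any state.
suspicious = sorted(s for s, row in table.items() if max(row.values()) > 25.0)
print(suspicious)
```

The function expects a single table as input, so the two tables of the log should be passed separately.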
User-defined parameter configuration
Parameters you will need to adjust
Location of the target data and calibrator solutions
- msin: location of the input target data; for instructions, see the :doc:`configuration instructions<parset>` page
- cal_solutions: location of the calibrator solutions; for instructions, see the :doc:`configuration instructions<parset>` page
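As an illustration of how these two parameters might be written to an input file, the sketch below builds a JSON document using the generic CWL convention of File and Directory objects. This is an assumption about the expected layout; the configuration instructions page is authoritative. The helper name and all paths are hypothetical.

```python
import json


def make_linc_input(ms_paths, cal_solutions_path, **options):
    """Build a CWL-style input object (layout assumed, see lead-in)."""
    config = {
        # Measurement sets are directories on disk.
        "msin": [{"class": "Directory", "path": p} for p in ms_paths],
        # The calibrator solutions are a single h5parm file.
        "cal_solutions": {"class": "File", "path": cal_solutions_path},
    }
    # Optional parameters from this page, e.g. refant or min_unflagged_fraction.
    config.update(options)
    return config


config = make_linc_input(
    ["/data/L123456_SB000_uv.MS", "/data/L123456_SB001_uv.MS"],  # hypothetical paths
    "/data/calibrator/cal_solutions.h5",                          # hypothetical path
    refant="CS00.*",
    min_unflagged_fraction=0.5,
)
print(json.dumps(config, indent=2))
```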
Parameters you may need to adjust
Data selection and calibration options
- refant: regular expression of the stations that are allowed to be selected as a reference antenna by the pipeline (default: CS00.*)
- flag_baselines: DP3-compatible pattern for baselines or stations to be flagged (may be an empty list, i.e.: [])
- process_baselines_target: performs A-Team-clipping/demixing and direction-independent phase-only self-calibration only on these baselines. Choose [CR]S*& if you want to process only cross-correlations and remove international stations (default: [CR]S*&)
- filter_baselines: selects only this set of baselines to be processed. Choose [CR]S*& if you want to process only cross-correlations and remove international stations (default: [CR]S*&)
- do_smooth: enable or disable baseline-based smoothing (default: false)
- rfistrategy: strategy to be applied with the statistical flagger (AOFlagger, default: HBAdefault.rfis)
- min_unflagged_fraction: minimal fraction of unflagged data to be accepted for further processing of the data chunk (default: 0.5)
- raw_data: use autoweight; set to true in case you are using raw data (default: false)
- compression_bitrate: defines the bitrate of Dysco compression of the data after the final step; choose 0 if you do NOT want to compress the data
- propagatesolutions: use already derived solutions as initial guess for the upcoming time slot
- apply_tec: apply TEC solutions from the calibrator (default: false)
- apply_clock: apply clock solutions from the calibrator (default: true)
- apply_phase: apply full phase solutions from the calibrator (default: false)
- apply_RM: apply ionospheric Rotation Measure from RMextract (default: true)
- apply_beam: apply element beam corrections (default: true)
- gsmcal_step: type of calibration to be performed in the self-calibration step (default: phase)
- updateweights: update the WEIGHT_SPECTRUM column in a way consistent with the weights being inversely proportional to the autocorrelations (default: true)
- use_target: enable downloading of a target skymodel (default: true)
- skymodel_source: choose the target skymodel from TGSS ADR or the new Global Sky Model (GSM) (default: TGSS)
A comprehensive explanation of the baseline selection syntax can be found here.
Demixing and clipping options
- demix_sources: choose sources to demix (provided as list), e.g., [CasA,CygA]
- demix_freqres: frequency resolution used when demixing (default: 48.82kHz, which translates to 4 channels per subband)
- demix_timeres: time resolution used when demixing in seconds (default: 10)
- demix: if true, force demixing using all sources of demix_sources; if false, do not demix (default: null, automatically determines sources to be demixed according to min_separation)
- lbfgs_historysize: for the LBFGS solver: the history size, specified as a multiple of the parameter vector, to use to approximate the inverse Hessian (default: 10)
- lbfgs_robustdof: for the LBFGS solver: the degrees of freedom (DOF) given to the noise model (default: 200)
- clip_sources: list of the skymodel patches to be used for Ateamclipping (default: [VirA_4_patch,CygAGG,CasA_4_patch,TauAGG]). An empty list means including all sources.
- clipAteam: enables A-Team clipping using the source list from clip_sources (default: true)
Further pipeline options
- min_separation: minimal accepted distance to an A-Team source on the sky in degrees (will raise a WARNING, default: 30)
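The interplay between demix, demix_sources and min_separation can be sketched as follows. This is a hedged illustration, not the pipeline's code: the function names are hypothetical, the source positions are approximate J2000 values rounded for this example, and the behaviour of the automatic mode (demix: null) is an assumed interpretation of the parameter description above.

```python
import math


def angular_separation_deg(ra1, dec1, ra2, dec2):
    """Great-circle separation in degrees (all inputs in degrees)."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    # Haversine form, numerically stable for small separations.
    h = (math.sin((dec2 - dec1) / 2) ** 2
         + math.cos(dec1) * math.cos(dec2) * math.sin((ra2 - ra1) / 2) ** 2)
    return math.degrees(2 * math.asin(min(1.0, math.sqrt(h))))


# Approximate J2000 positions (RA, Dec) in degrees, for illustration only.
ATEAM = {
    "CasA": (350.85, 58.81),
    "CygA": (299.87, 40.73),
    "TauA": (83.63, 22.01),
    "VirA": (187.71, 12.39),
}


def sources_to_demix(demix, demix_sources, target_radec, min_separation=30.0):
    """Tri-state `demix` option.

    True  -> demix every source in `demix_sources`
    False -> demix nothing
    None  -> demix only listed sources closer than `min_separation`
             (assumed interpretation of the automatic mode)
    """
    if demix is True:
        return list(demix_sources)
    if demix is False:
        return []
    ra, dec = target_radec
    return [s for s in demix_sources
            if angular_separation_deg(ra, dec, *ATEAM[s]) < min_separation]


target = (300.0, 40.0)  # hypothetical pointing close to CygA
print(sources_to_demix(None, ["CasA", "CygA"], target))
```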
Parameters for pipeline performance
- max_dp3_threads: number of threads per process for DP3 (default: 10)
- memoryperc: maximum amount of memory used for aoflagger in raw_flagging mode, in percent (default: 20)
- aoflag_reorder: make aoflagger reorder the measurement set before running the detection. This prevents aoflagger from using its memory reading mode, which is faster but uses more memory (default: false, see the AOFlagger manual)
- aoflag_chunksize: this will split the set into intervals with the given maximum size, and flag each interval independently. This lowers the amount of memory required (default: 2000)
Skymodel directory
- A-Team_skymodel: location of the A-Team skymodels
- target_skymodel: location of a user-defined target skymodel used for the self-calibration
Averaging for the target data
- avg_timeresolution: intermediate time resolution of the data in seconds after averaging (default: 4)
- avg_freqresolution: intermediate frequency resolution of the data after averaging (default: 48.82kHz, which translates to 4 channels per subband)
- avg_timeresolution_concat: final time resolution of the data in seconds after averaging and concatenation (default: 8)
- avg_freqresolution_concat: final frequency resolution of the data after averaging and concatenation (default: 97.64kHz, which translates to 2 channels per subband)
- num_SBs_per_group: make concatenated measurement sets with that many subbands; choose a high number if running LBA (default: 10)
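The channel counts quoted for the default resolutions follow from the LOFAR subband width. The short calculation below assumes the 200 MHz clock, for which a subband is 195.3125 kHz wide; the helper name is hypothetical.

```python
SUBBAND_WIDTH_KHZ = 195.3125  # LOFAR subband width, assuming the 200 MHz clock


def channels_per_subband(freq_resolution_khz, subband_width_khz=SUBBAND_WIDTH_KHZ):
    """Number of channels per subband at a given frequency resolution."""
    return round(subband_width_khz / freq_resolution_khz)


intermediate = channels_per_subband(48.82)   # avg_freqresolution default
final = channels_per_subband(97.64)          # avg_freqresolution_concat default

num_SBs_per_group = 10                       # default group size
channels_per_group = num_SBs_per_group * final

# Going from 4 s to 8 s halves the number of time slots.
time_averaging_factor = 8 / 4

print(intermediate, final, channels_per_group, time_averaging_factor)
```

With the defaults, each concatenated measurement set therefore holds 10 subbands at 2 channels each.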
Concatenating of the target data
- num_SBs_per_group: make concatenated measurement sets with that many subbands (default: 10)
- reference_stationSB: station-subband number to use as reference for grouping (default: None, i.e. use the lowest-frequency input data as reference)
RMextract settings
- ionex_server: URL of the IONEX server (default: "http://ftp.aiub.unibe.ch/CODE/")
- ionex_prefix: the prefix of the IONEX files (default: CODG)
- proxy_server: specify URL or IP of proxy server if needed
- proxy_port: port of proxy server if needed
- proxy_user: user name of proxy server if needed
- proxy_pass: password of proxy server if needed
In case of LBA observations you might also want to enable demixing (demix: true).
If your LBA data has not been demixed before, you may still want to keep the A-Team-clipping.