    Target pipeline

    Note

    If you are running the deprecated genericpipeline version of the pipeline (prefactor 3.2 or older), please check the :doc:`old instructions page<target_old>`.

    This pipeline processes the target data and applies the direction-independent corrections derived by the calibrator pipeline. An initial direction-independent self-calibration of the target field is then performed against a global sky model based on TGSS ADR or the new Global Sky Model (GSM), and the resulting solutions are applied to the data.

    This chapter will present the specific steps of the target pipeline in more detail.

    All results (diagnostic plots and calibration solutions) are usually stored in the --outdir directory specified in your cwltool or toil command.

    targetscheme.png

    Prepare target, incl. "demixing" (prep)

    This part of the pipeline prepares the target data for the first direction-independent phase-only self-calibration against a global sky model. This mainly includes mitigation of bad data (RFI, bad antennas, contamination from A-Team sources), selection of the data to be calibrated (usually Dutch stations only), and some averaging to reduce the data size and enhance the signal-to-noise ratio. Furthermore, ionospheric Rotation Measure corrections are applied, using RMextract. The user can specify whether to flag raw or pre-processed data and whether demixing should be performed.

    The basic workflows are:

    • preparation of data (prep)
    • concatenating and phase-only self-calibration against a global sky model (gsmcal)
    • creating the final calibrated data set by applying the self-calibration solutions and compressing the data (finalize)
    The workflow prep consists of:
    • checking for a potential station mismatch between calibrator solutions and the target data (step compare_station_list)
    • checking for nearby A-Team sources (step check_Ateam_separation)
    • creating a model of A-Team sources to be subtracted (step make_sourcedb_ateam)
    • getting ionospheric Rotation Measure corrections and adding them to the solutions (step createRMh5parm)
    RMextract.png
    • basic flagging, applying solutions, and averaging (subworkflow dp3_prep_target)
      • edges of the band (flagedge) -- only used if raw_data : true
      • statistical flagging (aoflag) -- only used if raw_data : true
      • baseline flagging (flagbaseline)
      • low elevation flagging (below 15 degrees elevation) (flagelev)
      • low amplitude flagging (below 1e-30) (flagamp)
      • demix A-Team sources (demix) -- only used if demix : true
      • applying calibrator solutions (steps applyPA, applyBandpass, prep_target_applycal)
      • averaging of the data in time and frequency
      • predicting the impact of A-Team sources and writing it to the MODEL_DATA column (step predict)
      • clipping time and frequency chunks that are likely to be affected by A-Team sources (step Ateamclipper)

    Phase-only self-calibration (gsmcal)

    These steps aim to derive a good first estimate of the phase correction in the direction of the phase center (direction-independent phase correction).

    Once this is done, the data is ready for further processing with direction-dependent calibration techniques, using software like Rapthor, factor or killMS.

    The phase solutions derived from the gsmcal workflow are collected and loaded into LoSoTo to provide diagnostic plots:

    • ph_freq??: matrix plot of the phase solutions with time for a particular chunk of target data, where both polarizations are color-coded
      ph_freq.png
    • ph_poldif_freq??: matrix plot of the XX-YY phase solutions with time for a particular chunk of target data
      ph_poldif_freq.png
    • ph_pol??: matrix plot of the phase solutions for the XX and YY polarization
      ph_polXX.png
    • ph_poldif: matrix plot of the XX-YY phase solutions
      ph_poldif.png
    The workflow gsmcal consists of:
    • retrieving and creating a global sky model (steps find_skymodel_target, make_sourcedb_target)
    • identification of fully flagged antennas (step identify_bad_antennas)
    • concatenating the data into chunks (subworkflow concat)
    • wide-band statistical flagging (steps ms_concat and aoflag)
    • checking for bad data chunks (step check_unflagged_fraction)
    • performing the self-calibration against the global sky model (subworkflow calibrate_target), with baseline-dependent smoothing (step BLsmooth) if do_smooth : true is specified
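
    The check for bad data chunks can be pictured as a simple threshold on the unflagged fraction. The following is a minimal sketch, not the code of the check_unflagged_fraction step: the function name, the boolean flag input, and the example chunks are assumptions made for illustration. Only the 0.5 default of min_unflagged_fraction is taken from this page, and whether the real step uses a strict or non-strict comparison is not stated here.

```python
def chunk_is_usable(flags, min_unflagged_fraction=0.5):
    """Return True if enough samples of a data chunk remain unflagged.

    flags: iterable of booleans, True meaning "sample is flagged".
    """
    flags = list(flags)
    if not flags:
        return False  # an empty chunk carries no usable data
    unflagged_fraction = flags.count(False) / len(flags)
    # Assumption: a chunk exactly at the threshold is still accepted.
    return unflagged_fraction >= min_unflagged_fraction


# Hypothetical chunks: 30% and 70% of the samples are flagged.
good_chunk = [True] * 3 + [False] * 7
bad_chunk = [True] * 7 + [False] * 3
print(chunk_is_usable(good_chunk))  # True
print(chunk_is_usable(bad_chunk))   # False
```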

    Finalizing the LINC output (finalize)

    These steps produce the final data output and many helpful diagnostics.

    The workflow finalize consists of:
    • adding missing stations to the solution set with zero phase and unit amplitude (for international stations, step add_missing_stations)
    • applying the phase-only self-calibration solutions to the data and compressing the data (step apply_gsmcal)
    • deriving the structure function of the phases (step structure_function)
    • making a fast image of the target field (steps average and wsclean)
    • creating plots of the uv-coverage of the final data set (step uvplot)
    • creating a summary file (step summary)

    The last step also incorporates full Dysco compression to save disk space. The fully calibrated data is stored in the DATA column of the final data set.

    Note

    All solutions are written in the h5parm file format by the H5parm_collector steps, which are called throughout the workflows.

    The solutions are stored in the final calibrator solution set cal_solutions.h5.

    Further diagnostics

    The output directory will contain all relevant outputs of the current LINC run, once the pipeline has finished:
    • fully calibrated datasets in results, concatenated with num_SBs_per_group subbands per MS file and averaged, if desired (see averaging options below). The DATA column of each MS contains the calibrated data (with the direction-independent solutions applied).
    • logfiles in logs
    • summary file (JSON format) in ??_LINC_target_summary.json
    • calibration solutions in cal_solutions.h5
    • inspection plots in inspection

    The following diagnostics help to assess the quality of the data reduction:

    • Ateam_separation.png: shows the distance and the elevation of A-Team sources with respect to the analyzed observation
      Ateam_separation.png
    • Ateamclipper.png: fraction of flagged data due to their potential contamination from A-Team sources versus frequency
      Ateamclipper.png
    • unflagged_fraction.png: fraction of remaining unflagged data versus frequency
      unflagged_fraction.png
    • ??_uv-coverage_uvdist.png: fraction of remaining unflagged data versus uv-distance
      uv-coverage_uvdist.png
    • ??_uv_coverage.png: the uv-coverage of the final data set
      uv-coverage.png
    • ??_structure.png: plot of the ionospheric structure function of the processed target field
      structure.png
    • ??-MFS-image.fits: FITS image of the target field
      target_field.png

    You can also check the calibration solutions for more details:

    $ losoto -i cal_solutions.h5
    
    Summary of cal_solutions.h5
    
    
    Solution set 'calibrator':
    ==========================
    
    Directions: 3c286
    
    Stations: CS001HBA0     CS001HBA1       CS002HBA0       CS002HBA1
              CS003HBA0     CS003HBA1       CS004HBA0       CS004HBA1
              CS005HBA0     CS005HBA1       CS006HBA0       CS006HBA1
              CS007HBA0     CS007HBA1       CS011HBA0       CS011HBA1
              CS017HBA0     CS017HBA1       CS021HBA0       CS021HBA1
              CS024HBA0     CS024HBA1       CS026HBA0       CS026HBA1
              CS028HBA0     CS028HBA1       CS030HBA0       CS030HBA1
              CS031HBA0     CS031HBA1       CS032HBA0       CS032HBA1
              CS101HBA0     CS101HBA1       CS103HBA0       CS103HBA1
              CS201HBA0     CS201HBA1       CS301HBA0       CS301HBA1
              CS302HBA0     CS302HBA1       CS401HBA0       CS401HBA1
              CS501HBA0     CS501HBA1       RS106HBA        RS205HBA
              RS208HBA      RS210HBA        RS305HBA        RS306HBA
              RS307HBA      RS310HBA        RS406HBA        RS407HBA
              RS409HBA      RS503HBA        RS508HBA        RS509HBA
    
    Solution table 'bandpass' (type: amplitude): 120 times, 372 freqs, 60 ants, 2 pols
        Flagged data: 0.000%
    
    Solution table 'clock' (type: clock): 120 times, 60 ants
        Flagged data: 0.000%
    
    Solution table 'faraday' (type: rotationmeasure): 60 ants, 120 times
        Flagged data: 0.014%
    
    Solution table 'polalign' (type: phase): 120 times, 60 ants, 1484 freqs, 2 pols
        Flagged data: 0.000%
    
    Solution set 'target':
    ======================
    
    Directions: P000+00
    
    Stations: CS001HBA0     CS001HBA1       CS002HBA0       CS002HBA1
              CS003HBA0     CS003HBA1       CS004HBA0       CS004HBA1
              CS005HBA0     CS005HBA1       CS006HBA0       CS006HBA1
              CS007HBA0     CS007HBA1       CS011HBA0       CS011HBA1
              CS017HBA0     CS017HBA1       CS021HBA0       CS021HBA1
              CS024HBA0     CS024HBA1       CS026HBA0       CS026HBA1
              CS028HBA0     CS028HBA1       CS030HBA0       CS030HBA1
              CS031HBA0     CS031HBA1       CS032HBA0       CS032HBA1
              CS101HBA0     CS101HBA1       CS103HBA0       CS103HBA1
              CS201HBA0     CS201HBA1       CS301HBA0       CS301HBA1
              CS302HBA0     CS302HBA1       CS401HBA0       CS401HBA1
              CS501HBA0     CS501HBA1       RS106HBA        RS205HBA
              RS208HBA      RS210HBA        RS305HBA        RS306HBA
              RS307HBA      RS310HBA        RS406HBA        RS407HBA
              RS409HBA      RS503HBA        RS508HBA        RS509HBA
    
    Solution table 'RMextract' (type: rotationmeasure): 60 ants, 119 times
        Flagged data: 0.000%
    
    Solution table 'TGSSphase' (type: phase): 3446 times, 58 ants, 1 freq, 2 pols
        Flagged data: 0.000%
        History: 2021-07-30 11:25:44: Bad stations 'CS006HBA1', 'CS006HBA0' have not been added
                                      back.

    For an overall summary it is advised to check the summary logfile:

    $ cat logs/???_summary.log
    
    ************************************
    *** LINC target pipeline summary ***
    ************************************
    
    Field name: P000+00
    
    User-specified baseline filter: [CR]S*&
    Additional antennas removed from the data: CS006HBA1, CS006HBA0
    A-Team sources close to the phase reference center: NONE
    
    XX diffractive scale: 4.4 km
    YY diffractive scale: 4.0 km
    
    Changes applied to cal_solutions.h5:
    2021-07-30 11:25:44: Bad stations 'CS006HBA1', 'CS006HBA0' have not been added back.
    
    Amount of flagged solutions per station and solution table:
    Station   bandpass    clock    faraday  polalign  RMextract TGSSphase
    CS001HBA0    0.29%     0.00%     0.00%     0.00%     0.00%     0.00%
    CS001HBA1    0.29%     0.00%     0.00%     0.00%     0.00%     0.00%
    CS002HBA0    0.29%     0.00%     0.00%     0.00%     0.00%     0.05%
    CS002HBA1    0.29%     0.00%     0.00%     0.00%     0.00%     0.00%
    CS003HBA0    0.29%     0.00%     0.00%     0.00%     0.00%     0.00%
    CS003HBA1    0.29%     0.00%     0.00%     0.00%     0.00%     0.05%
    CS004HBA0    0.29%     0.00%     0.00%     0.00%     0.00%     0.05%
    CS004HBA1    6.05%     0.00%     0.00%     0.00%     0.00%     0.05%
    CS005HBA0    0.29%     0.00%     0.00%     0.00%     0.00%     0.05%
    CS005HBA1    0.39%     0.00%     0.00%     0.00%     0.00%     0.00%
    CS006HBA0    0.29%     0.00%     0.00%     0.00%     0.00%
    CS006HBA1    0.29%     0.00%     0.00%     0.00%     0.00%
    
    Amount of flagged data per station at a given state:
    Station    initial  prep    Ateam   final
    CS001HBA0   5.13%   5.41%  11.12%  22.74%
    CS001HBA1   5.13%   5.41%  11.03%  22.51%
    CS002HBA0   5.12%   5.39%  11.39%  23.18%
    CS002HBA1   5.12%   5.40%  21.09%  29.95%
    CS003HBA0   5.12%   5.39%   9.92%  22.58%
    CS003HBA1   5.12%   5.40%  11.37%  23.95%
    CS004HBA0   5.12%   5.40%  13.27%  24.62%
    CS004HBA1   5.12%   5.40%  12.24%  23.53%
    CS005HBA0   5.12%   5.40%  11.59%  23.38%
    CS005HBA1   5.12%  15.36%  20.07%  30.09%
    CS006HBA0 100.00% 100.00% 100.00%
    CS006HBA1 100.00% 100.00% 100.00%
    
    **********
    Summary file is written to: ???_LINC_target_summary.json
    Summary has been created.
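
    Tables like the ones in this summary can also be scanned programmatically, for example to list stations whose final flagged fraction exceeds a limit. The sketch below parses a short excerpt of the summary shown above. The function names and the 25% limit are illustrative assumptions, and the parsing relies only on the column layout visible in that excerpt.

```python
SUMMARY_EXCERPT = """\
Amount of flagged data per station at a given state:
Station    initial  prep    Ateam   final
CS001HBA0   5.13%   5.41%  11.12%  22.74%
CS002HBA1   5.12%   5.40%  21.09%  29.95%
CS005HBA1   5.12%  15.36%  20.07%  30.09%
CS006HBA0 100.00% 100.00% 100.00%
"""


def parse_flag_table(text):
    """Map station name -> {state: flagged percentage} for one table."""
    lines = text.strip().splitlines()
    # Locate the header row, which starts with the word "Station".
    start = next(i for i, line in enumerate(lines) if line.startswith("Station"))
    states = lines[start].split()[1:]
    table = {}
    for line in lines[start + 1:]:
        fields = line.split()
        if not fields or not fields[-1].endswith("%"):
            break  # end of the table
        values = [float(f.rstrip("%")) for f in fields[1:]]
        # Removed stations have fewer columns; zip() keeps what is present.
        table[fields[0]] = dict(zip(states, values))
    return table


def heavily_flagged(table, limit=25.0):
    """Stations whose 'final' flagged percentage exceeds the limit.

    Stations without a 'final' entry (removed earlier) are skipped.
    """
    return sorted(s for s, row in table.items() if row.get("final", 0.0) > limit)


table = parse_flag_table(SUMMARY_EXCERPT)
print(heavily_flagged(table))  # ['CS002HBA1', 'CS005HBA1']
```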

    User-defined parameter configuration

    Parameters you will need to adjust

    Location of the target data and calibrator solutions

    • msin: location of the input target data; for instructions, see the :doc:`configuration instructions<parset>` page
    • cal_solutions: location of the calibrator solutions; for instructions, see the :doc:`configuration instructions<parset>` page
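
    As an illustration, the two required inputs can be collected in a JSON input file for the CWL runner. This is a sketch only: it assumes that msin is given as a list of CWL Directory objects and cal_solutions as a CWL File object, and all paths as well as the output file name are placeholders. The configuration instructions page remains the authoritative reference.

```python
import json

# Placeholder paths; replace them with your own data locations.
target_measurement_sets = [
    "/data/target/L000000_SB000_uv.MS",
    "/data/target/L000000_SB001_uv.MS",
]

linc_input = {
    # Assumed form: one CWL "Directory" object per input measurement set.
    "msin": [{"class": "Directory", "path": p} for p in target_measurement_sets],
    # Assumed form: the calibrator solutions as a CWL "File" object.
    "cal_solutions": {"class": "File", "path": "/data/calibrator/cal_solutions.h5"},
}

text = json.dumps(linc_input, indent=2)
print(text)
# To save the configuration, write `text` to a file, e.g. target_input.json.
```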

    Parameters you may need to adjust

    Data selection and calibration options

    • refant: regular expression of the stations that are allowed to be selected as a reference antenna by the pipeline (default: CS00.*)
    • flag_baselines: DP3-compatible pattern for baselines or stations to be flagged (may be an empty list, i.e.: [] )
    • process_baselines_target: performs A-Team-clipping/demixing and direction-independent phase-only self-calibration only on these baselines. Choose [CR]S*& if you want to process only cross-correlations and remove international stations (default: [CR]S*&)
    • filter_baselines: selects only this set of baselines to be processed. Choose [CR]S*& if you want to process only cross-correlations and remove international stations (default: [CR]S*&)
    • do_smooth: enable or disable baseline-based smoothing (default: false)
    • rfistrategy: strategy to be applied with the statistical flagger (AOFlagger, default: HBAdefault.rfis)
    • min_unflagged_fraction: minimal fraction of unflagged data to be accepted for further processing of the data chunk (default: 0.5)
    • raw_data: enable autoweight; set to true if you are processing raw data (default: false)
    • compression_bitrate: defines the bitrate of Dysco compression of the data after the final step, choose 0 if you do NOT want to compress the data
    • propagatesolutions: use already derived solutions as initial guess for the upcoming time slot
    • apply_tec: apply TEC solutions from the calibrator (default: false)
    • apply_clock: apply clock solutions from the calibrator (default: true)
    • apply_phase: apply full phase solutions from the calibrator (default: false)
    • apply_RM: apply ionospheric Rotation Measure from RMextract (default: true)
    • apply_beam: apply element beam corrections (default: true)
    • gsmcal_step: type of calibration to be performed in the self-calibration step (default: phase)
    • updateweights: update the WEIGHT_SPECTRUM column in a way consistent with the weights being inversely proportional to the autocorrelations (default: true)
    • use_target: enable downloading of a target skymodel (default: true)
    • skymodel_source: choose the target skymodel from TGSS ADR or the new Global Sky Model (GSM) (default: TGSS)

    A comprehensive explanation of the baseline selection syntax can be found here.
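
    To get a feeling for which stations a pattern such as [CR]S*& selects, the station part of the pattern can be tried out with shell-style matching. The sketch below uses Python's fnmatch module as a stand-in for the DP3 parser, which is only an approximation of the full selection syntax. The trailing & (cross-correlations only) is stripped and not evaluated, and the international station name DE601HBA is included purely as an example.

```python
from fnmatch import fnmatchcase

stations = ["CS001HBA0", "CS001HBA1", "RS106HBA", "DE601HBA"]


def select_stations(pattern, names):
    """Return the station names matched by the station part of a pattern."""
    station_glob = pattern.rstrip("&")  # '&' only restricts the correlation type
    return [name for name in names if fnmatchcase(name, station_glob)]


print(select_stations("[CR]S*&", stations))
# ['CS001HBA0', 'CS001HBA1', 'RS106HBA']
```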

    Demixing and clipping options

    • demix_sources: choose sources to demix (provided as list), e.g., [CasA,CygA]
    • demix_freqres: frequency resolution used when demixing (default: 48.82kHz, which translates to 4 channels per subband)
    • demix_timeres: time resolution used when demixing in seconds (default: 10)
    • demix: if true, force demixing of all sources in demix_sources; if false, do not demix (default: null, which automatically determines the sources to be demixed according to min_separation)
    • lbfgs_historysize: for the LBFGS solver: the history size, specified as a multiple of the parameter vector, to use to approximate the inverse Hessian (default: 10)
    • lbfgs_robustdof: for the LBFGS solver: the degrees of freedom (DOF) given to the noise model (default: 200)
    • clip_sources: list of the skymodel patches to be used for Ateamclipping (default: [VirA_4_patch,CygAGG,CasA_4_patch,TauAGG]). An empty list means including all sources.
    • clipAteam: enables A-Team clipping using the source list from clip_sources (default: true)

    Further pipeline options

    • min_separation: minimal accepted distance to an A-Team source on the sky in degrees (a smaller distance will raise a WARNING; default: 30)
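
    The idea behind min_separation can be sketched as a great-circle distance test between the target position and each A-Team source. This is not the code of the check_Ateam_separation step, which also takes the elevation of the sources into account according to its diagnostic plot. The source positions below are approximate J2000 coordinates given for illustration, and the function names and the target position are assumptions.

```python
import math

# Approximate J2000 positions in degrees (RA, Dec); illustrative values only.
A_TEAM = {
    "CasA": (350.85, 58.81),
    "CygA": (299.87, 40.73),
    "TauA": (83.63, 22.01),
    "VirA": (187.71, 12.39),
}


def angular_separation(ra1, dec1, ra2, dec2):
    """Great-circle distance in degrees between two sky positions (degrees)."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    # Haversine formula, numerically stable for small separations.
    h = (math.sin((dec2 - dec1) / 2) ** 2
         + math.cos(dec1) * math.cos(dec2) * math.sin((ra2 - ra1) / 2) ** 2)
    return math.degrees(2 * math.asin(min(1.0, math.sqrt(h))))


def nearby_sources(target_ra, target_dec, min_separation=30.0):
    """A-Team sources closer to the target than min_separation degrees."""
    return sorted(
        name for name, (ra, dec) in A_TEAM.items()
        if angular_separation(target_ra, target_dec, ra, dec) < min_separation
    )


# Hypothetical target position, 10 degrees south of Cygnus A.
print(nearby_sources(299.87, 30.73))  # ['CygA']
```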

    Parameters for pipeline performance

    • max_dp3_threads: number of threads per process for DP3 (default: 10)
    • memoryperc: maximum memory used by aoflagger in raw_flagging mode, in percent (default: 20)
    • aoflag_reorder: make aoflagger reorder the measurement set before running the detection. This prevents aoflagger from using its memory reading mode, which is faster but uses more memory (default: false, see the AOFlagger manual)
    • aoflag_chunksize: this will split the set into intervals with the given maximum size, and flag each interval independently. This lowers the amount of memory required (default: 2000)

    Skymodel directory

    • A-Team_skymodel: location of the A-Team skymodels
    • target_skymodel: location of a user-defined target skymodel used for the self-calibration

    Averaging for the target data

    • avg_timeresolution: intermediate time resolution of the data in seconds after averaging (default: 4)
    • avg_freqresolution: intermediate frequency resolution of the data after averaging (default: 48.82kHz, which translates to 4 channels per subband)
    • avg_timeresolution_concat: final time resolution of the data in seconds after averaging and concatenation (default: 8)
    • avg_freqresolution_concat: final frequency resolution of the data after averaging and concatenation (default: 97.64kHz, which translates to 2 channels per subband)
    • num_SBs_per_group: make concatenated measurement-sets with that many subbands, choose a high number if running LBA (default: 10)
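
    The channel counts quoted with the frequency resolutions follow from the subband width. The short calculation below assumes the LOFAR 200 MHz clock, for which a subband is 200 MHz / 1024 = 195.3125 kHz wide; the function name is illustrative.

```python
SUBBAND_WIDTH_KHZ = 195.3125  # assumption: LOFAR 200 MHz clock (200 MHz / 1024)


def channels_per_subband(freq_resolution_khz):
    """Number of output channels per subband for a given channel width."""
    return round(SUBBAND_WIDTH_KHZ / freq_resolution_khz)


print(channels_per_subband(48.82))  # 4
print(channels_per_subband(97.64))  # 2
```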

    Concatenation of the target data

    • num_SBs_per_group: make concatenated measurement-sets with that many subbands (default: 10)
    • reference_stationSB: station-subband number to use as reference for grouping (default: None -> use lowest frequency input data as reference)
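
    The effect of num_SBs_per_group can be pictured as splitting the frequency-sorted subbands into consecutive groups, starting at the lowest frequency. The sketch below is a simplification under that assumption: it does not model reference_stationSB or any handling of missing subbands, and the function name and the example frequencies are hypothetical.

```python
def group_subbands(frequencies_hz, num_sbs_per_group=10):
    """Split subband frequencies into consecutive groups of fixed size."""
    ordered = sorted(frequencies_hz)
    return [
        ordered[i:i + num_sbs_per_group]
        for i in range(0, len(ordered), num_sbs_per_group)
    ]


# 25 hypothetical subbands, 195.3125 kHz apart, starting at 120 MHz.
subbands = [120e6 + n * 195312.5 for n in range(25)]
groups = group_subbands(subbands)
print([len(g) for g in groups])  # [10, 10, 5]
```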

    RMextract settings

    • ionex_server: URL of the IONEX server (default: "http://ftp.aiub.unibe.ch/CODE/")
    • ionex_prefix: the prefix of the IONEX files (default: CODG)
    • proxy_server: specify URL or IP of proxy server if needed
    • proxy_port: port of proxy server if needed
    • proxy_user: user name of proxy server if needed
    • proxy_pass: password of proxy server if needed
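
    To see how ionex_prefix relates to an observation date, the sketch below builds a daily IONEX file name in the classic short naming scheme (prefix, day of year, a 0 for the daily file, two-digit year, and the letter I). This scheme is an assumption based on the IONEX naming convention and is not taken from RMextract itself. Servers may add a compression suffix or a per-year directory, and newer products may use different names.

```python
from datetime import date


def ionex_filename(observation_date, prefix="CODG"):
    """Daily IONEX file name in the classic short naming scheme (assumed)."""
    day_of_year = observation_date.timetuple().tm_yday
    two_digit_year = observation_date.year % 100
    return f"{prefix}{day_of_year:03d}0.{two_digit_year:02d}I"


print(ionex_filename(date(2021, 7, 30)))  # CODG2110.21I
```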

    In case of LBA observations, you might also want to enable demixing (demix: true). If your LBA data has not been demixed before, you may still want to keep the A-Team-clipping.