From f43d94c568f82ce8f441143e0497bcd52e4772f9 Mon Sep 17 00:00:00 2001
From: Jan David Mol <mol@astron.nl>
Date: Thu, 23 Nov 2017 08:45:27 +0000
Subject: [PATCH] Task #11059: Data loss: Added impact of payload, and hint to
 check VLAN IPs

---
 RTCP/Cobalt/GPUProc/doc/data-loss.txt | 20 ++++++++++++++++++++
 1 file changed, 20 insertions(+)

diff --git a/RTCP/Cobalt/GPUProc/doc/data-loss.txt b/RTCP/Cobalt/GPUProc/doc/data-loss.txt
index a6beaa88a0a..dc3ef5cfea7 100644
--- a/RTCP/Cobalt/GPUProc/doc/data-loss.txt
+++ b/RTCP/Cobalt/GPUProc/doc/data-loss.txt
@@ -22,9 +22,15 @@ Total input loss occurs when:
              ||_|\_ 1-digit bord number (0..3, and 6..9 for HBA1)
              | \___ 3-digit station number
              \_____ fixed prefix
+      * For international stations, the receiving COBALT node needs to have the right VLANs configured. If not, the packets will
+        arrive on eth5 (cbt00x-10GB04), but are dropped because the destination IP (belonging to the VLAN) does not exist.
 
       * As root on COBALT, run "tcpdump -i <interface> udp -c 100", and check if the packets are received and correctly addressed.
 
+      * For international stations, the receiving COBALT node needs to have the right VLANs configured. If not, the packets will
+        arrive on eth5 (cbt00x-10GB04), but are dropped because the destination IP (belonging to the VLAN) does not exist. If you
+        see packets addressed to VLAN IPs, run "ip addr" to check which IPs exist on the node.
+
   * The network drops the datagrams due to routing issues. Trace the station route through the network:
       https://www.astron.nl/lofarwiki/doku.php?id=wanarea:start
 
@@ -55,6 +61,20 @@ Fractional or total input loss occurs when:
         - "payload error" means the packet is marked as incomplete by the station.
         - "otherwise bad" means the packet header is corrupted.
 
+      * The impact of payload errors is significant. They arrive scattered over time, and any flagged input is smeared over hundreds of samples
+        during processing due to the FIR filter. For a 64-channel interferometry observation, we measured the following:
+
+                    % payload errors    % visibilities flagged
+                    --------------------------------------------
+                    3.5%                91%
+                    1.9%                73%
+                    1.5%                63%
+                    1.06%               44%
+                    0.22%               14%
+                    0.19%               12%
+                    0.10%                6.7%
+                    0.002%               0.13%
+
   * COBALT is not running at real time, and is thus unable to keep up with the input data. This triggers many errors, but all cases devolve into printing:
   
         >>> ERROR RTCP.Cobalt.GPUProc - [block 1] Not running at real time! Deadline was 1.23456 seconds ago
-- 
GitLab