diff --git a/applications/lofar2/doc/prestudy/station2_sdp_hdl_components.txt b/applications/lofar2/doc/prestudy/station2_sdp_hdl_components.txt
index 0b1cc193b482232c63a9efc238e1546e766aef8a..d0569123d721f0678fb6a4170d0df710c3edf2e1 100644
--- a/applications/lofar2/doc/prestudy/station2_sdp_hdl_components.txt
+++ b/applications/lofar2/doc/prestudy/station2_sdp_hdl_components.txt
@@ -1,7 +1,7 @@
 *******************************************************************************
-* DP encoder
+* DP encoder / decoder
 *******************************************************************************
-- dp_packet_enc
+- dp_packet_enc / dp_packet_dec
   . Current dp_packet_enc encodes sosi fields into: CHAN (32b), sync & BSN (64b), DATA (>= 1 b), ERR (32b).
   . Use new dp_packet_enc with CRC to mitigate false positive ETH CRC --> dp_packet_enc_crc:
       CHAN (32b), Sync & BSN (64b), DATA (>= 1 b), ERR (32b), CRC (32b)
@@ -36,157 +36,193 @@ Design decisions:
   it does not have to be robust against corrupted packets (wrong contents, wrong length).
 
 
+*******************************************************************************
+* dp_validate_crc
+* - Validate (declare valid) CRC and store-and-forward or store-and-discard this packet
+*******************************************************************************
+
+The Ethernet/DP packet has two CRC checksums in the packet tail:
+
+- the Ethernet CRC is calculated by the 1GbE MAC
+- the DP packet CRC is calculated by the dp_packet_dec.
+
+The packet needs to be stored before it can be forwarded or discarded, because the entire packet is needed
+to calculate and verify the CRC. The CRC results are reported via the sosi.err field at the end of packet
+(eop). The dp_validate_crc forwards the packet when the CRC is OK and discards the packet when the CRC is
+wrong.
+
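Review note: a minimal Python sketch of the store-and-forward / store-and-discard decision described above. It is a behavioural model only. The use of zlib's CRC-32, the 4-byte big-endian tail, and the names append_crc and validate_crc are assumptions for illustration; the patch does not specify the DP CRC polynomial or the byte layout.

```python
import zlib


def append_crc(payload: bytes) -> bytes:
    """Append a 32-bit CRC tail (hypothetical stand-in for the encoder side)."""
    return payload + zlib.crc32(payload).to_bytes(4, "big")


def validate_crc(packet: bytes):
    """Store the whole packet first, then forward the payload or discard it.

    Returns the payload when the CRC tail matches, else None (discarded).
    """
    if len(packet) < 4:
        return None  # too short to even hold a CRC tail
    payload, tail = packet[:-4], packet[-4:]
    crc_ok = zlib.crc32(payload) == int.from_bytes(tail, "big")
    return payload if crc_ok else None


good = append_crc(b"block data")
bad = bytes([good[0] ^ 0x01]) + good[1:]  # single bit error during transport

forwarded = validate_crc(good)
discarded = validate_crc(bad)
```

The decision can only be taken after the last byte has arrived, which is why one block period of store-and-forward latency is added per node.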
+
 
 *******************************************************************************
-* BSN aligner 
+* dp_validate_bsn_at_sync
+* - Validate (declare valid) BSN at Rx sync and pass on or discard packets until next Rx sync
 *******************************************************************************
 
-Usage schemes:
-. General data transports:
-  - N = 2 input aligner with 1 local data and 1   remote data
-  - N > 2 input aligner with 1 local data and N-1 remote data
-  - N >=2 input aligner with only N remote data (not used on ring, but was used in APERTIF)
-         
-. Ring data transports:      
-  - beamlets on ring: l --> r+l --> r+l --> ... --> r+l
-    . on each node align two inputs: l,r
-    . output filler data if remote got lost, to preserve nominal output rate to CEP
-    
-  - crosslets on ring:  rrrrrrrr,l --> rrrrrrrr,l --> ... --> rrrrrrrr,l
-    . on each node separately align N/2 pairs of inputs l,r, have one pair per XC cell
-    or
-    . on each node first align all inputs l,N/2*r, and then split into N/2 pairs of l,r to have one pair per XC cell
-    . discard output data if remote got lost, to count number of active blocks per integration sync interval
-      or
-      output filler data if remote got lost, and use zero to not disturb the intergation and count unflagged blocks
-      to know the number of active blocks per integration sync interval
-    
-  - subbands on ring: l, rl, rrl, rrrl, ..., rrrrrrrrrrrrrrrl
-    . on final node align all l,(N-1)*r inputs
-    . output filler data if remote got lost, to preserve nominal output rate to AARTFAAC
-    
-  - transient buffer readout: l, r, r, ..., r
-    . no align, readout from one node at a time
-    
+The DP packet has a sync and BSN field in the packet header. This field is at the start of the packet (sop),
+so it can be verified while the packet arrives. The Rx BSN at the Rx sync in the received packet should be
+equal to the local Station BSN at the local sync. If the Rx sync BSN and the local sync BSN are:
+
+- not equal, then discard all subsequent blocks until the Rx sync BSN is equal again,
+- equal, then pass on all subsequent blocks until the next Rx sync
+
+The assumption is that if the BSN at sync is wrong, then the block processing at this node or at the remote
+node has not been started properly, so then subsequent blocks will have a wrong BSN too. If the BSN at
+sync is OK, all nodes have been started properly and then the BSN for all subsequent blocks in the sync
+interval will be correct too. The sync and BSN values are not corrupted, because they are determined inside
+the local and remote FPGA (so error free, because the logic is error free) and the remote BSN is
+transported using a CRC (so effectively error free, because the CRC detects practically all transport errors).
+
+The initial state to discard or pass on blocks is a don't care, because the assumption is that the block
+processing was (re)started properly on all nodes. At power up, choose to initially pass on packets.
+If the packet with the Rx sync and BSN is lost, then the last decision to discard or pass on packets
+remains, because it is still valid.
+
+The dp_validate_bsn_at_sync function verifies the entire 64 bit sync and BSN in an Rx packet. For local and
+remote inputs the BSN can only differ by a limited number that depends on the latency differences between the
+different inputs. Therefore if the input Rx BSN at sync matches the local Station BSN, then for the
+BSN aligner that aligns the inputs based on the BSN it is sufficient to only use a fraction of the BSN.
+Using the fraction of the BSN as index is sufficient to distinguish between blocks within the maximum BSN
+latency. If the fraction N is a power of 2, then only the log2(N) LSbits of the BSN need to be compared
+to ensure that all inputs have the same 64 bit sync and BSN.
+
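Review note: a minimal Python sketch of the pass-on / discard state and of the BSN index taken from the LSbits, as described above. The class name BsnAtSyncValidator, its method signature and the helper bsn_index are hypothetical; only the behaviour (decide at Rx sync, hold the decision until the next Rx sync, pass on at power up) is taken from the text.

```python
class BsnAtSyncValidator:
    """Behavioural model: decide at each Rx sync, hold the decision in between."""

    def __init__(self):
        self.pass_on = True  # power-up choice: initially pass on packets

    def rx_packet(self, rx_sync: bool, rx_bsn: int, local_sync_bsn: int) -> bool:
        """Return True if this packet is passed on, False if it is discarded."""
        if rx_sync:
            # Only the packet that carries the Rx sync updates the decision.
            self.pass_on = rx_bsn == local_sync_bsn
        return self.pass_on


def bsn_index(bsn: int, n: int) -> int:
    """Index within the BSN latency window, using the log2(n) LSbits of the BSN."""
    assert n > 0 and n & (n - 1) == 0, "n must be a power of 2"
    return bsn & (n - 1)


v = BsnAtSyncValidator()
decisions = [
    v.rx_packet(False, 5, 100),    # before first sync: pass on (power-up state)
    v.rx_packet(True, 99, 100),    # sync with wrong BSN: discard
    v.rx_packet(False, 100, 100),  # no sync: last decision remains, discard
    v.rx_packet(True, 200, 200),   # sync with equal BSN: pass on again
]
```

If the packet carrying the Rx sync is lost, rx_packet is simply never called with rx_sync=True for that interval, so the previous decision remains in force.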
+
+
+*******************************************************************************
+* BSN aligner 
+*******************************************************************************
 
 Assumptions:
-. Per input the blocks arrive in order
-. Only allow correct Rx packets to enter the FPGA processing, therefore discard Rx packets per input if:
-  - if CRC is wrong,
-    . this requires using store and forward per node, because the enitre packet is needed to calculate the CRC
-  - if the BSN at sync in the Rx packet is not equal to the local Station BSN at sync.
-    . if the sync BSN is not equal then discard all subsequent blocks until the sync BSN is equal again.
-      - sync monitor component: checks that received Station BSN ar sync equals local BSN at sync
-    . if the sync BSN is equal, then:
-      - assume that the BSN for all subsequent blocks will be correct too. This
-        is a save assumption, because the BSN is determined inside the remote FPGA (so error free, because the
-        logic is error free) and transported using a CRC (so error free, because the CRC detects errors.)
-      - BSN based Wr pointer and the align_sync are sufficient in BSN aligner to ensure that all inputs are 
-        aligned
-- Lost packets:
-  . should not cause subsequent packets to get lost too
-    - in APERTIF the BSN aligner does loose more packets due to flush and realign
-    - in APERTIF the sync_checker looses entire sync intervals to ensure filled sync intervals
-    - therefore recovery at the 1 ssync takes too long, because an incidental loss of packet should not
-      cause an entire second of data to get lost.
-  . must not create a burst of filler packets
-- BSN latency
+- Per input the Rx packets arrive in order
+  . a packet contains one or more blocks, on the ring every packet contains one block
+- Only allow correct blocks to enter the FPGA processing
+  . the block validation is based on Rx packet CRC and BSN at sync
+- Usage schemes:
+  . N = 2 inputs aligner with 1 local data and 1   remote data
+  . N > 2 inputs aligner with 1 local data and N-1 remote data
+  . N >=2 inputs aligner with 0 local data and N   remote data (not used on ring, but was used in APERTIF)
+- The local sync and BSN sources on all FPGAs are synchronous, to avoid additional BSN latency between inputs.
+- Static input enable or disable via M&C
+  - it is possible to enable or disable any combination of inputs
+  - if all inputs are disabled then the output stops.
+  - if the input enable or disable setting is changed, then the BSN aligner restarts trying to achieve alignment.
+  - for the ring with 1 local and 1 remote input the static input enable/disable supports the align modes:
+    . disabled,
+    . local only,
+    . remote only,
+    . local and remote
+- Input latency:
+  . the input latencies are fixed by design, so inputs have a maximum BSN latency g_bsn_latency that is fixed
+    and that does not have to be programmable via M&C.
   . If all hops on the ring are active then the total latency will be (N-1)*(d + 1) where d is the transport
     latency of each hop and 1 is due to store-and-forward at each node. Typically the total transport latency
-    on the ring is (N-1)*d < 1, so less than one block period.
-  . The total ring latency is covered by g_bsn_latency > (N-1)*(d + 1). 
-  . programmable g_bsn_latency is not needed
-  . the input latencies are fixed by design, so inputs cannot have more letency than g_bsn_latency and the 
-    Rx packet CRC and BSN at sync checking ensure that only good packets enter the FPGA.
-- If one input packet is lost then it is acceptable that the corresponding output is lost
-  . there should be an option to have filler data such that the other inputs can still be output
-- Only output correct packets, so do not allow packets with corrupted data or wrong BSN to slip through the BSN aligner.
-- If often packets on one input get lost then it is not acceptable that the output is lost.
-  . Support static and dynamic input enable/disable control, or
-  . support filler data on streams that lost input packets
-- If all inputs of BSN aligner stop, then the output stops.
-
+    on the ring is (N-1)*d < 1, so less than one block period. The total ring latency is covered by
+    g_bsn_latency > (N-1)*(d + 1). 
+- Lost input blocks:
+  . accept that the corresponding output is lost too, or output a filler block to replace the lost block
+  . should not cause subsequent blocks to get lost too
+  . must not induce a burst of output blocks due to output catch-up after late detection of a lost block
+  . If blocks on one input often get lost, then it is not acceptable that the output is lost.
+    - insert filler blocks to replace the lost input blocks, or
+    - support dynamic input enable/disable control
+- Only output correct blocks, either with the received input block or with a flagged filler block
+- The output passes on the sync and therefore it does not have to pass on the BSN
+- The output should support flow control to provide output throttling
+- Stopped input:
+  . If all inputs of the BSN aligner stop, then the output stops.
+  . If after some block periods (e.g. g_bsn_latency) there is no more block pending at any input, then the
+    BSN aligner should restart trying to achieve alignment.
+
+Notes:
+- In LOFAR and APERTIF the BSN aligner loses more blocks due to input flush and realign
+- A BSN aligner can align at any BSN. Using a sync aligner that can only align at the sync would cause
+  losing an entire sync interval to realign, which is not acceptable
+- In APERTIF the sync_checker loses entire sync intervals to ensure filled sync intervals
+- In LOFAR and APERTIF the output is driven by the remote input to add minimal latency, however this
+  results in losing more packets and having to realign if input packets get lost.
+- In dp_bsn_align the artificial local data stream was used to ensure that the output block size was correct.
+  By using extra CRC checking (ETH CRC and DP CRC) and store and forward in Rx it is already certain that only
+  correct input packets arrive at the BSN aligner input. Therefore an artificial local data stream is not needed.
 
 
 Design options:
-- Lost packets and filler packets:
-  . Detected by
+- Lost packet detection
+  . Rely on next received packet:
+    - check per input that the align BSN increments +1 within the align_sync interval
+    - requires a timeout or overflow detection on other inputs to detect a burst of lost packets
+    - after a burst of lost packets, typically the output cannot catch up anymore, so then the BSN aligner
+      needs to flush its input buffer and restart.
+  . Per packet using a local output block pacer.
+    The local output block pacer is offset by at least g_bsn_latency relative to the local BSN source, to
+    ensure that all inputs should have a new block pending for output. This is possible, because the input
+    latencies are static and within a fixed range:
     - in circular buffer the Wr flag for the lost block remains unset
     - in FIFO by no pending input or pending input with higher BSN then current output BSN
-    - checking per input that the align bsn increments +1 within the align_sync interval
-    - timeout checking per input using a local slot pacer at the block rate
-    - realign after FIFO overflow on other inputs, due to that output waits for lost packet
-    - can we do without timeouts? Yes because the input latencies are static and within a fixed range
-  . replace by a filler block or let all streams to drop this block
-    - for BF drop all inputs, because beam is affected
-    - for XC pass on, because visibilities of active inputs are still oke.
-  . flagging
-    - filler data blocks can be flagged using a sosi.channel bit as flag
-    - filler data can be undefined, forced to zero, noise, most negative integer in the data, for compled e.g. use real as
-      flag and imag as cause identifier.
-      
-      
-- Input FIFO
-  . packets are stored in arrival order
-  . The BSN of the input packets must differ during a g_bsn_latency interval, but does not have to be incrementing or
-    continuous, because the alignment is based on the BSN being equal
-  . the FIFO must also pass on the 1 s sync
-  . Flushing:
-    - flush per packet or flush until empty?
-    - flush per input per input or flush all inputs?
-    - flush by reading, or by reset or by moving a Rd pointer
-    - Use packet count instead of FIFO full indicator
-    - can we do without flushing the FIFO? Not if we need to realign.
-    - If multiple packets on a remote input get lost, then the other inputs fill up if there is no timeout. Flush
-      all inputs empty if one of them got filled up. Flush empty to avoid that at some moment all inputs may have
-      multiple packets pending in the FIFO, that will then be output in a burst. The pending packets that
-      corresponded to the lost packet will need to be discarded anyway, because there is no time to output them still.
-    - also useful to know BSNs at FIFO inputs? --> No, because FIFO packet count can be used to detect pending FIFO overflow.
-  . Keep FIFOs outside or inside BSN aligner component.
-    - the input of the FIFO is needed to be able to maintain a count of the number of packets in the FIFO, which is
-      relevant for the align timeout. The input eop increments the count and the output eop decrements the count.
-    - inputs with a large latency could use a smaller FIFO, this is easier to control with external FIFOs
-    - if the BSN aligner relies on FIFO input information, then it is better to have the FIFOs inside.
-   
+  
+  ==> Design decision:
+      - Use local block reference to define when to detect lost packets, because one lost block should not
+        cause subsequent blocks to get lost too.
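Review note: a minimal Python sketch of lost block detection with a local output block pacer, as chosen in the design decision above. Time is expressed in block periods and the pacer tick for a BSN is placed g_bsn_latency periods after the nominal BSN time; the function name, the dict of arrival times and this exact tick placement are assumptions for illustration.

```python
def lost_blocks_at_pacer(arrival_time, bsn_range, g_bsn_latency):
    """Return the BSNs that are not pending on one input at their pacer tick.

    arrival_time maps BSN -> arrival time in block periods; a lost block is
    simply absent from the dict.
    """
    lost = []
    for bsn in bsn_range:
        tick = bsn + g_bsn_latency   # local pacer tick for this BSN
        t = arrival_time.get(bsn)    # None if the block never arrived
        if t is None or t > tick:
            lost.append(bsn)
    return lost


# Block 2 is lost in transport, every other block arrives 0.5 period late.
arrivals = {0: 0.5, 1: 1.5, 3: 3.5, 4: 4.5}
lost = lost_blocks_at_pacer(arrivals, range(5), g_bsn_latency=2)
```

Because each BSN is judged at its own tick, the loss of block 2 has no effect on blocks 3 and 4, and no timeout on the other inputs is needed.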
 
-- Input circular buffer
-  . can handle data arriving out of order, but this is not needed within SDP
-  . The buffer memory size is g_bsn_latency * g_nof_inputs slots that can store a packet.
-    - the maximum latency between any two inputs must be < g_bsn_latency number of data blocks
-    - For each slot there is a Wr flag that needs to be maintained.
-    - For each slot there is also a sync flag to pass on the 1 s sync
-  . Can handle out-of-order data, because it uses the BSN as an index. However on the ring in SDP all data will be in order.
-  . The circular buffer could be used as a FIFO with internal access and an incrementing Wr pointer. However it
-    seems better to use it with a Wr pointer that is derived from the BSN.
-  . the BSN must be continuous BSN and incrementing, because then the remainder of the BSN divided by the buffer size can
-    be used as Wr pointer. 
-  . The buffer size is preferrably a power of two, but can be any size (to save memory):
-    - Using a buffer size that is a power of 2 avoids an integer divsion of the BSN, because it can then use the
-      corrsponding LSbits of the BSN as Wr pointer.
-    - Modulo 2**n - 1 can be calculated efficiently for binary numbers, by adding the n-bit digit parts. Similar as
-      mpdulo 3 (= (10-1)/3) can be calculated by adding the decimal digits.
-    - Modulo n for constnat n can be calculated efficiently suing multiplication by 1/n. The 1/n fraction must be 
-      represented with sufficient accuracy to determine the remainder.
-  . The slots in the circular buffer have a Wr flag that is set when the slot is written with an Rx packet and cleared
-    when the slot is read for output.
-  . Flushing:
-    - Clearing a Wr flag or all Wr flags is much faster than flush reading a FIFO.
-  . The Rd pointer increments at every output block period.
-  . The Rd pointer increments after every output slot.
-  . The write pointer always needs to be ahead of the Rd pointer. The minimum distance between the Wr and Rd pointer
-    is g_bsn_latency. The size of the circular buffer is the same for all inputs and must be > g_bsn_latency (for wr)
-    + 1 (for rd). The circular buffer read can occur when the write pointer exceeds rd pointer + g_bsn_latency. 
-  . the circular buffer is part of the BSN aligner component
-  . On CEP the beamlet data is written into a circular buffer based on the time stamp. A flag indicates whether data in the
-    circular buffer is valid. The size of the circular buffer is in the order of hundreds of ms to cover the distance latency 
-    of the international stations. An array of tupples lists the lenght of continuous blocks in the circular buffer, and 
-    therefore also to the gaps. A local timer determines when the circular buffer is read. The local timer has ms accuracy
-    compared to UTC, so the size of the circular buffer dominated by the largest latencies. The channel filterbank in CEP
-    also flags the initial channel data that is disturbed after a gap.
 
+- Output driven by remote input block arrival or by local block reference
+  . in case of 1 remote input, the remote input does not need a FIFO if it drives the output
+  . in case of > 1 remote input, then the remote inputs also require FIFOs
+  . using local input increases the latency from remote input to output, because the output is fixed to the T_sub grid
+  . using local input at T_sub grid avoids bursts, this can also be handled using flow control
+  . with local input driving the output the assumption is that if the local input has M packets, then all remote
+    inputs will have delivered at least one frame, so there should be a sop pending from all.
+  . if there is no local input, then an artificial local input can be derived when BSN is equal on all enabled remote inputs.
+  . if remote input is lost, then entire output is lost if remote drives output, because there is not enough spare time
+    to still output the other input packets
+  . For remote driven output a slot can be output when for all active inputs there is a block. However if one or
+    a series of packets got lost, then the other inputs will overflow. Hence remote driven output needs a timeout
+    to keep the output running, so a form of local driven output. Hence to avoid additional packet loss on other 
+    inputs or of subsequent packets in time it is necessary to have a local driven output. Therefore using a remote
+    driven output is not feasible. 
+
+  ==> Design decision:
+      - Use local block reference to define when aligned blocks should be output, because one lost block should
+        not cause subsequent blocks to get lost too, which is more important than adding minimal latency and
+        potentially saving BSN aligner input buffer memory.
+
+
+- Generation of local block reference to define the output pace:
+  . During initial input alignment it is important that all active inputs are indeed active, because together they
+    determine the latency difference between inputs. After initial alignment the data output can continue at a
+    fixed rate, driven by a local block reference:
+    - The local input or the remote input with the least latency could be used as local output block reference,
+      because (N-1)*d << 1. This requires having a local input or detecting the closest remote input.
+    - Alternatively a dedicated local block reference with a certain time offset can be started
+      after achieving input alignment. The time offset sets a margin that ensures that at subsequent block
+      reference pulses all inputs will have a new block pending if the block is not lost.
+      
+  ==> Design decision:
+      - Generate local block reference when initial BSN alignment has been achieved and start it with a certain
+        fixed offset.
+        
+
+- Filler data insertion      
+  . Whether to drop a block or to replace it by a filler block depends on the application
+    - for BF drop all inputs, because beam is affected
+    - for XC insert filler data, because visibilities of active inputs are still OK.
+    - for the output via the Network insert filler data to keep the output at the nominal rate, such that
+      the destination can distinguish between data blocks that got lost inside Station and packet loss on
+      the Network.
+  . Filler blocks can be flagged using a sosi.channel bit as flag
+  . Filler data can be:
+    - undefined,
+    - forced to zero,
+    - random with similar noise level,
+    - most negative integer in real data,
+    - most negative integer in complex real part and imag part (or use imag part as cause identifier).
+
+  ==> Design decision:
+      - Replace lost blocks by filler blocks, to preserve the nominal output rate
+      - Flag the filler block via a sosi.channel bit, to distinguish the block
+      - Force the filler data to some constant dependent on a generic, to support transparent operation
+        in e.g. an adder where x + 0 = x or a multiplier where x * 1 = x, or to support flagging per data
+        value using most negative integer value.
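Review note: a minimal Python sketch of the filler block decision above. The Block dataclass, the filler attribute (standing in for the sosi.channel flag bit) and the parameter g_filler_value (standing in for the generic) are hypothetical names for illustration.

```python
from dataclasses import dataclass


@dataclass
class Block:
    bsn: int
    data: list
    filler: bool = False  # stands in for the sosi.channel flag bit


def fill_lost(received, bsn, block_size, g_filler_value=0):
    """Return the received block for this BSN, or a flagged filler block."""
    blk = received.get(bsn)
    if blk is not None:
        return blk
    return Block(bsn, [g_filler_value] * block_size, filler=True)


def sum_inputs(blocks):
    """Element-wise adder over aligned blocks; filler value 0 is transparent."""
    return [sum(vals) for vals in zip(*(b.data for b in blocks))]


local_rx = {7: Block(7, [1, 2, 3])}
remote_rx = {}  # the remote block with BSN 7 got lost

aligned = [fill_lost(rx, 7, block_size=3) for rx in (local_rx, remote_rx)]
total = sum_inputs(aligned)
```

With g_filler_value = 0 the adder output equals the local data, so the output rate is preserved and the flag still tells downstream which input was missing.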
 
-- static input enable/disable via M&C
-  - align modes local only, remote only, combined, disabled can be achieved via input enable/disable M&C
+    
   
 . dynamic input enable/disable in case of lost packets
   - Scheme:
@@ -202,52 +238,38 @@ Design options:
       - Define number of sync intervals for dynamic input control as a generic
       - preferred because it is less active and easier to monitor
   - Is dynamic input enable/disable necessary if a lost packet does not affect next packets?
-    . If lost data is replaced by filler data, then only static input enable/disable is necessary. During initial input
-      alignment it is important that all active inputs are indeed active, because together they determine the latency
-      difference between inputs. After initial alignment the data output can continue at at a fixed rate, driven by a
-      local reference. The local input could be used as local output block reference, because (N-1)*d << 1. 
-      Alternatively another block period pacer with another time offset can be started after achieving
-      input alignment. If an input becomes inactive it will be flagged and the output can still continue.
+    . If lost data is replaced by filler data, then only static input enable/disable is necessary, because if an input
+      becomes inactive it will be flagged and the output can still continue.
     . If lost data causes all inputs to be discarded, then dynamic input enable/disable may be useful to avoid that a
       single input causes all output to stop.
+      
+  ==> Design decision:
+      - It is not necessary to support dynamic input enable/disable, because lost blocks are replaced by filler blocks.
 
-- BSN output driven by local input or by remote input
-  . in case of 1 remote input, the remote input does not need a FIFO if it drives the output
-  . in case of > 1 remote input, then the remote inputs also requires FIFOs
-  . using local input increases the latency from remote input to output, because fixed to T_sub grid
-  . using local input at T_sub grid avoids bursts, this can also be handled using flow control
-  . with local input driving the output the assumption is that if the local input has M packets, then all remote
-    inputs will have delivered at least one frame, so there should be a sop pending from all.
-  . if there is no local input, then an artifical local input can be derived when BSN is equal on all enabled remote inputs.
-  . if remote input is lost, then entire output is lost if remote drives output, because there is not enough spare time
-    to still output the other input packets
-  . For remote driven output a slot can be output when for all active inputs there is a block. However if one or
-    a series of packets got lost, then the other inputs will overflow. Hence remote driven output needs a timeout
-    to keep the output running, so a form of local driven output. Hence to avoid additional packet loss on other 
-    inputs or of subsequent packets in time it is necessary to have a local driven output. Therefore using a remote
-    driven output is not feasible. 
-                       
-
-. alignment:    
-  - use BSN incrementing or only use that the input BSN (and input sync) must be equal
-  - use local data stream as reference stream or treat all streams as equal
-    . the local data stream is considered perfect, because it can not have packet loss.
-  - use an artifical local data stream as reference, fixed artifical stream or derived from the combination of input streams.
-    . in dp_bsn_align the artifical local data stream was used to ensure that the output block size was correct,
-      by using extra CRC checking (ETH CRC and DP CRC) and store and forward in Rx it is already certain that only
-      correct input packets arrive at the BSN aligner input. Therefore an artifical local data stream is not needed.
-  - it is not necessary to be able to achieve alignment near the sync. Once alignment is achieved then the near the sync
-    the alignment is kept, because the BSN from all enabled inputs are still equal, even though in time they are 
-    discontinuous at the sync due to that the local BSN then restarts at 0.
-  - Enabled inputs that are aligned all have the same BSN, one of them can be used as output BSN.
-    . If the local data stream can be used as reference stream, then that can be used as output BSN.
-    . Else if all inputs are treated equal, then which input to select is dynamic. Therefore then use any one or all of them
-      so output bsn = or(input BSNs) provided that the disabled inputs have BSN = 0.
 
-  
+. Treat all inputs equal or use local input stream as reference to achieve input alignment:
+  - using the local data stream as reference stream can benefit from the fact that the local data stream has no
+    packet loss, because the logic inside the FPGA is error free.
+  - treating all streams equal is more general and also works when static input enable/disable disables the local
+    input. 
+    
+  ==> Design decision:
+      - Treat all inputs equal. Do not make use of the fact that the ring has a local input. In this way the BSN 
+        aligner can also work when there are only remote inputs.
+
 
+. Define align_sync  
+  -  ...
+  
+  ==> Design decision:
+      - Define align_sync to start initial alignment and to avoid the need for an input buffer that is twice
+        as large for a given BSN latency
+      
+      
+      
+      
 . Initial alignment declaration can be based on:
-  - All active inputs have data pending in the same slot or at the FIFO output
+  - All active inputs have data pending with the same BSN index (in the same circular buffer slot or at the FIFO output)
   - If BSN latency number of slots on all inputs got filled, then set the Rd pointer. This requires that all inputs
     start filling at the same BSN index, because then the input with the lowest latency will get filled first. The
     Rd pointer is set at the BSN index.
@@ -303,10 +325,75 @@ Design options:
     
   
 
+      
+- Input FIFO
+  . Blocks are stored in arrival order, therefore the FIFO must pass on the BSN index to be able to align the inputs
+    and to detect lost packets.
+  . The BSN index does not have to be incrementing, but it must be unique per BSN latency interval
+  . The FIFO must pass on the 1 s sync, to allow timestamp recovery from Station BSN.
+  . Flushing:
+    - flush per packet or flush until empty?
+    - flush per input or flush all inputs?
+    - flush by reading, or by reset or by moving a Rd pointer
+    - Use packet count instead of FIFO full indicator
+    - can we do without flushing the FIFO? Not if we need to realign.
+    - If multiple packets on a remote input get lost, then the other inputs fill up if there is no timeout. Flush
+      all inputs empty if one of them got filled up. Flush empty to avoid that at some moment all inputs may have
+      multiple packets pending in the FIFO, that will then be output in a burst. The pending packets that
+      corresponded to the lost packet will need to be discarded anyway, because there is no time to output them still.
+    - also useful to know BSNs at FIFO inputs? --> No, because FIFO packet count can be used to detect pending FIFO overflow.
+  . Keep FIFOs outside or inside BSN aligner component.
+    - the input of the FIFO is needed to be able to maintain a count of the number of packets in the FIFO, which is
+      relevant for the align timeout. The input eop increments the count and the output eop decrements the count.
+    - inputs with a large latency could use a smaller FIFO, this is easier to control with external FIFOs
+    - if the BSN aligner relies on FIFO input information, then it is better to have the FIFOs inside.
+   
+
+- Input circular buffer
+  . can handle data arriving out of order, but this is not needed within SDP
+  . The buffer memory size is g_bsn_latency * g_nof_inputs slots that can store a packet.
+    - the maximum latency between any two inputs must be < g_bsn_latency number of data blocks
+    - For each slot there is a Wr flag that needs to be maintained. The Wr flag can be set when the data block write
+      begins, because then the read could already start as well since Wr and Rd run at same rate.
+    - For each slot there is also a sync flag to pass on the 1 s sync
+  . Can handle out-of-order data, because it uses the BSN as an index. However on the ring in SDP all data will be in order.
+  . The circular buffer could be used as a FIFO with internal access and an incrementing Wr pointer. However it
+    seems better to use it with a Wr pointer that is derived from the BSN.
+  . the BSN must be continuous and incrementing, because then the remainder of the BSN divided by the buffer size can
+    be used as Wr pointer.
+  . The buffer size is preferably a power of two, but can be any size (to save memory):
+    - Using a buffer size that is a power of 2 avoids an integer division of the BSN, because it can then use the
+      corresponding LSbits of the BSN as Wr pointer.
+    - Modulo 2**n - 1 can be calculated efficiently for binary numbers, by adding the n-bit digit parts. Similar to how
+      modulo 3 (= (10-1)/3) can be calculated by adding the decimal digits.
+    - Modulo n for constant n can be calculated efficiently using multiplication by 1/n. The 1/n fraction must be
+      represented with sufficient accuracy to determine the remainder.
+  . The slots in the circular buffer have a Wr flag that is set when the slot is written with an Rx packet and cleared
+    when the slot is read for output.
+  . Flushing:
+    - Clearing a Wr flag or all Wr flags is much faster than flush reading a FIFO.
+  . The Rd pointer increments after every output slot, so once per output block period.
+  . The write pointer always needs to be ahead of the Rd pointer. The minimum distance between the Wr and Rd pointer
+    is g_bsn_latency. The size of the circular buffer is the same for all inputs and must be > g_bsn_latency (for wr)
+    + 1 (for rd). The circular buffer read can occur when the write pointer exceeds rd pointer + g_bsn_latency. 
+  . the circular buffer is part of the BSN aligner component
+  . On CEP the beamlet data is written into a circular buffer based on the time stamp. A flag indicates whether data in the
+    circular buffer is valid. The size of the circular buffer is in the order of hundreds of ms to cover the distance latency
+    of the international stations. An array of tuples lists the length of continuous blocks in the circular buffer, and
+    therefore also the gaps. A local timer determines when the circular buffer is read. The local timer has ms accuracy
+    compared to UTC, so the size of the circular buffer is dominated by the largest latencies. The channel filterbank in CEP
+    also flags the initial channel data that is disturbed after a gap.
+
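The three ways of deriving the Wr pointer from the BSN that are listed above can be sketched in Python as a
behavioural model. This is only an illustration, not the HDL implementation. The function names and the
parameters n_bits and bsn_w are hypothetical. The reciprocal width of 2 * bsn_w bits is an assumption that is
sufficient for bsn < 2**bsn_w and n <= 2**bsn_w, a narrower reciprocal may also work.

```python
def wr_pointer_pow2(bsn: int, n_bits: int) -> int:
    """Buffer size 2**n_bits: use the n_bits LSbits of the BSN as Wr pointer."""
    return bsn & ((1 << n_bits) - 1)


def mod_pow2_minus_1(bsn: int, n_bits: int) -> int:
    """BSN modulo (2**n_bits - 1) by repeatedly adding the n_bits wide digits."""
    m = (1 << n_bits) - 1
    while bsn > m:
        total = 0
        while bsn:
            total += bsn & m      # add the next n_bits wide digit
            bsn >>= n_bits
        bsn = total               # digit sum is congruent to bsn modulo m
    return 0 if bsn == m else bsn


def mod_const(bsn: int, n: int, bsn_w: int) -> int:
    """BSN modulo constant n using multiplication by a fixed point 1/n.

    Assumes 0 <= bsn < 2**bsn_w and 1 <= n <= 2**bsn_w (hypothetical limits).
    """
    shift = 2 * bsn_w                  # assumed number of fraction bits of 1/n
    recip = (1 << shift) // n + 1      # 1/n rounded up, a constant for fixed n
    quotient = (bsn * recip) >> shift  # equals bsn // n within the assumed range
    return bsn - quotient * n
```

The power of two variant needs no arithmetic at all, which is why that buffer size is preferred when the memory
is available.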
+
+
+
+
 . Circular buffer state machine
-    Receive and monitor input
-    Derive align_sync and Wr pointer from input BSN
-    Write the input at the slot indexed by the Wr pointer and set the Wr flag for that slot.
+    all:
+      Receive and monitor input
+      Derive align_sync and Wr pointer from input BSN
+      Write the input at the slot indexed by the Wr pointer and set the Wr flag for that slot.
     s_xoff:
       Accept static input enable/disable control
       Clear all Wr flags of the slots to initially align or to realign the inputs.
@@ -315,13 +402,14 @@ Design options:
     s_align:
       If input control event --> s_xoff
       If for all active inputs the Wr flag is set in slot 0 and slot 0 contains the align_sync then
-        restart a periodic slot pulse to set the pace for outputting the slots. An offset of the pace period
-        is used to ensure that in subsequent block periods --> s_sop
+        restart a periodic slot pulse to set the pace for outputting the slots. An offset of the slot period
+        is used to ensure that in subsequent block periods all inputs will have a pending block --> s_sop
     s_sop
-      if input control event --> s_xoff
+      If input control event --> s_xoff
       If slot pulse --> s_output
     s_output:
-      output one block, clear Wr flag of slot and increment Rd pointer --> s_sop
+      If all Wr flags are unset (empty buffer) --> s_xoff
+      else output one block, clear Wr flag of slot and increment Rd pointer --> s_sop
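The state transitions listed above can be sketched as a small behavioural model in Python. This is a minimal
sketch under assumptions, not the HDL design. The class and method names are hypothetical, all inputs are
treated as active, the Wr pointer is simply bsn modulo the number of slots, the Rd pointer restarts at slot 0
when alignment is reached, and the transition from s_xoff to s_align is assumed to be unconditional because
that part of the state machine is not shown here.

```python
from enum import Enum, auto


class State(Enum):
    XOFF = auto()
    ALIGN = auto()
    SOP = auto()
    OUTPUT = auto()


class AlignerModel:
    def __init__(self, nof_inputs: int, nof_slots: int):
        self.nof_inputs = nof_inputs
        self.nof_slots = nof_slots
        self.state = State.XOFF
        self.rd_ptr = 0
        self.outputs = []          # list of (slot, [Wr flag per input])
        self._clear()

    def _clear(self):
        self.wr = [[False] * self.nof_slots for _ in range(self.nof_inputs)]
        self.sync = [[False] * self.nof_slots for _ in range(self.nof_inputs)]

    def write(self, inp: int, bsn: int, sync: bool = False):
        """All states: derive Wr pointer from BSN, write slot, set Wr flag."""
        slot = bsn % self.nof_slots
        self.wr[inp][slot] = True
        self.sync[inp][slot] = sync

    def step(self, control_event: bool = False, slot_pulse: bool = False):
        if self.state is State.XOFF:
            self._clear()                      # clear all Wr flags to (re)align
            self.state = State.ALIGN           # assumption: unconditional
        elif self.state is State.ALIGN:
            if control_event:
                self.state = State.XOFF
            elif all(self.wr[i][0] and self.sync[i][0]
                     for i in range(self.nof_inputs)):
                self.rd_ptr = 0                # assumption: output starts at slot 0
                self.state = State.SOP
        elif self.state is State.SOP:
            if control_event:
                self.state = State.XOFF
            elif slot_pulse:
                self.state = State.OUTPUT
        elif self.state is State.OUTPUT:
            if not any(any(flags) for flags in self.wr):
                self.state = State.XOFF        # empty buffer
            else:
                slot = self.rd_ptr
                self.outputs.append((slot, [self.wr[i][slot]
                                            for i in range(self.nof_inputs)]))
                for i in range(self.nof_inputs):
                    self.wr[i][slot] = False   # clear Wr flag of the slot
                    self.sync[i][slot] = False
                self.rd_ptr = (self.rd_ptr + 1) % self.nof_slots
                self.state = State.SOP
```

An input that has its Wr flag unset in the output slot shows up as False in the output record, which is where
filler data would be inserted.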
       
              
     
@@ -347,6 +435,7 @@ Design options:
   * more packets get lost, one input stops --> g_sop_timeout in s_align --> flush all inputs in s_xoff
   * one packet gets lost, next input arrives within g_sop_timeout --> bsn in range, flush one block from all other inputs
 
+
 . sync aligner instead of BSN aligner
   - Using the sosi.sync one packet lost causes whole interval lost, this is too much impact.
   - Use as much BSN range as necessary. At the end of the range the limited range BSN will wrap. This will cause
@@ -542,18 +631,18 @@ Design options:
   
 
 Design decisions:
-. Use FIFO instead of circular buffer memory to store input packets:
-  - avoids need for deriving address from discontinuous local BSN
-  - avoids need for memory write and Rd pointer control and memory used/not used flags
-  - avoids need for buffer size that is a power of 2 (or need for modulo address calculation based on entire BSN)
+
+. Probably either circular buffer memory or FIFOs is suitable. For the circular buffer the BSN fraction is used as slot
+  index and for the FIFO the BSN index needs to be passed on through the FIFO to compare pending inputs.
 . Support number of inputs >= 2
 . Treat all inputs equal, so no special role for a local input
   - suits more general usage
-. Use remote input to drive the output:
-  - output when last remote sop has arrived to avoid extra latency from remote to output
+. Use local reference to drive the output block rate:
+  - adds somewhat more latency than using remote input to drive the output, but is necessary to avoid extra loss in case
+    of lost packets and to support filler output
 . Support flow control
-  - to smoothen bursts
-  - to provide output throttling
+  - to smoothen bursts (only an issue with remote drive output)
+  - to provide output throttling (requires output FIFOs or data blocks that have sufficient gaps)
 . Use sosi.sync and sosi.bsn(c_bsn_align_w-1:0) to align BSN
   - using c_bsn_align_w much smaller than 32 b saves logic and thus eases timing closure
   - if all enabled input BSN are equal then output
@@ -574,17 +663,47 @@ Design decisions:
   - Static disabled inputs carry zero data
   - Dynamically disabled inputs carry flagged data, using most negative real as flag and imag = 0.
 
-    
+
 *******************************************************************************
-* Reorder 
+* Rx input status:
 *******************************************************************************
-    . Page swap (needed for TB)
-    . Variable output size
+
+* Existing components:
+  - RSP rad_frame_status of the previous PPS sync interval: 
+    . rx_cnt:   18 bits, number Rx frames
+    . brc   :    1 bit,  0 if no Rx frames with CRC error, 1 if >= 1 Rx frames had a CRC error
+    . sync  :    1 bit,  1 if the frame with Rx sync was detected, else 0
+    . align :    1 bit,  1 if all frames aligned OK, else 0
+  
+  - RSP rad_latency:
+    . rx_latency : 16 bit, stores an internal count value when the Rx sync is detected. The internal count
+                           restarts at the PPS sync. This measures the latency in clock cycles.
+                           
+  - APERTIF dp_bsn_monitor
+    . mon_sync_timeout        = '1' when the Rx sync did not occur within 200M cycles since last Rx sync    ~= sync
+    . mon_ready_stable        = '1' when ready was always '1' during last Rx sync interval
+    . mon_xon_stable          = '1' when xon   was always '1' during last Rx sync interval
+    . mon_bsn_at_sync         = BSN at Rx sync
+    . mon_nof_sop             = number of sop during last Rx sync interval             = rx_cnt
+    . mon_nof_err             = number of err at eop during last Rx sync interval     ~= brc
+    . mon_nof_valid           = number of valid during last Rx sync interval
+    . mon_bsn_first           = BSN at first Rx sync     --> not useful
+    . mon_bsn_first_cycle_cnt = latency at first Rx sync --> should use every Rx sync like on RSP
+  
+    ==> Reuse dp_bsn_monitor with improvements:
+    . Monitor the packets per sync interval using Rx sync. This is more precise than using the PPS sync. 
+      The Rx sync based values are only valid if mon_sync_timeout = 0.
+    . Remove mon_bsn_first and mon_bsn_first_cycle_cnt.
+    . Add mon_latency, use PPS sync like in RSP to measure the latency between PPS sync and Rx sync in
+      number of clock cycles.
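The proposed mon_latency measurement can be illustrated with a per clock cycle model in Python. This is a
sketch under assumptions: the function name and the list based interface are hypothetical, and the counter is
assumed to be idle until the first PPS sync has been seen.

```python
def measure_latency(pps_sync, rx_sync):
    """Return the count captured at each Rx sync.

    pps_sync and rx_sync are equal length sequences with one truthy or falsy
    value per clock cycle. The count restarts at 0 at every PPS sync.
    """
    latencies = []
    count = None                      # assumption: idle until first PPS sync
    for pps, rx in zip(pps_sync, rx_sync):
        if pps:
            count = 0                 # restart the internal count at PPS sync
        elif count is not None:
            count += 1
        if rx and count is not None:
            latencies.append(count)   # latency in clock cycles
    return latencies
```

For example, an Rx sync that arrives three clock cycles after the PPS sync yields a captured value of 3.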
+      
+
   
 *******************************************************************************
-* Store and forward or discard
+* Reorder 
 *******************************************************************************
-    . Discard when CRC error
+    . Page swap (needed for TB)
+    . Variable output size
 
     
 *******************************************************************************
diff --git a/applications/lofar2/doc/prestudy/station2_sdp_ring.txt b/applications/lofar2/doc/prestudy/station2_sdp_ring.txt
index b970a593f96e34e2347c3a43f4397f5f24f86e2b..b484c32f0ece90b0c77debf5c079ce623ce21ddb 100644
--- a/applications/lofar2/doc/prestudy/station2_sdp_ring.txt
+++ b/applications/lofar2/doc/prestudy/station2_sdp_ring.txt
@@ -194,6 +194,27 @@ correct CRC. With wormhole routing it was necessary to limit or extend a packet
 also packets with CRC error are passed on. With store-and-forward routing the CRC provides sufficient protection
 to ensure that only correct packets enter the application.
 
+Ring data transport schemes:
+  - beamlets on ring: l --> r+l --> r+l --> ... --> r+l
+    . on each node align two inputs: l,r
+    . output filler data if remote got lost, to preserve nominal output rate to CEP
+    
+  - crosslets on ring:  rrrrrrrr,l --> rrrrrrrr,l --> ... --> rrrrrrrr,l
+    . on each node separately align N/2 pairs of inputs l,r, have one pair per XC cell
+    or
+    . on each node first align all inputs l,N/2*r, and then split into N/2 pairs of l,r to have one pair per XC cell
+    . discard output data if remote got lost, to count number of active blocks per integration sync interval
+      or
+      output filler data if remote got lost, and use zero to not disturb the integration and count unflagged blocks
+      to know the number of active blocks per integration sync interval
+    
+  - subbands on ring: l, rl, rrl, rrrl, ..., rrrrrrrrrrrrrrrl
+    . on final node align all l,(N-1)*r inputs
+    . output filler data if remote got lost, to preserve nominal output rate to AARTFAAC
+    
+  - transient buffer readout: l, r, r, ..., r
+    . no align, readout from one node at a time
+
 
 Ring access schemes:
 
@@ -358,33 +379,6 @@ The beamformer function has the following sub functions:
       - "Beamlet data output" : Scale and output beamlet sums
 - "Beamlet statistics (BST)": Calculate BST
 
-Ring status:
-- RSP rad_frame_status of the previous PPS sync interval: 
-  . rx_cnt:   18 bits, number Rx frames
-  . brc   :    1 bit,  0 if no Rx frames with CRC error, 1 if >= 1 Rx frames had a CRC error
-  . sync  :    1 bit,  1 if the frame with Rx sync was detected, else 0
-  . align :    1 bit,  1 if all frames aligned OK, else 0
-- RSP rad_latency:
-  . rx_latency : 16 bit, stores an internal count value when the Rx sync is detected. The internal count
-                         restarts at the PPS sync. This measures the latency in clock cycles.
-- dp_bsn_monitor
-  . mon_sync_timeout        = '1' when the Rx sync did not occur within 200M cycles since last Rx sync    ~= sync
-  . mon_ready_stable        = '1' when ready was always '1' during last Rx sync interval
-  . mon_xon_stable          = '1' when xon   was always '1' during last Rx sync interval
-  . mon_bsn_at_sync         = BSN at Rx sync
-  . mon_nof_sop             = number of sop during last Rx sync interval             = rx_cnt
-  . mon_nof_err             = number of err at eop during last Rx sync interval     ~= brc
-  . mon_nof_valid           = number of valid during last Rx sync interval
-  . mon_bsn_first           = BSN at first Rx sync     --> not useful
-  . mon_bsn_first_cycle_cnt = latency at first Rx sync --> should use every Rx sync like on RSP
-  
-==> Reuse dp_bsn_monitor:
-    . Monitor the packets per sync interval using Rx sync. This is more precise then using the PPS sync. 
-      The Rx sync based values are only valid if mon_sync_timeout = 0.
-    . Remove mon_bsn_first and mon_bsn_first_cycle_cnt.
-    . Add mon_latency, use PPS sync like in RSP to measure the latency between PPS sync and Rx sync in
-      number of clock cycles.
-      
 
 *******************************************************************************
 * Subband Correlator