Commit 239ed0eb authored by Eric Kooistra

Weekly update of txt files.

parent 9e89c1d5
......@@ -10,6 +10,7 @@ OPC-UA = OPC Unified Architecture
https://opcfoundation.org/
http://wiki.opcfoundation.org/index.php/UA_Overview
https://en.wikipedia.org/wiki/OPC_Unified_Architecture
https://opcfoundation.org/about/opc-technologies/opc-ua/ -- functions in OPC-UA
- Service oriented architecture (SOA) using asynchronous request/response pattern
- transport: via TCP in binary or web based
......
......@@ -18,6 +18,15 @@ The prototype FIR filter can be regarded as a window function. For a static FFT
For the oversampling PFB the FFT is calculated every M input samples, where the oversampling factor R_os = N_fft / M. Note that the oversampling PFB increases the subband sample rate f_sub_os = f_sub * R_os, but not the subband frequency grid. The subband frequency grid is n * f_sub, for any R_os, because the downsampling factor N_fft is the same for any R_os.
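The rate relations above can be checked with a short Python sketch. The values N_fft = 1024 and M = 864 (so R_os = 32/27) are assumed example values and are not taken from this section; f_adc = 200 MHz is one of the sample rates mentioned elsewhere in these notes.

```python
from fractions import Fraction

def pfb_rates(f_adc_hz, n_fft, m):
    """Return (R_os, f_sub, f_sub_os) for an FFT that is calculated every m input samples."""
    r_os = Fraction(n_fft, m)            # oversampling factor R_os = N_fft / M
    f_sub = Fraction(f_adc_hz, n_fft)    # subband frequency grid spacing, same for any R_os
    f_sub_os = f_sub * r_os              # oversampled subband sample rate
    return r_os, f_sub, f_sub_os

# Critically sampled PFB: M = N_fft, so R_os = 1 and f_sub_os = f_sub
r_os_crit, f_sub_crit, f_sub_os_crit = pfb_rates(200_000_000, 1024, 1024)

# Oversampled PFB (assumed example): M = 864 yields R_os = 32/27
r_os, f_sub, f_sub_os = pfb_rates(200_000_000, 1024, 864)
print(r_os, float(f_sub), float(f_sub_os))
```

The subband grid spacing f_sub is identical in both cases; only the subband sample rate changes with R_os.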
Fsub
- spectral inversion
- wpfb_unit_dev : g_wpfb, fft_r2_pipe
- The tb_tb_wpfb_unit_wide verifies multiple variations of wpfb_unit_dev.
- Try one instance so that the FIR coef are used for all streams.
- mms_dp_gain_serial_arr : calibrate subband weights
- select subbands before or after calibrate weights, dp_switch needs 1 cycle gap between blocks.
- SST outside wpfb
- use MM master mux to select between MM access and UDP offload, when UDP offload is enabled then do not do MM access.
*******************************************************************************
......
......@@ -6,8 +6,10 @@ The System Engineering breaks up the product into sub products until the sub pro
* = Product ADD
<-- Design decisions for product ADD
+ = Decision document for product ADD
+ = Decision document for product ADD
--> Sub products in ADD
* = Sub Product ADD
etc.
* L2 Station ADD
<-- L2 STAT Design decisions
......@@ -151,14 +153,94 @@ Designs:
References:
- Preliminary design txt files:
. station2_sdp_m_and_c.txt : Monitoring and control, Gemini protocol
. station2_sdp_timing.txt : Station BSN, timestamp definition, BSN aligner
. station2_sdp_ring.txt : ring access, packets for beamlets, crosslets, subbands, TB readout
. station2_sdp_dsp.txt : beamformer, subband correlator, transient buffer, transient detection, subband offload
. station2_sdp_icd.txt : ICD
. station2_sdp_hdl_components.txt : rework existing HDL components for LOFAR2.0
. station2_sdp_hdl_article.txt : reference article on RTL design using RL = 0, state and pipelining, AXI4 streaming
- Other:
References:
station2_sdp_srs : --> L3 SDP product
- List of L2 Station requirements (from Station ADD at PDR) that map on SDP
station2_sdp_icd :
- SE explanation of ICD
- Lists of items per ICD:
--> L2 ICD STAT-NW
--> L2 STAT-CEP
--> L3 SC-SDP
args_next_steps
station2_opc_ua : --> L4 SDP Translator product (almost DONE)
- OPC-UA standard
- Architecture of SDP Translator
- Control points, monitor points, functions (?)
station2_sdp_timing
--> L2 STAT decision Timing in Station (DONE)
--> L3 SDP decision Timing in SDP (DONE)
station2_sdp_ring : --> L5 SDPFW product Ring
- 10GbE
- data types: beamlets, subbands offload, crosslets, transient readout
- use store and forward
- use raw ETH
- ring access and transport schemes
- ring direction per type of data
- remote and local data alignment
- packet sizes, data rates, R_os
- Beamformer (BF) --> L5 SDPFW product BF
- Subband correlator (XC), X_sq cell --> L5 SDPFW product XC
- Subband offload (SO) for AARTFAAC2.0
- Transient buffer (TBUF) readout --> L5 SDPFW product TBUF
- UDP offload resource usage
dupllo_oversampled_subband_filterbank
--> L2 STAT decision Oversampled Filterbank (L2SDP-64 process review)
station2_sdp_dsp :
- Fsub --> L5 SDPFW product Fsub (DONE)
- BF --> L5 SDPFW product BF
- TBUF --> L5 SDPFW product TBUF
station2_sdp_hdl_components :
- Rx input status: dp_bsn_monitor with latency monitoring --> L5 SDPFW product Ring
- DP encoder / decoder: dp_packet_enc with CRC --> L5 SDPFW product Ring
- dp_validate_crc (uses dp_store_and_forward) --> L5 SDPFW product Ring
- dp_validate_bsn_at_sync --> L5 SDPFW product Ring
- BSN aligner dp_bsn_align_v2 --> L5 SDPFW product Ring --> L6 SDPFW product Ring
- Fill FIFO --> L5 SDPFW product Ring
- Reorder with dual page --> L5 SDPFW product TBUF
- Synchronous global reset --> L2SDP-61,62
station2_sdp_hdl_article.txt
- Reference article on RTL design using RL = 0, state and pipelining, AXI4
- RL 0 development article and automatic pipelining tools
station2_semi_float32
- 32b statistics in LOFAR1, use 64b in LOFAR2.0
station2_sdp_m_and_c :
- M&C explanation --> ICD SC-SDP
- Update beamlet weights --> L2 STAT decision beamlet weights
- Monitoring interval:
. asynchronous
. every sync
. at single BSN scheduler event
. at periodic BSN scheduler event
- Requirements: self-test, health-test, operational aspects
- List of APERTIF registers (asynchronous, synchronous, single page, dual page)
- UDP, TCP, sockets --> L3 SDP decision FPGA M&C protocol
WP 5 SDP plan: --> https://support.astron.nl/confluence/display/STAT/WP-5+SDP
- station2_sdp_firmware_planning : about planning, SDP planning and tasks, LTS, DTS, PTS, UniBoard2c
- station2_sdp_deliverables list of WP 5 SDP deliverables in ASCII from WP 5 SDP plan
- UniBoard2c planning : L2SDP-42
Other:
. tools/oneclick/doc/desp_firmware_dag_erko.txt
. tools/oneclick/doc/desp_firmware_overview.txt
\ No newline at end of file
. tools/oneclick/doc/desp_firmware_overview.txt
. desp_howtools_erko.txt
\ No newline at end of file
......@@ -2,8 +2,8 @@
* Rules
*******************************************************************************
1) Continuously plan increment of 4 sprint ahead
After initial planning for thge whole project (at PDR) it remains necessary
1) Continuously plan increment of 4 sprints (= 1 increment) ahead
After initial planning for the whole project (at PDR) it remains necessary
to keep on adapting / fine tuning the planning per quarter, so about 4
sprints ahead. This concerns not only time but also expectations, interfaces
and work
......@@ -51,7 +51,7 @@ This then means that with the SDP work starting 1 jan 2020 it can complete mid
1) Lab Test Station (LTS) - First-light May 2020
Objectives: Verification of (parts of) individual elements and their
interfaces
- 1 UniBoard2 Rev 2 (use different FPGA on same UniBoard for FW, SW tests)
- 1 UniBoard2b Rev 2 (use different FPGA on same UniBoard for FW, SW tests)
Setups for:
- SW
- FW
......@@ -63,7 +63,7 @@ Setups for:
Objectives: Verify that a complete signal chain using the first iteration of
L3 hardware design shows no serious issues and that it can be
reliably installed in a LOFAR station.
- 1 UniBoard2 Rev 2
- 1 UniBoard2c Rev 3a
- First iteration of electronic boards --> 2 UniBoard2 Rev 3a
3) Prototype Test Station (PTS) - First-light May 2021
......@@ -71,7 +71,7 @@ Objectives: Verify Station L2 requirements through testing and analysis, and
provide evidence to the CDR review panel that the designs ensure
compliance with all L2 requirements.
- Second iteration of electronic boards --> 4 UniBoard2 Rev 3b
- 4 UniBoard2 Rev 3b in two subracks (one for LBA with 32 RCU2, one for HBA
- 4 UniBoard2c Rev 3b in two subracks (one for LBA with 32 RCU2, one for HBA
with 32 RCU2)
- Output to CEP for correlation with other stations
......@@ -586,7 +586,11 @@ all 12-2021 CDR M Complete SDP document package for Station CDR
So the difference is -10 weeks, which means that the 2019 PDR is about -10
/ 230 = 5 % more time than the 2018 AAD estimate.
- 2020-jul
Planning differences occur due to:
- pre PDR work was not budgeted (L2 Station work)
- SC-SDP Translator work was not budgeted
*******************************************************************************
* SDP effort estimates in LOFAR2.0 Station WP5 (since jan 2020)
*
......
......@@ -660,10 +660,33 @@ Obsolete investigations:
*******************************************************************************
* Reorder
* Reorder (RP1399)
*******************************************************************************
. Page swap (needed for TB)
. Variable output size
APERTIF needs reorder_matrix to implement R_sub, because it has 8 inputs and 16 outputs:
- The 8 inputs are subbands from signal inputs A,B on P_sub = 4 streams and from signal inputs C,D on P_sub = 4 streams.
- The 16 outputs are subbands from A,B,C,D on 16 streams, so each stream has 1/16-th of the total band.
reorder_matrix (= ss_parallel):
g_nof_inputs g_nof_internals g_nof_outputs
= g_wb _factor
--> reorder_row --> reorder_col_wide --> reorder_row -->
APERTIF uses reorder_col_wide for R_beam, to select and replicate the 6 subbands per 40 beams
APERTIF uses reorder_matrix for R_beamout, to redistribute the output beamlets
For LOFAR2.0 SDP R_beam (and R_beamout) are not needed, because SDP only makes one set of beamlets, equivalent to one
compound beam in APERTIF.
For LOFAR2.0 SDP only reorder_col_wide is needed to implement R_sub, because it has the same number of S_PN/Q_fft = 6
inputs and 6 outputs, and because the outputs remain on a single node so no need to distribute per band.
reorder_col_wide contains g_wb_factor instances of reorder_col
reorder_col:
--> reorder_store --> u_store_buf = common_paged_ram_r_w --> reorder_retrieve -->
MM --> u_select_buf = common_ram_crw_crw -->
New features for SDP:
. Page swap for u_select_buf (needed for TB)
. Variable output size ?
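The reorder_col structure above (store buffer, select buffer, retrieve) can be illustrated with a behavioural Python sketch. This is not the VHDL implementation: the class name, the method names and the two-page swap after every block are assumptions made for illustration only.

```python
class ReorderColModel:
    """Behavioural model of reorder_col: paged store buffer read out via a select buffer."""

    def __init__(self, block_size, select):
        # Two pages, like u_store_buf = common_paged_ram_r_w (assumed dual page)
        self.pages = [[0] * block_size, [0] * block_size]
        self.wr_page = 0
        # Read addresses, like u_select_buf that is written via MM
        self.select = list(select)

    def store(self, block):
        """Write one input block into the write page, then swap the pages."""
        self.pages[self.wr_page][:] = block
        self.wr_page ^= 1

    def retrieve(self):
        """Read the most recently stored page in the order given by the select buffer."""
        rd_page = self.pages[self.wr_page ^ 1]
        return [rd_page[addr] for addr in self.select]


# Select and replicate: addresses may repeat and may skip input samples
model = ReorderColModel(block_size=8, select=[3, 3, 0, 7])
model.store(list(range(10, 18)))
first = model.retrieve()
model.store(list(range(20, 28)))
second = model.retrieve()
print(first, second)
```

Because the output length equals the length of the select buffer, the same model also covers an output block size that differs from the input block size.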
*******************************************************************************
......
......@@ -110,61 +110,131 @@ LFAA-CSP_Low : OSI (Open Systems Interconnection) layers
A) STAT-CEP Beamlet data interface:
LOFAR1 supported beamlet bit modes 16b, 8b and 4b by packing these beamlets into 16b, so 1 16b, 2 8b or 4 4b beamlet
values per 16b word. This creates 1, 2, or 4 beamsets called banks. Packing the beamlets from different
beamsets per 16b word made sense because these banks were already distinguished at the AP and on the ring.
Default LOFAR2.0 only has 8b mode and 1 beam set of S_sub_bf = 488 beamlets. Future LOFAR2.0 could support more
beamsets. At the PN and on the ring these beamsets would be treated independently and will always use W_beamlet_sum
= 18b independent of the beamlet bit mode. Therefore in LOFAR2.0 packing different beamsets per 8b word in a payload makes
less sense. Instead it is better to use more blocks per packet to have sufficiently large payload for the 4b and 2b
beamlet bit modes. Hence in LOFAR2.0 the extra beamlets that can be transported for the lower beamlet bit modes are
treated as independent beamsets, similar to an extra 8b beam set. The beamlet index follows from:
bf[set]_[t][blk][blet][pol] --> global beamlet index = set * 488 + blet
where t is the BSN of the first block in the packet and blk is the block index in the packet, and
there are NOF_BLOCK_PER_PACKET.
Hence in LOFAR2.0 it is then possible to output e.g. only one 4b beamlet bit mode beam set to CEP, instead of two,
because the 4b are packed into payloads per beam set, not from different beamsets.
LOFAR1 used 4 lanes to output the beamlets. Each lane carried 1/4 of the beamlets, so beamlets i:4:244 on lane i.
These lanes are useful to distribute the beamlets to different processing destinations at CEP. For Cobalt it would
be optimum to have 22 destinations, because it has 22 processing input nodes. The beamlet index follows from:
bf[set][lane]_[t][blk][bl][pol] --> global beamlet index = set * 488 + bl * 4 + lane
where lane = 0:3, bl = 0:121 (488/4 = 122)
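The lane interleaving above can be written out as a small Python sketch. The function names are hypothetical; S_sub_bf = 488 and the 4 lanes are taken from the text.

```python
S_SUB_BF = 488    # beamlets per beamset
NOF_LANES = 4     # lanes, lane = 0:3

def global_beamlet_index(set_index, lane, bl):
    """global beamlet index = set * 488 + bl * 4 + lane"""
    return set_index * S_SUB_BF + bl * NOF_LANES + lane

def split_global_index(index):
    """Inverse mapping, returns (set, lane, bl)."""
    set_index, within_set = divmod(index, S_SUB_BF)
    bl, lane = divmod(within_set, NOF_LANES)
    return set_index, lane, bl

# Every beamlet of beamset 0 appears exactly once over lane = 0:3 and bl = 0:121
indices = sorted(global_beamlet_index(0, lane, bl)
                 for lane in range(NOF_LANES)
                 for bl in range(S_SUB_BF // NOF_LANES))
print(indices[:8], split_global_index(489))
```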
In LOFAR2.0 the number of lanes does not depend on the number of physical lanes on the ring. Therefore the number
of lanes can be different than 4. With S_sub_bf = 488 = 2*2*2*61 there can be 1, 2, 4 or 8 lanes with equal number
of beamlets per lane, respectively 488, 244, 122 or 61. With fewer beamlets per lane the NOF_BLOCK_PER_PACKET needs
to be increased to have sufficiently large packets (< 9000 octets). However other numbers of lanes are feasible too,
but will result in different number of beamlets per packet. For example using 22 lanes yields 18 * 22 + 4 * 23 =
488, so 18 streams with 22 beamlets per packet and 4 streams with 23 beamlets per packet.
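The split over an arbitrary number of lanes can be sketched in Python as follows. The function name is hypothetical, and giving the extra beamlet to the lowest lane indices is an assumption that matches the interleaved index bl * NOF_LANES + lane.

```python
def beamlets_per_lane(nof_beamlets, nof_lanes):
    """Return the number of beamlets carried by each lane."""
    base, extra = divmod(nof_beamlets, nof_lanes)
    # The first 'extra' lanes carry one beamlet more than the others
    return [base + 1 if lane < extra else base for lane in range(nof_lanes)]

# Equal split for lane counts that divide 488 = 2*2*2*61
equal = {n: beamlets_per_lane(488, n) for n in (1, 2, 4, 8)}

# 22 lanes for the 22 Cobalt input nodes: 4 lanes with 23 and 18 lanes with 22 beamlets
cobalt = beamlets_per_lane(488, 22)
print(cobalt)
```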
In LOFAR1 the 4 lanes are physical and used in a staggered way such that each lane has its own beamformer that
outputs on a different RSP board. In LOFAR2.0 only one UniBoard2 PN does the output, so the lanes cannot be
staggered. Therefore the distribution over the lanes is done in the final PN and requires internal RAM to be
able to assemble the beamlet output packet for each lane, before they can be sent. This requires a double buffer,
so about 2 * NOF_LANES * packet size number of octets of RAM. For example 2 * 16 * 8kB / 2 kB = 128 M20K BRAMs.
The FPGA has 2713 M20k, so this is 128/2713 ~= 5% of the internal BRAM resources.
The total number of streams to CEP then becomes NOF_BEAMSETS * NOF_LANES.
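The RAM estimate above can be reproduced with a few lines of Python. The 2 kByte per M20K and the 2713 M20K blocks per FPGA are the figures used in the text; the function name is hypothetical.

```python
import math

M20K_BYTES = 2 * 1024    # usable size per M20K block as used in the text
FPGA_NOF_M20K = 2713     # number of M20K blocks in the FPGA, from the text

def lane_buffer_brams(nof_lanes, packet_bytes, nof_pages=2):
    """Number of M20K blocks for a paged packet buffer per lane (double buffer by default)."""
    return math.ceil(nof_pages * nof_lanes * packet_bytes / M20K_BYTES)

brams_16 = lane_buffer_brams(16, 8 * 1024)    # example from the text: 16 lanes, 8 kB packets
brams_22 = lane_buffer_brams(22, 8 * 1024)    # 22 lanes, one per Cobalt input node
usage_16 = brams_16 / FPGA_NOF_M20K
print(brams_16, brams_22, round(100 * usage_16, 1))
```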
- MARKER 8b
. Like in APERTIF and ARTS, may be useful to quickly recognize the data packet.
- VERSION_ID 8b
. 2,3,4 for LOFAR1
. 5 first for LOFAR2.0
- STATION_ID 16b (idem as LOFAR1)
==> or 8b because there are only ~50 stations
==> use 16b to fit number from station name (e.g. CS001, LV614, see list of stations at
https://proxy.lofar.eu/array_status/STATIONS/HTML/cs011/index.html)
- OBSERVATION_ID 32b
Instead of CONFIGURATION_ID 8b (used in LOFAR1? intended to refer to the parset that defines this observation)
The observation ID provides the hook to information on:
. RCU mode
. f_adc = 200 MHz, 160 MHz
. Nyquist zone (0, 1, 2)
. critically sampled PFB, oversampled PFB (or p, q for R_os = p/q)
. etc
- SOURCE_INFO 16b
. 2b Array ID (core station 1 LBA, 2 HBA, ...)
. 1b f_adc = 200 MHz, 160 MHz
. 1b critically sampled PFB, oversampled PFB (or p, q for R_os = p/q)
. 4b beamlet width in number of bits (default 8 for W_beamlet = 8 bit, instead of BM = beamlet mode)
. 5b UniBoard2 FPGA id (16 FPGAs for LBA, 16 for HBA in International Station, instead of RSP ID)
. ==> Also beamlet scale setting
. ==> Number of antenna in beam (core, LBA, HBA inner to make HBA international look like HBA remote)
. 5b beamlet width in number of bits
- default 8 for W_beamlet = 8 bit, instead of BM = beamlet mode
- Use 5 bits so that even a 16b mode like in LOFAR1 fits
. 6b PN ID = UniBoard2 FPGA ID
- Instead of RSP_ID in LOFAR1
- 16 FPGAs for LBA, 16 for HBA in International Station, so maximum 32, but use one bit extra
- the PN ID implicitly also reveals the array ID (core station 1 LBA, 2 HBA/HBA0, 3 HBA1, ...)
- NOF_ANTENNA_IN_BEAM 8b
Number of antenna in beam (core, LBA, HBA inner to make HBA international look like HBA remote),
. maximum S_ant = 192.
- BEAMLET_SCALE 16b
Beamlet scale setting:
- 18b --> 8b, scale = 1 yields lowest bits, scale = 1024 yields highest bits
- 18b --> 4b, scale = 1 yields lowest bits, scale = 4096 yields highest bits
- CONFIGURATION_ID 8b (used in LOFAR1? intended to refer to the parset that defines this observation)
==> observation ID 32b
- STATION_ID 16b (idem as LOFAR1)
==> or 8b because there are only ~50 stations
- NOF_BEAMLETS_PER_SET 16b = S_sub_bf = 488
- SET_INDEX 8b
- LANE_INDEX 8b
- NOF_LANES 8b
- One packet per range of Station beamlets out of 488 beamlets
. Full band : S_sub_bf * W_beamlet * N_complex / W_byte = 488 * 8b * 2 / 8b = 976 octets
. NOF_BEAMLETS_PER_BANK not needed anymore
. nof_streams = Number of beamlet streams
. global beamlet index = SET_INDEX * NOF_BEAMLETS_PER_SET + bl * NOF_LANES + LANE_INDEX
. stream index = SET_INDEX * NOF_LANES + LANE_INDEX
- Separate destination address per stream
- LOFAR1 supports 4 streams
- LOFAR1 supports 4 streams (4 lanes from RSP ring, staggered so rsp_id identifies lane)
- LOFAR2.0 preferably supports >> 4 streams
- beamlet_id to identify start beamlet in stream (provides more info than a stream ID)
- NOF_BEAMLETS_PER_BLOCK to identify range of beamlets from beamlet_id
- LOFAR1: beamlet_id = 0 and NOF_BEAMLETS_PER_BLOCK = 61 (dual pol beamlets, 4 streams):
- LOFAR2.0 preferably outputs only 1 stream
- CEP with N processing nodes would like N streams, Cobalt has N = 22
- S_sub_bf = 488 = 2*2*2* 61, so only NOF_LANES = 1, 2, 4, and 8 yield a fixed integer number
of NOF_BEAMLETS_PER_BLOCK.
? Is it useful to support LANE_INDEX and NOF_LANES > 1 at SDP but < 22 which is optimum for CEP?
- NOF_BLOCKS 16b in payload
. Multiple beamlet time slots in one packet to increase payload efficiency.
. For W_beamlet = 8 bit there can be maximum 9 blocks per payload (9 * 976 = 8784 octets < 9000)
. With nof_streams >> 4 the NOF_BLOCKS can become larger, therefore use 16b. For example:
- NOF_BEAMLETS_PER_BLOCK = ceil(S_sub_bf / nof_streams) = ceil(488 / 32) = 16
- NOF_BEAMLETS_PER_BLOCK * W_beamlet * N_complex / W_byte = 16 * 8b * 2 / 8b = 32 octets
- 9000 / 32 = 281 > 256 --> use 16b for NOF_BLOCKS
- nof_streams = 22 destination nodes, each with 8k Byte payload, possibly a double buffer:
22 * 8 kByte * 2 = 352 kByte = 176 BRAM (1 BRAM = 2 kByte, FPGA has 2713 BRAM)
- 488 / 22 = 22.18, so 488 = 4 * 23 + 18 * 22
. Only send correct data to CEP (so no need for SOURCE_INFO/payload error bit).
. How to handle blocks that got lost within the Station?
- TIMESTAMP 64b (instead of 32b seconds TIMESTAMP and 32b BLOCK_SEQUENCE_NUMBER within second)
. A 64 bit timestamp in 0.2 ns resolution since t_base = 1970 for first block in payload:
- to fit both T_adc = 5 ns and 6.4 ns
- for 116 year span since t_base = 1970 --> 2086
- BLOCK_PERIOD 16b
. 16 bit block period in 0.2 ns resolution
. 2**16 * 0.2 ns = 13.1 us block period (block rate > 76 kHz) fits T_sub
- NOF_BEAMLETS_PER_BLOCK
. Equals floor or ceil of NOF_BEAMLETS_PER_SET / NOF_BEAM_LANES dependent on LANE_INDEX,
so redundant if all beamlets are sent, but could be used to send fewer beamlets.
. Instead of NOF_BEAMLETS_PER_BANK in LOFAR1
. LOFAR1 NOF_BEAMLETS_PER_BLOCK = 61 (dual pol beamlets, 4 streams):
. Maximum NOF_BEAMLETS_PER_BLOCK when NOF_LANES = 1:
W_beamlet = 8b : N_pol * S_sub_bf = 2 * 488 = 976 beamlets, * N_complex = 1952 octets
W_beamlet = 4b : 1952 beamlets
W_beamlet = 2b : 3904 beamlets
- BSN 64b
. Block sequence number since t_base = 1970 of first block in payload, increments by 1 for every block
. Used to detect lost blocks and to align blocks from different stations
- NOF_BLOCKS_PER_PACKET 8b
. Multiple beamlet time slots in one packet to increase payload efficiency.
. Maximum NOF_BLOCKS_PER_PACKET is about 4 * NOF_LANES, because:
NOF_LANES = 1: 4 --> 4 * 1952 = 7808 octets < 9000 Jumbo
. LOFAR1 has 4 streams (lanes) and 16 blocks per packet
. LOFAR1 has payload ok bit in SOURCE_INFO to indicate that at least one block in the packet
has incorrect data
- TIMESTAMP 50b
. Instead of 32b seconds TIMESTAMP and 32b BLOCK_SEQUENCE_NUMBER within second
. Block Sequence Number (BSN) used to detect lost blocks and to align blocks from different stations
. BSN unit T_sub, 50b yields > 100 year span (1970 - 2070)
- BLOCK_PERIOD 13b
. Subband period T_sub in ns resolution, 5120 ns @ 200 MHz, Ros = 1
- TX_PACKET_COUNT 32b
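The field widths and payload sizes quoted in the header field list can be checked with a short Python sketch. The helper names are hypothetical and a year is approximated as 365.25 days.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600

def span_years(nof_bits, unit_s):
    """Time span in years that a counter of nof_bits covers when it counts in units of unit_s."""
    return (2 ** nof_bits) * unit_s / SECONDS_PER_YEAR

def payload_octets(nof_beamlets, w_beamlet=8, n_complex=2, nof_blocks=1):
    """Payload size in octets for nof_blocks blocks of complex beamlets of w_beamlet bits."""
    return nof_blocks * nof_beamlets * w_beamlet * n_complex // 8

timestamp_64b = span_years(64, 0.2e-9)     # 64b TIMESTAMP in 0.2 ns units
bsn_50b = span_years(50, 5120e-9)          # 50b BSN in units of T_sub = 5120 ns
max_block_period_ns = 2 ** 16 * 0.2        # 16b BLOCK_PERIOD in 0.2 ns units

full_band = payload_octets(488)                    # S_sub_bf = 488 beamlets
nine_blocks = payload_octets(488, nof_blocks=9)    # 9 blocks fit in a jumbo frame
dual_pol_4 = payload_octets(2 * 488, nof_blocks=4) # N_pol * S_sub_bf, 4 blocks
print(int(timestamp_64b), int(bsn_50b), max_block_period_ns)
print(full_band, nine_blocks, dual_pol_4)
```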
......
......@@ -551,7 +551,7 @@ The beamformer function has the following sub functions:
if transit node:
- Encode beamlet sums packet to ring
else:
- "Beamlet data output" : On output node scale and output final beamlet sums
- "Beamlet data output" : On output node scale and output final 8 bit beamlet sums
- "Beamlet statistics (BST)": Calculate BST for beamlet sums, output node has final BST
......
......@@ -73,6 +73,7 @@ Five principles:
- Use GIT
- Rename master/slave, mosi/miso
- Understand AXI4 streaming (versus Avalon, RL = 0)
. wrap between AXI4 - Avalon for MM and DP
- Global reset only on sosi info not on sosi data
......@@ -98,6 +99,7 @@ Five principles:
- polarization correction via subband weights is not needed, so X and Y can be on different PN
- EMI between X and Y, but X and Y have only about 40 dB isolation
- EMI between single pol inputs get suppressed dependent on the station digital beam pointing
- No need for pseudo random PFB input decorrelator function like in LOFAR1?
. LBA inner signal inputs and LBA outer signal inputs on different subracks or arbitrary. Inner is not used
Instead they use sparse odd and sparse even to have two more or less random antenna allocations.
. HBA core station sub-array inputs on different Uniboard2 to reduce EMI
......@@ -214,7 +216,14 @@ Station:
. obtain
. get
. read
* Alternatives to master/slave (mosi, miso)
. client / server --> cosi, ciso
. primary, main / secondary, replica, subordinate
. initiator, requester / target, responder
. controller, host / device, worker, proxy
. leader / follower
. director / performer
*******************************************************************************
......