Commit db2be832 authored by Eric Kooistra

From master.

parents a2933a2e 17ca52ba
1 merge request: !308 Resolve L2SDP-877
Pipeline #44411 passed
......@@ -7,6 +7,7 @@ Daarvoor moeten we een component toevoegen die gebruik maakt van de 25MHz crysta
After a clock switch 200M --> 160M or vice versa, the following is necessary and sufficient for SC towards SDP:
* do FPGA_boot_image_RW so that the images are loaded again
* poll until e.g. FPGA_firmware_version_R shows the correct name (then the image is up)
* write FPGA_pps_expected_cnt_RW with 160M or 200M
JDM: If I write FPGA_boot_image_RW with its current value, how can I see whether the FPGAs have rebooted? Wait for TR_FPGA_communication_error_R == False or similar?
......
Detailed design: Transient Buffer function (LIFT)
Detailed design: Transient Buffer (TBuf) function for LIFT project
0) MVP = minimal viable product:
1) DDR4 memory per receiver input
2) Meeting EK-BH 24 okt 2022 [2]
3) TBB (Transient Buffer Board) LOFAR1
4) TBuf (Transient Buffer) Design
5) TBuf ICD SC-SDP, SDPTR-SDPFW
6) TBuf ICD STAT/SDP-CEP
7) Transient detection (TDet) Design
References:
[1] LIFT requirements: https://plm.astron.nl/polarion/#/project/LOFAR2System/wiki/Overview%20pages/LIFT%20Reference
https://git.astron.nl/desp/hdl/-/blob/L2SDP-857/applications/lofar2/doc/prestudy/lift_sdp_transient_buffer.txt
[2] https://support.astron.nl/confluence/display/L2M/2022-10-24+LIFT+meeting+notes
https://support.astron.nl/confluence/display/L2M/2023-02-08+LIFT+meeting+notes
[3] L1 LOFAR2 Decision: Transport of buffer data from Station to CEP, https://support.astron.nl/confluence/pages/viewpage.action?pageId=94766339
[4] LOFAR1 TBB: https://support.astron.nl/confluence/display/L2M/Temporary+storage+of+documents+and+papers
https://support.astron.nl/confluence/pages/viewpage.action?spaceKey=L2M&title=Temporary+storage+of+documents+and+papers&preview=/17335979/23069390/TBB_Design_Description_ASTRON_SDD_047.pdf
[5] LOFAR2 PDR: https://support.astron.nl/confluence/pages/viewpage.action?spaceKey=L2M&title=2019-07-01+Meeting+notes%3A+Transient+Buffer+functionality
0) MVP = minimal viable product:
- includes TB, but not yet Transient Detection (TDet)
- includes TBuf, but not yet Transient Detection (TDet)
- needs ring, because 6 antennas per event are typically not connected to one FPGA
......@@ -35,8 +56,8 @@ In LOFAR12060 [1] time series data en pulse data, wat is pulse data :
How strict is the 3.33 s, is 3.2 s also acceptable?
Use one 16 GByte module per FPGA, so that extension to 6.66 s is possible by using both slots
Read out per receiver input, so that reading out a subset of the receiver inputs is possible (e.g. 12 of the 192 in [1])
Define TB function per receiver input:
. so that the TB function is easily extendable to more inputs and to more DDR4 modules.
Define TBuf function per receiver input:
. so that the TBuf function is easily extendable to more inputs and to more DDR4 modules.
. so that data capture and readout of one receiver input can be done independently of the other receiver inputs
......@@ -60,24 +81,75 @@ Design decision 16GByte DDR4 na L2SDP-854, 850
- Buffer lengte versus nof antennes
- Self trigger
<<<<<<< HEAD
3) Design
- buffer raw data, no need to buffer subbands
- no self triggering yet for MVP
=======
3) TBB (Transient Buffer Board) LOFAR1
- From 2.3 in [4]
. uses 2048 Byte pages
. address based -> typically for write/store
time based -> typically for read/retrieve
. 16 channels free size --> not fragmented, not overlap, nof pages/channel, circular
4) Transient Buffer (TBuf) Design
- buffer raw data, no need to buffer subbands
- choose fixed 14b data, so not e.g. 8 MSbits for lightning and 8 LSbits for
cosmic ray. Always using full W_adc = 14b makes design and usage more clear.
>>>>>>> master
- Station --> CEP --> Data Writer
. SDP output UDP directly to CEP or to LCU so that LCU can pass it on via TCP, to
recover from data loss
. SDP output via 10GbE
<<<<<<< HEAD
. SDP CP for speed dial output, to avoid data loss
- treat all signal inputs independently (even though X and Y are always needed together)
=======
. SDP CP for speed dial (= throttle) output, to avoid data loss
- treat all signal inputs independently (even though X and Y are always needed together)
- timing
. Use sample sequence number (SSN) or mem_bsn:
- SSN increments by nof_samples_per_page = 4672 per page (8176 payload bytes of packed 14b samples)
- mem_bsn increments by 1 per page, so per block of nof_samples_per_page = 4672.
. SSN counts sample periods (5 ns) since t_epoch = 1970, can fit
2**64 / (365.25 * 24 * 3600 / 5e-9) > 2922 years
. TBuf uses sop and eop to mark nof_samples_per_page = 4672
. TBuf does not need sync ?
. Start SSN or mem BSN at same time as SDP BSN by FPGA_processing_enable_RW.
>>>>>>> master
- CP per signal input buffer
. flexible start and end address (so flexible buffer time per signal input)
. freeze, unfreeze
. no need to wipe (zero) buffer contents after unfreeze ?
<<<<<<< HEAD
=======
- State
. rst --> stop <--> record
- Block diagram:
per si: pack 14b to 64b --> add mem hdr (= ssn or bsn) --> add crc --> pack 64b to 256b
mux 12 si to 1 --> mux with + 1 MM --> write to DDR4
read from DDR4 --> demux to
. 1 MM
. 1 retrieve --> unpack 256b to 64b --> check CRC --> add output hdr --> dump
>>>>>>> master
- support MP on buffer state
. signal input index
. frozen, buffering, reading
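The timing notes on SSN and mem_bsn earlier in this section can be checked with a small sketch. The function names `ssn_wrap_years` and `ssn_from_mem_bsn` are hypothetical and only illustrate the arithmetic; the number of samples per page is passed in as a parameter instead of being fixed here.

```python
# Sketch of the SSN arithmetic, assuming a 5 ns sample period (f_adc = 200 MHz)
# and a 64 bit SSN counter that starts at t_epoch = 1970.
T_SAMPLE_S = 5e-9
SECONDS_PER_YEAR = 365.25 * 24 * 3600


def ssn_wrap_years(width_bits: int = 64) -> float:
    """Return how many years an SSN counter of width_bits can count."""
    return 2**width_bits * T_SAMPLE_S / SECONDS_PER_YEAR


def ssn_from_mem_bsn(mem_bsn: int, nof_samples_per_page: int) -> int:
    """Return the SSN of the first sample in page number mem_bsn.

    Assumes mem_bsn and SSN both start at 0 at the same instant.
    """
    return mem_bsn * nof_samples_per_page


if __name__ == "__main__":
    print(f"64 bit SSN wraps after about {ssn_wrap_years():.1f} years")
```

Either counter carries the same timing information as long as nof_samples_per_page is fixed, so the choice mainly affects the header width and how readable the value is for the user.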
......@@ -85,8 +157,13 @@ Design decision 16GByte DDR4 na L2SDP-854, 850
- Provide direct MM access interface to DDR4
. New access multiplexer component to interface with io_ddr with:
<<<<<<< HEAD
. write 12 signal input streams + 1 MM write stream
. read 1 stream for TB readout + 1 MM read stream
=======
. write 12 signal input streams for TBuf recording + 1 MM write stream
. read 1 stream for TBuf readout + 1 MM read stream
>>>>>>> master
. Write multiplexer for 12 + 1 = 13 inputs will take ~100 M20K,
because it needs to multiplex and FIFO streams of 256 bit each and
256 bit requires ceil(256 / 40) = 7 M20K in parallel, so 13 * 7 = 91 M20K.
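The block RAM estimate above can be reproduced with a short sketch. The function name `m20k_for_stream_fifos` is hypothetical, and the sketch assumes every stream gets its own FIFO that is one M20K deep, with each M20K used at a width of 40 bit as in the estimate.

```python
import math

# Assumption from the estimate above: one M20K is used at a width of 40 bit.
M20K_WIDTH_BITS = 40


def m20k_for_stream_fifos(nof_streams: int, stream_width_bits: int) -> int:
    """Return the number of M20K needed when each stream has its own FIFO
    that is one M20K deep."""
    # A wide FIFO needs several M20K side by side, rounded up to whole blocks.
    m20k_per_fifo = math.ceil(stream_width_bits / M20K_WIDTH_BITS)
    return nof_streams * m20k_per_fifo


if __name__ == "__main__":
    # 12 signal input streams + 1 MM write stream, each 256 bit wide
    print(m20k_for_stream_fifos(12 + 1, 256))
```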
......@@ -94,17 +171,29 @@ Design decision 16GByte DDR4 na L2SDP-854, 850
16 kByte, so FIFO can fit (almost) two 8 kB payloads, which seems
sufficient.
<<<<<<< HEAD
Use 1 DDR4 module / FPGA
. Because 16GB is enough for T_tbuf = 3.3 s
. 1 DDR4 @ 200MHz yields 200MHz * 256b/8b = 6.4 GB/s maximum write
access. Samples data from 12 ADCs is 12 * 200MHz * 16b/8b = 4.8 GB/s.
Hence the TB function then uses 4.8 / 6.4 = 0.75 of the capacity,
=======
- Use 1 DDR4 module / FPGA
. Because 16GB is enough for T_tbuf = 3.3 s
. 1 DDR4 @ 200MHz yields 200MHz * 256b/8b = 6.4 GB/s maximum write
access. Samples data from 12 ADCs is 12 * 200MHz * 16b/8b = 4.8 GB/s.
Hence the TBuf function then uses 4.8 / 6.4 = 0.75 of the capacity,
>>>>>>> master
which is fine and leaves sufficient spare capacity for some buffer
read out, because 10Gbps / 8b = maximum 1.25 GB/s.
. If we would use 2 DDR4 modules/ FPGA, then treat them as one big
buffer with extended address space by DDR4 II, so use them
sequentially, rather than in parallel, and to still have full
<<<<<<< HEAD
freedom of allocationg memory space to signal inputs.
=======
freedom of allocating memory space to signal inputs.
>>>>>>> master
- support partial dump
. lightning >~ 1 s, cosmic ray >~ 1 ms
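The DDR4 bandwidth budget above can be written out as a small calculation. This is only a sketch of the arithmetic in the text: the function name `tbuf_budget` is hypothetical and 16 GByte is treated as 16e9 bytes, which is the rounding that yields T_tbuf of about 3.3 s.

```python
# Values taken from the budget above; MEM_BYTES = 16e9 is a rounding assumption.
F_CLK_HZ = 200e6        # DDR4 user clock and ADC sample rate
W_DDR_BITS = 256        # DDR4 user interface word width
W_SAMPLE_BITS = 16      # sample width used in the write rate estimate
NOF_SI = 12             # signal inputs (ADCs) per FPGA
MEM_BYTES = 16e9        # one DDR4 module
LINK_BITS_PER_S = 10e9  # 10GbE readout link


def tbuf_budget() -> dict:
    """Return the write, record and readout rates in bytes/s and T_tbuf in s."""
    ddr_rate = F_CLK_HZ * W_DDR_BITS / 8
    adc_rate = NOF_SI * F_CLK_HZ * W_SAMPLE_BITS / 8
    return {
        "ddr_rate": ddr_rate,
        "adc_rate": adc_rate,
        "utilization": adc_rate / ddr_rate,
        "spare_rate": ddr_rate - adc_rate,
        "readout_rate": LINK_BITS_PER_S / 8,
        "t_tbuf": MEM_BYTES / adc_rate,
    }


if __name__ == "__main__":
    for name, value in tbuf_budget().items():
        print(f"{name:12s} = {value:g}")
```

The spare write capacity (ddr_rate minus adc_rate) is larger than the maximum 10GbE readout rate, which is the condition the text relies on for reading out while recording.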
......@@ -115,9 +204,28 @@ Use 1 DDR4 module / FPGA
. packetize at 64b or 256b ?
. 16b -> 64b packetize --> 64b --> 256b store
. data in buffer must have CRC --> 64b CRC ?
<<<<<<< HEAD
- dp_offload_tx header is the same for all 12 signal inputs, only si differs,
so create one header for all and modify si field to save logic and RAM
=======
. ddr page packet format: SSN + packed data + CRC
- 8 KByte page to have integer number of pages (= slots) in 16G memory, so
that DDR4-I can wrap without a gap or extend to DDR-II without a gap.
- 14b packed data
- SSN = 64b = 8B
- CRC = 64b = 8B
- 8K - 8 - 8 = 8192 - 16 = 8176 B / 14b = 4672 samples per page
- 4672 * 5 ns = 23.36 us per page, so ~42.8 pages / ms
- dp_offload_tx header is the same for all 12 signal inputs, only si differs,
so create one header for all and modify si field to save logic and RAM
. readout 1 page per tx packet
. add additional eth/ip/udp header and application header
. send packed 14b data
>>>>>>> master
- 12 input multiplexer with 12 x 256b in and 256b out to write 256b words @ 200 MHz
- use SSN as timestamp, SSN = BSN * N_fft, so can be derived from bsn_source BSN,
or do we need a dp_ssn_source.vhd?
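The page arithmetic for an 8 KByte DDR page that holds an SSN, packed 14b samples and a CRC can be sketched as follows. The function name `page_layout` is hypothetical; the sizes are the ones mentioned in this design (64b SSN, 64b CRC, W_adc = 14b, 5 ns sample period).

```python
PAGE_BYTES = 8 * 1024   # 8 KByte page
SSN_BYTES = 8           # 64b SSN
CRC_BYTES = 8           # 64b CRC
W_ADC_BITS = 14         # packed sample width
T_SAMPLE_NS = 5         # sample period at f_adc = 200 MHz


def page_layout():
    """Return (payload_bytes, nof_samples, leftover_bits, page_period_us)."""
    payload_bytes = PAGE_BYTES - SSN_BYTES - CRC_BYTES
    # leftover_bits == 0 means the packed samples fill the payload exactly
    nof_samples, leftover_bits = divmod(payload_bytes * 8, W_ADC_BITS)
    page_period_us = nof_samples * T_SAMPLE_NS / 1000
    return payload_bytes, nof_samples, leftover_bits, page_period_us


if __name__ == "__main__":
    payload_bytes, nof_samples, leftover_bits, page_period_us = page_layout()
    print(payload_bytes, nof_samples, leftover_bits, page_period_us)
    print(f"{1000 / page_period_us:.1f} pages per ms")
```

With these sizes the 14b samples fill the payload without leftover bits, so no padding field is needed in the page.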
......@@ -147,9 +255,52 @@ Use 1 DDR4 module / FPGA
- Maximum number of packets per dump
. max memory size 16GB
. max payload size 8kB
<<<<<<< HEAD
--> 16G / 8k = 2M packets --> log2(2e6) = 20.93b
. use packet serial number, instead of sop, eop bit fields, to show progress of
the packet dump to CEP
=======
--> 16G / 8k = 2M packets --> log2(2M) = 21b
. use packet serial number, instead of sop, eop bit fields, to show progress of
the packet dump to CEP
. allocate start_page/nof_pages per si to memory, wrap at max memory size
circular buffer per si, wrap after nof_pages
keep track of nof_recorded pages, when > nof pages then circular buffer is
full and carries only fresh data
keep track of ssn + page index of last recorded page
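The per signal input circular buffer described above could behave roughly like the following sketch. It is a software model under stated assumptions, not the firmware design: the class `SiBuffer` and its field names are hypothetical, the recorded page counter is assumed to saturate at nof_pages, and the memory is assumed to hold 2M pages of 8 KByte.

```python
from dataclasses import dataclass

TOTAL_NOF_PAGES = 2 * 1024 * 1024  # 16G / 8k = 2M pages (assumed binary sizes)


@dataclass
class SiBuffer:
    """Circular page buffer of one signal input (hypothetical model)."""
    start_page: int
    nof_pages: int
    nof_recorded_pages: int = 0  # saturates at nof_pages
    last_page_index: int = -1    # index within this buffer, -1 = nothing recorded
    last_ssn: int = -1

    def record_page(self, ssn: int) -> int:
        """Record one page and return its absolute page address in memory."""
        if self.nof_pages == 0:
            raise ValueError("signal input has no buffer allocated")
        # wrap within the buffer after nof_pages
        self.last_page_index = (self.last_page_index + 1) % self.nof_pages
        self.last_ssn = ssn
        self.nof_recorded_pages = min(self.nof_recorded_pages + 1, self.nof_pages)
        # wrap at the maximum memory size
        return (self.start_page + self.last_page_index) % TOTAL_NOF_PAGES

    def is_full(self) -> bool:
        """True when every page of the buffer carries fresh data."""
        return self.nof_pages > 0 and self.nof_recorded_pages == self.nof_pages


if __name__ == "__main__":
    buf = SiBuffer(start_page=TOTAL_NOF_PAGES - 2, nof_pages=4)
    for page in range(5):
        print(buf.record_page(ssn=page * 4672), buf.is_full())
```

Keeping last_page_index and last_ssn together is what later allows a requested SSN to be translated into a page address for readout.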
5) TBuf ICD SC-SDP, SDPTR-SDPFW
- Control Points (CP):
. FPGA_tbuf_alloc_RW [pn][si] --> start page, nof_pages (or as separate CP?)
- nof_pages = 0 means si has no buffer, > 0 means si has buffer
. FPGA_tbuf_record_RW [pn][si] --> start/continue (True) or stop (= freeze) (False) recording
. FPGA_tbuf_retrieve_RW --> pn, si, ssn, nof_pre_pages, nof_post_pages (or as separate CP?)
- allow only retrieve from one (pn, si) at a time
- total nof pages = nof_pre_pages + 1 (pointed by ssn) + nof_post_pages
. FPGA_tbuf_output_hdr_eth_destination_mac_RW
. FPGA_tbuf_output_hdr_ip_destination_address_RW
. FPGA_tbuf_output_hdr_udp_destination_port_RW
. FPGA_tbuf_output_enable_RW
- Monitor Points (MP):
. FPGA_tbuf_total_nof_pages_R --> 16G / 8k = 2M
. FPGA_tbuf_page_size_R --> 8 kByte
. FPGA_tbuf_nof_samples_per_page_R --> 4672 (= 8176 payload bytes / 14b)
. FPGA_tbuf_page_period_R --> 23.36 us
. FPGA_tbuf_recording_R [pn][si]
. FPGA_tbuf_retrieving_R [pn][si]
Maybe:
. FPGA_tbuf_last_page [pn][si] --> index of last recorded page
. FPGA_tbuf_last_ssn [pn][si] --> ssn of last recorded page
. FPGA_tbuf_nof_recorded_pages[pn][si] --> number of fresh recorded pages <= alloc nof_pages
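A retrieve request of the form (ssn, nof_pre_pages, nof_post_pages) could be translated into buffer page indices as sketched below. This is an illustration only: the function `select_pages` and all of its parameters are hypothetical, last_ssn is assumed to be the SSN of the first sample in the last recorded page, and recording is assumed to be stopped during retrieve.

```python
def select_pages(ssn, nof_pre_pages, nof_post_pages,
                 last_ssn, last_page_index, nof_pages,
                 nof_recorded_pages, nof_samples_per_page):
    """Return the buffer page indices to retrieve, oldest page first.

    The result has nof_pre_pages + 1 + nof_post_pages entries, where the
    +1 is the page that contains the requested ssn.
    """
    if ssn >= last_ssn + nof_samples_per_page:
        raise ValueError("ssn is newer than the last recorded page")
    # Number of pages between the last recorded page and the page with ssn
    delta = max(last_ssn - ssn, 0)
    pages_back = -(-delta // nof_samples_per_page)  # integer ceil
    oldest = pages_back + nof_pre_pages
    newest = pages_back - nof_post_pages
    if newest < 0 or oldest >= nof_recorded_pages:
        raise ValueError("requested pages are not available in the buffer")
    # Walk from the oldest to the newest page, wrapping within the buffer
    return [(last_page_index - k) % nof_pages
            for k in range(oldest, newest - 1, -1)]


if __name__ == "__main__":
    print(select_pages(ssn=950, nof_pre_pages=1, nof_post_pages=1,
                       last_ssn=1000, last_page_index=1, nof_pages=4,
                       nof_recorded_pages=4, nof_samples_per_page=100))
```

In this model the monitor points for last page, last SSN and number of recorded pages are exactly the state needed to validate a retrieve request before starting the dump.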
6) TBuf ICD STAT/SDP-CEP
>>>>>>> master
- application header fields:
. 8b marker
......@@ -159,7 +310,11 @@ Use 1 DDR4 module / FPGA
- 1b antenna_band_index
- 1b nyquist_zone_index
- 1b f_adc --> sample period is 5 ns or 6.25 ns
<<<<<<< HEAD
- 1b payload_error --> based on DDR4 read CRC
=======
- 1b memory_error --> based on DDR4 read CRC
>>>>>>> master
- 5b sample_width --> 14b
. 8b signal_input_index
. 16b nof_samples_per_packet
......@@ -171,4 +326,24 @@ Use 1 DDR4 module / FPGA
- 5b gn_index --> signal_input_index provides already all this information
<<<<<<< HEAD
=======
7) Transient detection (TDet) Design
- no self triggering yet for MVP
- will use Hilbert transform of real input and > 30MHz BPF
https://nl.mathworks.com/help/signal/ug/single-sideband-modulation-via-the-hilbert-transform.html
For the FIR Hilbert transformer we will use an odd length filter, which is
computationally more efficient than an even length filter, although even
length filters have smaller passband errors. The savings in odd length
filters result from several of their coefficients being zero. Also, an odd
length filter requires a shift by an integer time delay, as opposed to the
fractional time delay that is required by an even length filter. For an odd
length filter, the magnitude response of a Hilbert transformer is zero for
w=0 and w=π. For even length filters the magnitude response does not have to
be 0 at π, therefore they have increased bandwidths. So for odd length
filters the useful bandwidth is limited to
0 < w < π.
>>>>>>> master
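The properties of an odd length FIR Hilbert transformer mentioned above (zero coefficients, zero response at w=0 and w=π) can be checked with a small sketch. This is not the TDet filter design: the functions `hilbert_fir` and `magnitude`, the Hamming window and the length of 31 taps are assumptions chosen for illustration.

```python
import math


def hilbert_fir(nof_taps: int) -> list:
    """Return Hamming windowed coefficients of an odd length Hilbert FIR."""
    if nof_taps < 3 or nof_taps % 2 == 0:
        raise ValueError("nof_taps must be odd and >= 3")
    mid = (nof_taps - 1) // 2
    coefs = []
    for i in range(nof_taps):
        n = i - mid  # tap index relative to the center tap
        # Ideal Hilbert impulse response: 2 / (pi * n) for odd n, 0 for even n
        ideal = 0.0 if n % 2 == 0 else 2.0 / (math.pi * n)
        window = 0.54 - 0.46 * math.cos(2 * math.pi * i / (nof_taps - 1))
        coefs.append(ideal * window)
    return coefs


def magnitude(coefs: list, w: float) -> float:
    """Return |H(w)| of the FIR filter, with w in rad/sample."""
    re = sum(c * math.cos(w * i) for i, c in enumerate(coefs))
    im = sum(-c * math.sin(w * i) for i, c in enumerate(coefs))
    return math.hypot(re, im)


if __name__ == "__main__":
    h = hilbert_fir(31)
    print("zero coefficients:", sum(1 for c in h if c == 0.0), "of", len(h))
    for w in (0.0, math.pi / 2, math.pi):
        print(f"|H({w:.3f})| = {magnitude(h, w):.6f}")
```

About half of the coefficients are zero, so only the remaining taps need a multiplier, and the group delay is the integer (nof_taps - 1) / 2 samples.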