Commit 17ca52ba authored by Eric Kooistra

Updates.

parent 293b7313
Pipeline #44405 passed
@@ -7,6 +7,7 @@ For that we need to add a component that uses the 25MHz crysta
After a clock switch 200M --> 160M or vice versa, the following is necessary and sufficient for SC towards SDP:
* do FPGA_boot_image_RW so that the images are reloaded
* poll until e.g. FPGA_firmware_version_R shows the correct name (then the image is up)
* write FPGA_pps_expected_cnt_RW with 160M or 200M
JDM: If I write the current value to FPGA_boot_image_RW, how can I then see whether the FPGAs have rebooted? Wait for TR_FPGA_communication_error_R == False or similar?
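The clock switch sequence above can be sketched as follows. Only the point names come from these notes. The `read_point` and `write_point` callables, the `image_index` and `expected_name` parameters and the timeout values are hypothetical stand-ins for whatever client SC uses, and the value written to FPGA_pps_expected_cnt_RW is assumed to be the sample clock frequency in Hz. Per-FPGA indexing of the points is left out for brevity.

```python
import time


def switch_clock(read_point, write_point, image_index, expected_name,
                 f_adc_hz, timeout_s=60.0, poll_s=1.0, sleep=time.sleep):
    """Reload the FPGA images after a clock switch, then set the expected PPS count.

    read_point(name) and write_point(name, value) are hypothetical accessors.
    Returns the time spent polling, in seconds.
    """
    if f_adc_hz not in (160_000_000, 200_000_000):
        raise ValueError("f_adc_hz must be 160 MHz or 200 MHz")
    # 1) Write the boot image selection so that the images are reloaded.
    write_point("FPGA_boot_image_RW", image_index)
    # 2) Poll until the firmware version shows the expected name (image is up).
    waited = 0.0
    while read_point("FPGA_firmware_version_R") != expected_name:
        if waited >= timeout_s:
            raise TimeoutError("image did not come up in time")
        sleep(poll_s)
        waited += poll_s
    # 3) Tell SDP how many clock cycles to expect per PPS interval (assumed unit: Hz).
    write_point("FPGA_pps_expected_cnt_RW", f_adc_hz)
    return waited
```

This sketch does not answer the JDM question: if the old image reports the same name, the poll returns immediately.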
......
Detailed design: Transient Buffer (TBuf) function for LIFT project
0) MVP = minimal viable product:
1) DDR4 memory per receiver input
2) Meeting EK-BH 24 Oct 2022 [2]
3) TBB (Transient Buffer Board) LOFAR1
4) TBuf (Transient Buffer) Design
5) TBuf ICD SC-SDP, SDPTR-SDPFW
6) TBuf ICD STAT/SDP-CEP
7) Transient detection (TDet) Design
References:
[1] LIFT requirements: https://plm.astron.nl/polarion/#/project/LOFAR2System/wiki/Overview%20pages/LIFT%20Reference
https://git.astron.nl/desp/hdl/-/blob/L2SDP-857/applications/lofar2/doc/prestudy/lift_sdp_transient_buffer.txt
[2] https://support.astron.nl/confluence/display/L2M/2022-10-24+LIFT+meeting+notes
https://support.astron.nl/confluence/display/L2M/2023-02-08+LIFT+meeting+notes
[3] L1 LOFAR2 Decision: Transport of buffer data from Station to CEP, https://support.astron.nl/confluence/pages/viewpage.action?pageId=94766339
[4] LOFAR1 TBB: https://support.astron.nl/confluence/display/L2M/Temporary+storage+of+documents+and+papers
https://support.astron.nl/confluence/pages/viewpage.action?spaceKey=L2M&title=Temporary+storage+of+documents+and+papers&preview=/17335979/23069390/TBB_Design_Description_ASTRON_SDD_047.pdf
[5] LOFAR2 PDR: https://support.astron.nl/confluence/pages/viewpage.action?spaceKey=L2M&title=2019-07-01+Meeting+notes%3A+Transient+Buffer+functionality
0) MVP = minimal viable product:
- includes TBuf, but not yet Transient Detection (TDet)
- needs ring, because 6 antennas per event are typically not connected to one FPGA
@@ -35,8 +56,8 @@ In LOFAR12060 [1] time series data and pulse data, what is pulse data :
How strict is the 3.33 s, is 3.2 s also allowed?
Use one 16 GByte module per FPGA, so that extension to 6.66 s is possible by using both slots
Read out per receiver input, so that reading out a subset of the receiver inputs is possible (e.g. 12 of the 192 in [1])
Define TBuf function per receiver input:
. so that the TBuf function is easily extendable to more inputs and to more DDR4 modules.
. so that data capture and readout of one receiver input can run independently of the other receiver inputs
@@ -60,3 +81,215 @@ Design decision 16GByte DDR4 after L2SDP-854, 850
- Buffer length versus nof antennas
- Self trigger - Self trigger
3) TBB (Transient Buffer Board) LOFAR1
- From 2.3 in [4]
. uses 2048 Byte pages
. address based -> typically for write/store
time based -> typically for read/retrieve
. 16 channels free size --> not fragmented, not overlap, nof pages/channel, circular
4) Transient Buffer (TBuf) Design
- buffer raw data, no need to buffer subbands
- choose fixed 14b data, so not e.g. 8 Msbits for lightning and 8 Lsbits for
cosmic ray. Always using full W_adc = 14b makes design and usage more clear.
- Station --> CEP --> Data Writer
. SDP output UDP directly to CEP or to LCU so that LCU can pass it on via TCP, to
recover from data loss
. SDP output via 10GbE
. SDP CP for speed dial (= throttle) output, to avoid data loss
- treat all signal inputs independently (even though X and Y are always needed together)
- timing
. Use sample sequence number (SSN) or mem_bsn:
- SSN increments by nof_samples_per_page = 4672
- mem_bsn increments by 1 per page, so per block of nof_samples_per_page = 4672.
. SSN counts sample periods (5 ns) since t_epoch = 1970, can fit
2**64 / (365.25 * 24 * 3600 / 5e-9) > 2922 years
. TBuf uses sop and eop to mark nof_samples_per_page = 4672
. TBuf does not need sync ?
. Start SSN or mem BSN at same time as SDP BSN by FPGA_processing_enable_RW.
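The SSN range estimate in the timing notes can be checked with a short calculation. This is a minimal sketch: the function names are illustrative, the 200 MHz rate follows from the 5 ns sample period, and the page length is passed in as a parameter instead of being fixed here.

```python
F_ADC_HZ = 200_000_000                # 5 ns sample period
SECONDS_PER_YEAR = 365.25 * 24 * 3600


def ssn_wrap_years(width_bits=64, f_adc_hz=F_ADC_HZ):
    """Years after t_epoch at which an SSN counter of width_bits wraps."""
    return 2**width_bits / (SECONDS_PER_YEAR * f_adc_hz)


def ssn_at(seconds_since_epoch, f_adc_hz=F_ADC_HZ):
    """SSN of the first sample in the given whole second since t_epoch."""
    return seconds_since_epoch * f_adc_hz


def ssn_of_page(mem_bsn, nof_samples_per_page):
    """SSN of the first sample in the page with the given mem_bsn,
    assuming both counters start at 0 at the same instant."""
    return mem_bsn * nof_samples_per_page


print(f"64b SSN wraps after {ssn_wrap_years():.1f} years")
```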
- CP per signal input buffer
. flexible start and end address (so flexible buffer time per signal input)
. freeze, unfreeze
. no need to wipe (zero) buffer contents after unfreeze ?
- State
. rst --> stop <--> record
- Block diagram:
per si: pack 14b to 64b --> add mem hdr (= ssn or bsn) --> add crc --> pack 64b to 256b
mux 12 si to 1 --> mux with + 1 MM --> write to DDR4
read from DDR4 --> demux to
. 1 MM
. 1 retrieve --> unpack 256b to 64b --> check CRC --> add output hdr --> dump
- support MP on buffer state
. signal input index
. frozen, buffering, reading
. start address (time), end address (time)
- Provide direct MM access interface to DDR4
. New access multiplexer component to interface with io_ddr with:
. write 12 signal input streams for TBuf recording + 1 MM write stream
. read 1 stream for TBuf readout + 1 MM read stream
. Write multiplexer for 12 + 1 = 13 inputs will take ~100 M20K,
because it needs to multiplex and FIFO streams of 256 bit each and
256 bit requires 256 /40 = 7 M20K in parallel, so 13 * 7 = 91 M20K.
. One M20K = 20b * 1024 words = 40b * 512 words, 512 words of 256b =
16 kByte, so FIFO can fit (almost) two 8 kB payloads, which seems
sufficient.
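The M20K estimate for the write multiplexer can be reproduced as below. A minimal sketch, assuming each stream FIFO uses M20K blocks in the 40b * 512 words configuration mentioned above; the function names are illustrative.

```python
import math

M20K_WIDTH_BITS = 40     # M20K used as 40b * 512 words
M20K_DEPTH_WORDS = 512


def m20k_per_stream(data_width_bits=256):
    """Number of M20K blocks in parallel needed for one stream FIFO."""
    return math.ceil(data_width_bits / M20K_WIDTH_BITS)


def m20k_for_write_mux(nof_streams=13, data_width_bits=256):
    """Total M20K blocks for 12 signal input streams + 1 MM write stream."""
    return nof_streams * m20k_per_stream(data_width_bits)


def fifo_capacity_bytes(data_width_bits=256):
    """Capacity of one stream FIFO that is one M20K deep."""
    return M20K_DEPTH_WORDS * data_width_bits // 8


print(m20k_per_stream(), m20k_for_write_mux(), fifo_capacity_bytes())
```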
- Use 1 DDR4 module / FPGA
. Because 16GB is enough for T_tbuf = 3.3 s
. 1 DDR4 @ 200MHz yields 200MHz * 256b/8b = 6.4 GB/s maximum write
access. Samples data from 12 ADCs is 12 * 200MHz * 16b/8b = 4.8 GB/s.
Hence the TBuf function then uses 4.8 / 6.4 = 0.75 of the capacity,
which is fine and leaves sufficient spare capacity for some buffer
read out, because 10Gbps / 8b = maximum 1.25 GB/s.
. If we would use 2 DDR4 modules/ FPGA, then treat them as one big
buffer with extended address space by DDR4 II, so use them
sequentially, rather than in parallel, and to still have full
freedom of allocating memory space to signal inputs.
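The bandwidth budget for one DDR4 module can be written out as a small calculation. This is a sketch under stated assumptions: 16 GB is taken as 16 * 10**9 bytes, samples are counted as 16b as in the estimate above, and the function and key names are illustrative.

```python
def tbuf_budget(nof_si=12, f_hz=200_000_000, ddr_word_bits=256,
                sample_bits=16, link_bps=10_000_000_000,
                mem_bytes=16 * 10**9):
    """Return the DDR4 write budget for the TBuf function, in bytes/s and s."""
    ddr_rate = f_hz * ddr_word_bits // 8          # maximum DDR4 access rate
    adc_rate = nof_si * f_hz * sample_bits // 8   # sample data from all ADCs
    return {
        "ddr_Bps": ddr_rate,
        "adc_Bps": adc_rate,
        "utilization": adc_rate / ddr_rate,
        "spare_Bps": ddr_rate - adc_rate,         # left for buffer read out
        "link_Bps": link_bps // 8,                # 10GbE upper bound
        "t_tbuf_s": mem_bytes / adc_rate,         # buffer time for all inputs
    }


for key, value in tbuf_budget().items():
    print(key, value)
```

With these defaults the 10GbE read out rate stays below the spare DDR4 capacity.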
- support partial dump
. lightning >~ 1 s, cosmic ray >~ 1 ms
. dump t0 - t1
. dump last dt
- packetize before buffer write or after buffer read? --> before
. packetize at 64b or 256b ?
. 16b -> 64b packetize --> 64b --> 256b store
. data in buffer must have CRC --> 64b CRC ?
. ddr page packet format: SSN + packed data + CRC
- 8 KByte page to have integer number of pages (= slots) in 16G memory, so
that DDR4-I can wrap without a gap or extend to DDR4-II without a gap.
- 14b packed data
- SSN = 64b = 8B
- CRC = 64b = 8B
- 8K - 8 - 8 = 8192 - 16 = 8176 B payload, 8176 B * 8b / 14b = 4672 samples per page
- 4672 * 5 ns = 23.36 us per page, so ~42.8 pages / ms
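The page layout numbers follow from the page size and the SSN and CRC fields, as the sketch below shows. The bit order used in `pack_14b` (first sample in the least significant bits, little endian bytes) is an assumption made for illustration only; the notes do not fix it.

```python
PAGE_BYTES = 8192
SSN_BYTES = 8
CRC_BYTES = 8
SAMPLE_BITS = 14
T_SAMPLE_NS = 5


def payload_bytes():
    """Bytes left for packed samples in one DDR page."""
    return PAGE_BYTES - SSN_BYTES - CRC_BYTES


def samples_per_page():
    """Number of whole 14b samples that fit in the page payload."""
    return payload_bytes() * 8 // SAMPLE_BITS


def page_period_ns():
    """Time span covered by one page."""
    return samples_per_page() * T_SAMPLE_NS


def pack_14b(samples):
    """Pack signed 14b samples back to back (assumed bit order, see lead-in)."""
    word = 0
    for index, sample in enumerate(samples):
        word |= (sample & 0x3FFF) << (SAMPLE_BITS * index)
    return word.to_bytes((SAMPLE_BITS * len(samples) + 7) // 8, "little")


print(payload_bytes(), samples_per_page(), page_period_ns())
print(f"{1e6 / page_period_ns():.1f} pages / ms")
```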
- dp_offload_tx header is the same for all 12 signal inputs, only si differs,
so create one header for all and modify si field to save logic and RAM
. readout 1 page per tx packet
. add additional eth/ip/udp header and application header
. send packed 14b data
- 12 input multiplexer with 12 x 256b in and 256b out to write 256b words @ 200 MHz
- use SSN as timestamp, SSN = BSN * N_fft, so can be derived from bsn_source BSN,
or do we need a dp_ssn_source.vhd?
- unb2c_test_ddr_16G resource usage
. git/hdl/boards/uniboard2c/designs/unb2c_test/revisions/unb2c_test_ddr_16G/unb2c_test_ddr_16G_resource_usage.jpg
. per module:
wr_fifo 13 M20K
tech_ddr 9 M20K
rd_fifo 4 M20K
diag db 0 M20K
diag bg 0 M20K
--> Total 26 M20K/DDR4 module
. board common:
MMM : 69 M20K for Nios memory
ctrl: 42 M20K for MMAP ROM and 1GbE
- store and send 14b packed data
. so do not use 16b (with 2b sign extension), to optimize for memory usage and
transport capacity (at the expense of requiring tools to observe the payload
contents).
. store application packet with CRC in DDR4
. store packed 14b data for 16/14 = 1.14 more buffer space (3.3s --> 3.8s)
. send unpacked 16b data to CEP with new CRC
. CRC = 64b, header multiple of 64b, nof samples per payload multiple of 64b
- Maximum number of packets per dump
. max memory size 16GB
. max payload size 8kB
--> 16G / 8k = 2M packets --> log2(2M) = 21b
. use packet serial number, instead of sop, eop bit fields, to show progress of
the packet dump to CEP
. allocate start_page/nof_pages per si to memory, wrap at max memory size
circular buffer per si, wrap after nof_pages
keep track of nof_recorded pages, when > nof pages then circular buffer is
full and carries only fresh data
keep track of ssn + page index of last recorded page
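The per signal input bookkeeping described above (start_page/nof_pages allocation, wrap within the allocation and at the end of memory, count of recorded pages, last page and SSN) could look like the sketch below. The class and attribute names are hypothetical and 16G / 8k is taken as 2**21 pages.

```python
TOTAL_NOF_PAGES = 2**21   # 16G / 8k = 2M pages, page index fits in 21b


class SiBuffer:
    """Circular page buffer for one signal input (si)."""

    def __init__(self, start_page, nof_pages):
        self.start_page = start_page
        self.nof_pages = nof_pages      # 0 means this si has no buffer
        self.nof_recorded = 0           # total pages written since start
        self.last_page = None           # memory page index of last recorded page
        self.last_ssn = None            # SSN of last recorded page

    def record(self, ssn):
        """Account for one recorded page and return its memory page index."""
        if self.nof_pages == 0:
            raise ValueError("si has no buffer allocated")
        offset = self.nof_recorded % self.nof_pages              # wrap per si
        page = (self.start_page + offset) % TOTAL_NOF_PAGES      # wrap at max memory
        self.nof_recorded += 1
        self.last_page = page
        self.last_ssn = ssn
        return page

    @property
    def nof_fresh_pages(self):
        """Number of valid pages, at most the allocated nof_pages."""
        return min(self.nof_recorded, self.nof_pages)

    @property
    def is_full(self):
        """True once every allocated page holds recorded data."""
        return self.nof_recorded >= self.nof_pages
```

A buffer allocated near the end of memory then continues at page 0 before it wraps back to its own start page.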
5) TBuf ICD SC-SDP, SDPTR-SDPFW
- Control Points (CP):
. FPGA_tbuf_alloc_RW [pn][si] --> start page, nof_pages (or as separate CP?)
- nof_pages = 0 means si has no buffer, > 0 means si has buffer
. FPGA_tbuf_record_RW [pn][si] --> start/continue (True) or stop (= freeze) (False) recording
. FPGA_tbuf_retrieve_RW --> pn, si, ssn, nof_pre_pages, nof_post_pages (or as separate CP?)
- allow only retrieve from one (pn, si) at a time
- total nof pages = nof_pre_pages + 1 (pointed by ssn) + nof_post_pages
. FPGA_tbuf_output_hdr_eth_destination_mac_RW
. FPGA_tbuf_output_hdr_ip_destination_address_RW
. FPGA_tbuf_output_hdr_udp_destination_port_RW
. FPGA_tbuf_output_enable_RW
- Monitor Points (MP):
. FPGA_tbuf_total_nof_pages_R --> 16G / 8k = 2M
. FPGA_tbuf_page_size_R --> 8 kByte
. FPGA_tbuf_nof_samples_per_page_R --> 4672
. FPGA_tbuf_page_period_R --> 23.36 us
. FPGA_tbuf_recording_R [pn][si]
. FPGA_tbuf_retrieving_R [pn][si]
Maybe:
. FPGA_tbuf_last_page [pn][si] --> index of last recorded page
. FPGA_tbuf_last_ssn [pn][si] --> ssn of last recorded page
. FPGA_tbuf_nof_recorded_pages[pn][si] --> number of fresh recorded pages <= alloc nof_pages
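How a retrieve request (ssn, nof_pre_pages, nof_post_pages) could map onto buffer pages is sketched below, using the last page, last ssn and number of fresh pages from the "Maybe" monitor points. The function is hypothetical. It assumes that recording has no gaps, so consecutive pages differ by nof_samples_per_page in SSN, and it clamps the request to the pages that are present, which is a design choice the notes leave open.

```python
def retrieve_offsets(ssn, nof_pre_pages, nof_post_pages, last_ssn, last_offset,
                     nof_fresh_pages, nof_pages, nof_samples_per_page):
    """Return the page offsets within the si allocation, oldest first.

    last_offset is the offset of the last recorded page relative to the
    start page of the allocation.
    """
    # Pages between the last recorded page and the page that holds ssn.
    pages_back = -((ssn - last_ssn) // nof_samples_per_page)
    if not 0 <= pages_back < nof_fresh_pages:
        raise ValueError("requested ssn is not in the buffer")
    # Clamp pre and post pages to what is available (assumed behaviour).
    oldest = min(pages_back + nof_pre_pages, nof_fresh_pages - 1)
    newest = max(pages_back - nof_post_pages, 0)
    return [(last_offset - back) % nof_pages
            for back in range(oldest, newest - 1, -1)]


offsets = retrieve_offsets(ssn=98 * 4672 + 10, nof_pre_pages=1, nof_post_pages=1,
                           last_ssn=100 * 4672, last_offset=2,
                           nof_fresh_pages=8, nof_pages=8,
                           nof_samples_per_page=4672)
print(offsets)
```

Without clamping the result holds nof_pre_pages + 1 + nof_post_pages offsets, matching the total given for the retrieve CP.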
6) TBuf ICD STAT/SDP-CEP
- application header fields:
. 8b marker
. 8b version_id
. 16b station_id
. 32b source_info
- 1b antenna_band_index
- 1b nyquist_zone_index
- 1b f_adc --> sample period is 5 ns or 6.25 ns
- 1b memory_error --> based on DDR4 read CRC
- 5b sample_width --> 14b
. 8b signal_input_index
. 16b nof_samples_per_packet
. 24b packet serial number in current dump
. 24b total nof packets in current dump
. 64b SSN = Sample Sequence Number
No need for:
- 32b observation_id --> also not in LOFAR1
- 5b gn_index --> signal_input_index provides already all this information
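The application header fields listed above add up to 200 bits (25 bytes). The sketch below packs them in the listed order, most significant field first. The field order on the wire, the bit positions inside source_info (the listed sub-fields are placed in the 9 least significant bits) and any padding to a 64b multiple are assumptions, as these notes do not fix them.

```python
FIELDS = (
    ("marker", 8),
    ("version_id", 8),
    ("station_id", 16),
    ("source_info", 32),
    ("signal_input_index", 8),
    ("nof_samples_per_packet", 16),
    ("packet_serial_number", 24),
    ("total_nof_packets", 24),
    ("ssn", 64),
)
HEADER_BITS = sum(width for _, width in FIELDS)


def source_info(antenna_band_index, nyquist_zone_index, f_adc, memory_error,
                sample_width=14):
    """Combine the source_info sub-fields (assumed bit positions)."""
    return ((antenna_band_index << 8) | (nyquist_zone_index << 7) |
            (f_adc << 6) | (memory_error << 5) | sample_width)


def pack_header(**values):
    """Pack the header fields into bytes, first listed field first."""
    word = 0
    for name, width in FIELDS:
        value = values[name]
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name} does not fit in {width} bits")
        word = (word << width) | value
    return word.to_bytes(HEADER_BITS // 8, "big")


def unpack_header(data):
    """Inverse of pack_header, returns a dict of field values."""
    word = int.from_bytes(data, "big")
    values = {}
    for name, width in reversed(FIELDS):
        values[name] = word & ((1 << width) - 1)
        word >>= width
    return values
```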
7) Transient detection (TDet) Design
- no self triggering yet for MVP
- will use Hilbert transform of real input and > 30MHz BPF
https://nl.mathworks.com/help/signal/ug/single-sideband-modulation-via-the-hilbert-transform.html
For the FIR Hilbert transformer we will use an odd length filter, which is
computationally more efficient than an even length filter, although even
length filters have smaller passband errors. The savings arise because in an
odd length filter several of the coefficients are zero. Also, an odd length
filter requires a shift by an integer time delay, as opposed to the
fractional time delay required by an even length filter. For an odd length
filter the magnitude response of a Hilbert transformer is zero at w=0 and
w=π. For even length filters the magnitude response does not have to be 0
at π, so they have a larger bandwidth. So for odd length filters the useful
bandwidth is limited to 0 < w < π.